GCP Cloud Data Architect
Cloud
Remote
Contract
About the job:
Title: GCP Cloud Data Architect
Start Date: Immediate
Position Type: Contract
Location: Remote across US/Canada
Key Responsibilities:
1. MDE Implementation & Configuration:
- Design, configure, and deploy Google Cloud's Manufacturing Data Engine solution tailored to our specific midstream operational landscape.
- Architect and build robust, scalable data pipelines using core MDE components like Pub/Sub, Dataflow, and Data Fusion to ingest real-time and batch data from OT sources (SCADA, DCS, PLCs, Historians like OSIsoft PI, IoT sensors) and IT systems (ERP, MES, LIMS, maintenance logs).
- Implement data processing logic within Dataflow for cleaning, transforming, and preparing data for analytics.
2. Data Contextualization:
- Focus heavily on data contextualization, integrating time-series operational data with relevant IT data (e.g., asset information, work orders, material specs) within the MDE framework.
- Develop and implement strategies for mapping disparate data sources into a unified data model within BigQuery, leveraging MDE's capabilities.
- Work with operations and engineering teams to define the necessary context for meaningful analysis.
3. Data Modeling & Storage:
- Design and optimize data models in BigQuery specifically for storing and analyzing contextualized manufacturing/operational data, supporting time-series analysis, anomaly detection, and reporting.
- Manage data storage effectively within Cloud Storage and BigQuery, considering cost, performance, and data lifecycle requirements.
- Prepare and structure data to enable consumption by downstream analytics tools (Looker, other BI platforms) and AI/ML models (Vertex AI).
4. Pipeline Monitoring & Data Quality:
- Monitor the health, performance, and cost-efficiency of the MDE instance and associated data pipelines.
- Implement robust data quality checks, validation rules, and monitoring specific to operational data integrity within the MDE pipeline.
- Troubleshoot and resolve issues related to data ingestion, processing, storage, and accessibility within the MDE environment.
5. DevOps, Collaboration & Governance:
- Utilize Infrastructure as Code (IaC) tools (e.g., Terraform) and CI/CD practices for managing and deploying MDE configurations and data pipelines.
- Collaborate closely with process engineers, control systems specialists, maintenance teams, data scientists, and business analysts to understand requirements and translate them into MDE solutions.
- Ensure data security, access control (IAM), and compliance with industry regulations and internal policies within the GCP environment.
- Document the MDE architecture, data flows, contextualization logic, and operational procedures.
Required Qualifications:
- Bachelor's degree in Computer Science, Engineering, Information Technology, or a related quantitative field.
- 3-5+ years of hands-on experience as a Data Engineer.
- Proven experience designing, building, and managing data pipelines on Google Cloud Platform (GCP).
- Strong proficiency with core GCP data services: BigQuery, Dataflow, Pub/Sub, Cloud Storage.
- Excellent programming skills in Python and SQL.
- Experience with ETL/ELT development for both streaming and batch processing.
- Understanding of data warehousing, data modeling (especially time-series and contextual models), and database concepts.
- Familiarity with software development best practices (Git, CI/CD, testing).
- Excellent analytical and problem-solving skills.
- Strong communication skills, capable of bridging the gap between technical teams and operational stakeholders.
Preferred Qualifications:
- Direct experience implementing or working extensively with Google Cloud's Manufacturing Data Engine (MDE).
- Experience integrating data from industrial OT systems (SCADA, DCS, PLCs, OSIsoft PI Historian or similar).
- Familiarity with industrial communication protocols (e.g., OPC-UA, Modbus).
- Experience with data contextualization techniques, merging OT and IT data.
- Experience in the Midstream Oil & Gas sector or related heavy manufacturing/process industries.
- Experience with GCP Data Fusion, Cloud Composer (Airflow), or Vertex AI.
- Experience using Looker or other BI tools for visualizing operational data.
- Knowledge of IT systems like ERP (e.g., SAP, Oracle) or MES.
- Experience with Infrastructure as Code (Terraform).
- GCP Professional Data Engineer certification.