Staff AI Engineer - Platform Team (Full Stack)

  • AI/ML

  • Remote

  • Permanent / Full Time

About the job:
Title: Staff AI Engineer – Platform Team (Full Stack)
Start Date: Immediate
Position Type: Contract / Full Time
Location: Remote across Canada / USA
  
Position Overview
We are seeking a Staff AI Engineer to design, build, and deploy advanced AI agents that automate complex infrastructure workflows for our internal developer platform. In this role, you will architect intelligent systems that integrate cutting-edge large language models (LLMs) with modern infrastructure tooling, enabling autonomous provisioning, cost optimization, and governance enforcement.
You will lead the development of AI agent systems from concept to production, delivering core services, orchestration pipelines, and developer tooling that reduce time-to-infrastructure and increase developer velocity. This role combines AI expertise, full-stack engineering, and platform architecture to build solutions that multiply engineering productivity while ensuring security and compliance at scale.
  
What You'll Bring

  • Passion for building production-grade AI systems with expertise in generative AI, autonomous agents, and retrieval-augmented generation (RAG).
  • Experience with AI agent frameworks such as LangChain, LangGraph, Semantic Kernel, Mastra, LlamaIndex, or OpenAI Agents SDK.
  • Full-stack skills: React + TypeScript for frontends, Node.js + Python for backends, and robust REST/gRPC APIs.
  • Familiarity with enterprise technologies such as OIDC, RBAC, PostgreSQL, Redis, and multi-tenant architectures.
  • Strong systems design skills and a solid grasp of the first principles of software development and engineering patterns.
  
Key Responsibilities
  
AI Platform Development
  • Build reusable AI services, APIs, and orchestration pipelines (LangGraph, Airflow, Temporal) for cross-team initiatives.
  • Design runtime environments for multi-step, autonomous infrastructure workflows.
  • Architect deployment pipelines with A/B testing, versioning, and rollback for production AI agents.
  
Developer Enablement
  • Deliver SDKs, tooling, and documentation so product teams can integrate AI workflows seamlessly.
  • Provide reference implementations and best practices for AI integration across multiple product lines.
  • Build developer-friendly abstractions for complex AI orchestration.
  
Orchestration & Observability
  • Implement durable job processing with Temporal, Airflow, or similar workflow systems.
  • Develop monitoring and observability frameworks for AI agent performance, reliability, and decision quality.
  • Design intelligent error handling and recovery mechanisms for AI-driven operations.
  
Full-Stack Engineering
  • Build scalable Node.js APIs to integrate AI agents with traditional infrastructure tools.
  • Create React frontends for conversational AI interfaces and real-time dashboards.
  • Enable real-time visualization of AI agent decisions and infrastructure state changes.
  
Architecture Leadership
  • Drive technical standards for AI agent development, deployment, and integration patterns.
  • Partner with product and platform leads to ensure scalability, reliability, and security across the stack.
  • Influence system design for performance, maintainability, and enterprise consistency.
  
Advanced Agent Development
  • Design multi-agent architectures with LangGraph, Pydantic AI, Mastra, and OpenAI Agents SDK.
  • Integrate agents with infrastructure tools like GitLab, Terraform, Ansible, and OpenStack using Model Context Protocol (MCP) tools.
  • Manage vector databases and knowledge retrieval systems for context-aware AI decision-making.
  • Conduct continuous R&D with the latest LLMs, frameworks, and orchestration tools.
  
Tech Stack
Languages & Architecture
  • TypeScript, Python, Node.js, React, microservices, REST/gRPC APIs
AI & Agent Frameworks
  • LangGraph, LlamaIndex, Semantic Kernel, Mastra, OpenAI Agents SDK, Pydantic AI
Databases & Orchestration
  • PostgreSQL, Temporal, Airflow, DAGs, workflows
Infrastructure & Platform Tools
  • GitLab, Terraform, Ansible, OpenStack, Kubernetes, Docker, AWS CDK
Data & Enterprise Systems
  • Redis, Microsoft SSO, RBAC, multi-tenant architectures
  
