Principal AI Software Engineer, Enterprise AI Platform
Job Description
Role Overview
The Enterprise AI Platform Engineer is responsible for building and delivering Natera’s enterprise agentic AI platform. The platform will be used to prototype and build multiple low-code agentic AI solutions across Natera in a federated approach. This is a hands-on technical leadership role at the intersection of engineering excellence, low-code platform design, and applied GenAI engineering.
You will architect and build the core AI operating system that powers modular, low-code enterprise agentic automation, complete with agent templates, an agent orchestration engine, data and MCP connectors, prompt optimization capabilities, evaluation guardrails, abstracted AI services, and intelligent data extraction and reasoning capabilities. The platform empowers citizen developers, business analysts, and engineers to prototype and test AI-powered workflows using a low-code interface, while enabling developers to extend functionality through a pro-code framework.
Key Responsibilities
Platform Architecture & Core Infrastructure
Design: Design and implement the core architecture of the Enterprise AI Platform so that it is low-code, modular, scalable, and secure.
Agentic Orchestration: Build the agent orchestration runtime, including task queues, state management, and inter-agent communication.
Complexity: Architect for long-running, resilient AI workflows, enabling agents to execute and monitor multi-step, asynchronous processes.
Low-Code Abstraction: Develop APIs and services for automation, evaluation, and agent lifecycle management.
Deployment: Establish DevOps, CI/CD pipelines, and configuration management to ensure smooth deployment at scale.
Low-Code/Pro-Code Experience
Low-Code Interface: Build an intuitive visual builder that allows business users to compose agent workflows through drag-and-drop and configuration.
Pro-Code Mode: Provide a developer extension layer where engineers can author and deploy agents in code (Python, TypeScript) directly into the same framework.
Unified Runtime: Ensure both low-code and pro-code workflows share common infrastructure for orchestration, evaluation, and governance.
Transparency & Debugging: Surface workflow traces, model evaluations, and output explanations directly in the user interface.
Experimentation & Versioning: Support iterative experimentation, evaluation-based comparison, and rollback through integrated version control.
Agentic Orchestration & Long-Running Agents
Orchestration Engine: Build a robust orchestration system supporting both short-lived agent calls and long-running AI agents that persist over time to automate complex processes.
Workflow Automation: Enable orchestration of multiple agents with shared state, scheduling, dependency resolution, and event-driven execution.
Enterprise Integration: Connect agents to core enterprise systems to perform real-world actions securely.
Autonomy & Resilience: Implement mechanisms for persistence, checkpointing, recovery, and human-in-the-loop interventions.
Human-in-the-Loop Feedback: Design human-in-the-loop and self-assessment mechanisms for continuous prompt and workflow improvement.
Evaluation as a Core Layer: Architect an evaluation-first framework for monitoring and improving AI agent performance across all workflows.
AI Services & Capabilities
MCP Integration: Integrate with the Model Context Protocol (MCP) to enable plug-and-play connectivity with external systems and actions.
Retrieval-Augmented Generation (RAG): Build services to retrieve information from unstructured data using vector databases and retrieval pipelines.
Prompt Optimization & Evaluation: Implement automated systems for prompt tuning, evaluation, and feedback loops to ensure reliable results.
Abstracted AI Services: Build modular APIs for AI services such as unstructured document processing, information retrieval, summarization, data extraction, content generation, and classification.
Reusable Components: Create shared, composable AI primitives (e.g., document loaders, semantic routers, extractors) to accelerate workflow design.
Governance, Security & Observability
Enforce governance, security, and compliance principles (SOC2, HIPAA, GDPR) across all platform operations.
Implement RBAC, audit logging, and lineage tracking for all data and agent interactions.
Build observability tools for tracing, cost monitoring, and system performance metrics.
Integrate evaluation-based guardrails that detect hallucinations, bias, or policy violations in real time.
Metric Tracking: Create structured metrics dashboards (precision, recall, task success rate, cost efficiency) for every deployed agent.
Technical Leadership & Collaboration
Establish technical standards, documentation, and engineering patterns for future platform development.
Collaborate with business stakeholders, data scientists, and product teams to identify automation use cases and measure ROI via evaluation metrics.
Mentor future engineers and contribute to an engineering culture centered on safety, transparency, and impact.
Continuously explore emerging agent frameworks, vector stores, and evaluation methodologies.
Qualifications
Required
12+ years of software engineering experience, with 8+ years in platform or distributed systems architecture.
Proven expertise in implementing workflow orchestration or automation systems.
Proficiency in Python with deep experience in backend architecture and API design.
Experience with low-code/no-code automation platforms (Zapier, n8n, etc.), internal developer platforms (IDPs), or workflow engines (Temporal, Airflow).
Experience working with well-known agentic cloud platforms (e.g., AWS Bedrock Agents, AgentCore).
Hands-on experience with LLMs, RAG, vector databases, and orchestration frameworks (LangChain, LlamaIndex, AutoGen, DSPy).
Fluency in cloud infrastructure, Kubernetes, Docker, and CI/CD automation.
Knowledge of observability and telemetry systems for event-driven environments.