Senior AI Engineer - Platform
PhysicsX
About us
The Mission
We’re looking for a hands-on Senior AI Platform Engineer to build the foundations of how AI agents transform Engineering workflows across industries such as Manufacturing, Aerospace, and Semiconductor. Your work will power next-generation simulation and design tools used by industry-leading engineering teams. Our platform allows Forward Deployed Engineers (FDEs) and customers to build and deploy deep learning surrogates that solve massive engineering challenges.
Your mission is to build and scale the Agentic stack within this wider ecosystem. You will implement a production-grade platform that enables our Product teams, FDEs, and customers to compose advanced AI workflows safely, transparently, and reliably.
- You will own delivery of key workstreams within our platform's critical infrastructure.
- You have practical experience running agents in production and understand the failure modes ("scar tissue").
- You are familiar with emerging open standards (such as MCP, A2A, and ACP) and have experience with patterns like durable execution and agent memory.
- You are passionate about building a world-class developer experience.
Core Responsibilities
You will own key workstreams within our Agentic ecosystem.
- Full-Stack Observability: Implement deep tracing and cost tracking to ensure visibility across the agent lifecycle.
- Deployment Patterns: Build tooling that simplifies auth, discoverability, and resource management for deployed agents.
- Sandboxing Implementation: Implement secure runtime environments that isolate agents and enforce safety policies.
- Systematic Evals: Build the feedback loops that allow domain experts to annotate traces and generate evaluation datasets.
- Enterprise Governance: Apply identity and access controls (RBAC, OIDC) to ensure agents act securely in a multi-tenant environment.
The Tech Stack
- Core Platform: Python (Primary), Go or TypeScript (Secondary), Kubernetes, Docker, Terraform.
- Agentic Infrastructure: LangGraph/LangChain, Temporal (Durable Execution), Vector DBs (Pinecone/Weaviate).
- Observability & Evals: OTel, LangSmith, Arize, Braintrust.
Who You Are
- A Systems Builder: You care about code quality, reliability, and maintainability. You prefer boring, working solutions over complex, fragile ones.
- Platform-First: You care deeply about the Developer Experience (DevEx). You build tools that help other engineers move faster.
- Security-Minded: You understand the risks of executing LLM-generated code and know how to implement proper safeguards.
Qualifications
- Platform & Backend Foundations:
  - 3+ years of experience in Platform Engineering, Backend, or SRE.
  - Strong proficiency in Python/Go, Kubernetes, Docker, and IaC (Terraform).
- Agentic & AI Engineering:
  - Production experience designing Agentic architectures (chains, tools, memory).
  - Familiarity with Agentic frameworks (LangGraph, PydanticAI) and patterns like durable execution.
  - Understanding of LLM-specific lifecycle issues: non-determinism, systematic evals, and token-based cost tracking.
Bonus Points
- Background in Engineering workflows or simulation platforms.
- Experience building internal developer platforms (IDPs).