[Remote] Senior AI Platform Engineer - Sr. Director Level - ZL
Note: This is a remote position open to candidates in the USA. Dice, a leading company in the tech industry, is seeking a Senior AI Platform Engineer at the Sr. Director level. The role involves leading the development and operation of AI/ML systems, with a focus on LLM applications and cloud AI services, and providing technical leadership and mentorship.
Skills
- + years of software/platform engineering experience, including 2+ years building and operating LLM-based or AI/ML systems in production
- Strong programming skills in Python
- Hands-on experience building agentic systems or LLM applications: orchestration, tool use, RAG, prompt engineering, and evaluation
- Cloud AI services: hands-on experience building and deploying with managed AI services on AWS and/or Azure (e.g., managed model, agent, and search/retrieval offerings), with the judgment to design cloud-agnostic systems where it matters
- Multi-cloud flexibility: comfortable working across cloud platforms, primarily AWS and Azure, with Oracle Cloud (OCI) or Google Cloud Platform used as needed, and able to avoid hard vendor lock-in in platform design
- Deep production/LLMOps/AgentOps experience: CI/CD, containerization (Docker), orchestration (Kubernetes), and infrastructure-as-code (Terraform, Bicep, or equivalent)
- Experience operating production systems: on-call/SRE practices, incident response, and reliability engineering against SLAs/SLOs
- Demonstrated technical leadership and mentorship at a senior/staff level
- Proven track record with observability and reliability: logging, distributed tracing, metrics, alerting, and SLAs/SLOs for live systems
- Experience designing for governance, auditability, security, or compliance in regulated or enterprise environments
- Strong API and systems design skills; comfort with distributed, scalable architectures and event-driven systems
- Excellent communication and the ability to explain complex technical and risk trade-offs to technical and non-technical audiences
- Bachelor's or Master's in Computer Science, Engineering, or a related field, or equivalent practical experience
- Cloud AI/ML platforms: depth in the AWS and/or Azure AI stacks, including managed agent, model-catalog, evaluation, and retrieval/search services, plus ML platforms, API gateways, and cloud identity for governance
- Experience with agent frameworks and tooling (e.g., LangChain/LangGraph, LlamaIndex, Semantic Kernel, Model Context Protocol, OpenAI/Anthropic SDKs)
- Vector databases and embeddings (e.g., Pinecone, Weaviate, FAISS, pgvector, Azure AI Search) and retrieval system design
- LLM evaluation/observability tooling (e.g., LangSmith, Arize, Langfuse, OpenTelemetry for LLMs)
- Familiarity with AI governance frameworks and standards (e.g., NIST AI RMF, ISO/IEC 42001, SOC 2 in an AI context)
- Federal/government regulatory experience: working in or alongside U.S. federal compliance regimes (e.g., NIST 800-53, NIST 800-171, CMMC, FedRAMP, FISMA), including Azure Government or other government-cloud environments
- Experience with fine-tuning, model routing, cost optimization, or self-hosted/open-weight model deployment