Description
The AI Systems Developer/Engineer role is emerging and fast-changing, and it sits at the intersection of traditional software engineering and cutting-edge artificial intelligence. Unlike a data scientist, who focuses on modeling, or a machine learning engineer, who focuses on training models, this role is primarily concerned with integrating AI models (especially Large Language Models) into robust, scalable, and reliable production systems. This involves building the "brain" and the "nervous system" of an application: architecting how AI agents reason, access tools, and interact with users and data.
About the Role
We are seeking a talented and product-focused AI Systems Engineer to design, build, and maintain the core AI infrastructure that powers our [products/platforms/services]. In this role, you will move beyond basic chatbots and prototype notebooks to architect sophisticated, production-ready AI systems. You will be responsible for the end-to-end lifecycle of AI features, from orchestrating complex multi-agent workflows and implementing retrieval-augmented generation (RAG) to ensuring the safety, reliability, and observability of our AI layer.
You will collaborate closely with cross-functional teams, including product managers, software engineers, and domain experts, to translate business needs into intelligent, scalable, and trustworthy AI solutions.
Key Responsibilities
Architect & Orchestrate AI Agents: Design and maintain multi-agent systems, including the logic for agent selection, task routing, and hand-offs. Ensure agents operate within defined boundaries and can reliably use external tools and APIs.
Implement Retrieval-Augmented Generation (RAG): Own the pipeline for grounding AI responses in proprietary or domain-specific knowledge. This includes developing strategies for document ingestion, chunking, embedding, and vector search so that answers are accurate, contextual, and as free from hallucination as possible.
Develop & Integrate AI Services: Build robust backend services and APIs (e.g., with Python, FastAPI) to expose AI capabilities to front-end applications and internal platforms. Integrate with LLM providers (OpenAI, Anthropic, open-source models) and cloud AI services (AWS, GCP, Azure).
Ensure Safety, Governance & Reliability: Implement guardrails, prompt management, and evaluation harnesses to control agent behavior, mitigate bias, and ensure compliance with company policies (e.g., PII handling, audit logs). Build monitoring and observability to track performance, cost, and system drift.
Optimize for Performance & Scale: Work on prompt optimization, caching strategies, and model selection to balance quality, latency, and cost. Deploy and manage AI workloads using containerization (Docker, Kubernetes) and CI/CD pipelines.
Cross-Functional Collaboration: Partner with product and design teams to create intuitive conversational and AI-augmented experiences. Work with data engineers to ensure access to clean, structured data.
Required Skills & Qualifications
Experience: Professional software engineering experience, with a proven track record of building and deploying production systems. Hands-on experience building applications with Large Language Models (LLMs).
Programming: Expert-level proficiency in Python. Strong experience with backend frameworks like FastAPI, Django, or Flask.
AI & LLM Tech Stack:
Deep understanding of LLM concepts: prompt engineering, context windows, function/tool calling, and hallucination mitigation.
Experience with agentic frameworks such as LangChain, LangGraph, AutoGen, or Semantic Kernel.
Practical experience designing and optimizing RAG pipelines using vector databases (e.g., Pinecone, Weaviate, Chroma, pgvector) and embedding models.
System Design & Architecture: Strong fundamentals in designing scalable, resilient, and observable distributed systems and APIs.
Cloud & DevOps: Experience deploying and managing applications on a major cloud provider (AWS, GCP, Azure). Familiarity with Docker, Kubernetes, and CI/CD principles.
Nice-to-Have / Preferred Qualifications
Experience in specific domains like conversational UX, health-tech, fin-tech, or manufacturing.
Familiarity with modern web development frameworks (e.g., React, Next.js) to build AI-assisted user interfaces.
Experience with MLOps practices, including model evaluation, A/B testing, and monitoring for LLMs.
Knowledge of high-performance inference and model optimization techniques (quantization, vLLM, TensorRT-LLM).
Experience working in regulated industries with a strong focus on compliance and data security.
Success Profile
Within your first few months, you will have delivered an AI feature that measurably improves a user workflow or internal process. The systems you build will be characterized by predictable behavior, clear explainability, and robust error handling, earning the trust of both your teammates and your users.