GEICO · 7 hours ago
Staff Software Engineer, AI Agent Platform
GEICO is a renowned insurance company that values innovation and aims to exceed customer expectations. They are seeking a Staff Software Engineer for their AI Agent Platform team to design and implement scalable systems for AI agent workflows, while also providing technical leadership and mentoring to the team.
Auto Insurance, Financial Services, Government, Insurance, Internet, Mobile
Responsibilities
Architect and implement scalable multi-tenant backend systems for building AI agent workflows (agent configuration, offline evaluation, synthetic data generation, workflow simulation, agent marketplace, etc.) using technologies such as Azure Kubernetes Service (AKS) and FastAPI, ensuring economies of scale and controlling maintenance costs
Collaborate with Design team to architect and implement frontend experiences and workflows for onboarding both technical and non-technical stakeholders, maximizing user adoption and successful AI agent development
Develop observability frameworks to ensure 99.9%+ uptime for AI agent platforms through robust monitoring, alerting, and incident response procedures
Evaluate and (if desirable) integrate cutting-edge GenAI frameworks, libraries, and vendors to maintain a state-of-the-art technology stack, including hybrid cloud solutions with AWS/GCP as backup or for specialized use cases
Architect and implement scalable, high-performance machine learning platforms and systems capable of processing large data volumes and supporting real-time decision making and workflows
Oversee the end-to-end lifecycle of AI agent applications, ensuring robust testing, deployment, and ongoing monitoring
Ensure adherence to company production readiness standards, security protocols, and regulatory compliance throughout the development lifecycle
Continuously optimize platform performance, reducing latency and improving throughput for AI agent workloads
Design and implement backup, recovery, and business continuity plans for hosted platform applications & services
Design and maintain robust CI/CD pipelines for ML model deployment using Azure DevOps, GitHub Actions, and MLOps tools
Act as the tech lead for a sub-team, setting technical direction and ensuring consistency in design principles and best practices
Provide hands-on mentorship and guidance during design reviews, code assessments, and performance tuning
Lead by example in tackling complex technical challenges and driving system-wide architectural improvements
Establish and champion engineering standards for ML infrastructure, deployment practices, and operational procedures
Create technical documentation, runbooks, and deliver internal training sessions on platform capabilities
Work closely with data scientists, software engineers, and product teams to seamlessly deploy ML systems into production environments
Translate complex technical concepts into actionable insights for both technical and non-technical stakeholders
Foster a collaborative environment that encourages innovation and the sharing of best practices across teams
Present technical solutions and platform roadmaps to leadership and cross-functional stakeholders
Qualifications
Required
Bachelor's degree in Computer Science, Engineering, Mathematics, or a related field; an advanced degree (master's or Ph.D.) is highly desirable
6+ years of hands-on experience in designing, implementing, and maintaining multi-tenant AI/ML systems and platforms in production environments
6+ years of experience working with cloud platforms such as Azure and AWS
Extensive expertise in designing and deploying large-scale data pipelines and real-time inference systems, and in managing the end-to-end AI agent and/or AI/ML system development lifecycle, including configuration, evaluation, monitoring, observability, and authentication/authorization (AuthN/AuthZ) considerations
6+ years of experience working with common backend systems and tools (e.g., Kubernetes, Temporal, OpenSearch, PostgreSQL, Redis, Neo4j). Deep understanding of Docker, container optimization, and multi-stage builds. Experience with Prometheus, Grafana, OpenTelemetry, and distributed tracing
3+ years of experience building front-end web applications using frameworks such as React and/or Next.js
Deep proficiency in programming languages such as Python, Java, Go, etc., with a strong emphasis on coding excellence. Extra credit for properly leveraging AI coding tools such as Cursor for productivity gains
Proficiency in AI/ML frameworks such as TensorFlow, PyTorch, LangGraph, etc.
Demonstrated track record of mentoring engineers and leading technical initiatives
Proven ability to tackle complex technical challenges, innovate through hands-on experimentation, and set technical standards
Excellent verbal and written communication skills, adapted to audiences of diverse seniority levels and professional backgrounds
Preferred
Deep expertise operating and/or building AI agent platforms and capabilities such as LangGraph Platform, AutoGen, n8n, CrewAI, etc.
Experience with LLM observability systems such as LangSmith, Langfuse, Arize Phoenix, etc.
Experience building LLM-based AI agent workflows via both no code/low code and traditional high-code development environments
Experience using both open-source (e.g., Llama, Qwen, Mistral) and proprietary (e.g., GPT, Claude) LLMs for appropriate tasks
Understanding of AI safety principles, model governance, and regulatory compliance
Background in regulated industries with understanding of data privacy requirements and cybersecurity review processes
Benefits
Comprehensive Total Rewards program that offers personalized coverage tailor-made for you and your family’s overall well-being.
Financial benefits including market-competitive compensation; a 401(k) savings plan vested from day one that offers a 6% match; performance and recognition-based incentives; and tuition assistance.
Access to additional benefits like mental healthcare as well as fertility and adoption assistance.
Flexibility: we provide workplace flexibility as well as our GEICO Flex program, which offers the ability to work from anywhere in the US for up to four weeks per year.
Company
GEICO
GEICO (Government Employees Insurance Company) has been providing affordable auto insurance since 1936. It is a subsidiary of Berkshire Hathaway.
Funding
Current Stage: Late Stage
Total Funding: unknown
Acquired: 1996-01-01
Company data provided by Crunchbase