Sr Staff Software Engineer- AI jobs in United States

GEICO · 1 week ago

Sr Staff Software Engineer- AI

GEICO is seeking an experienced engineer with a passion for building high-performance, low-maintenance, zero-downtime platforms and applications. This role focuses on designing and deploying machine learning systems for intelligent incident detection and automated root cause analysis, and on leading the technical strategy for an AI-powered incident response system.

Auto Insurance · Financial Services · Government · Insurance · Internet · Mobile
H1B Sponsored

Responsibilities

Design and build a multi-agent AI platform where specialized agents autonomously detect, diagnose, and resolve issues through agent-to-agent (A2A) collaboration
Develop intelligent agents using LLMs and agentic frameworks that coordinate detection, diagnostic, remediation, and knowledge tasks with minimal human intervention
Define agent interaction protocols, A2A communication standards, and evaluation frameworks for agent decision quality and autonomous action safety
Architect vector database solutions (Milvus, pgvector, Qdrant) for semantic search and RAG to enable context-aware agent decision-making
Build end-to-end ML pipelines for severity classification, anomaly detection, failure pattern recognition, and impact forecasting using observability data
Establish scalable orchestration infrastructure for multi-agent workflows with CI/CD, automated evaluation, canary releases, and rollback strategies
Implement monitoring for agent interactions, A2A communication patterns, decision quality, data drift, and system reliability
Lead technical architecture ensuring scalability, observability, and integration with existing alerting, logging, and monitoring systems
Define standards for agent safety, explainability, governance, and human-in-the-loop controls for high-impact automated actions
Partner with SRE, Product, and Engineering teams to translate reliability goals into measurable ML objectives and maintain pragmatic technical roadmaps
Mentor engineers through complex AI platform implementations and establish best practices, coding standards, and technical documentation
Stay current with AI/ML and multi-agent systems; educate engineering leadership on emerging technologies

Qualifications

Machine Learning Systems · Multi-Agent AI · Site Reliability Engineering · Observability Tools · Python · Cloud Providers · End-to-End ML Lifecycle · Vector Databases · Data Warehousing · Problem Solving · Mentoring · Technical Documentation

Required

Experience building and deploying ML systems in production with cross-functional engineering teams
Fluency in at least two modern languages such as Python, Go, Java, C++, or C#, including object-oriented design
Experience architecting multi-component ML platforms using open-source/cloud-agnostic components, including datastores (PostgreSQL; NoSQL such as MongoDB, Cassandra, CosmosDB) and streaming (Kafka, Flink, or Spark Streaming)
Experience with end-to-end ML lifecycle: version control, CI/CD, Kubernetes, testing, monitoring, and production support
Experience with cloud providers (Azure, AWS, or GCP) in production ML environments
Experience with observability tools and distributed systems monitoring, logging, tracing, and root cause analysis
Experience building multi-agent systems using LLMs and agentic frameworks (e.g., LangChain, LangGraph, AutoGen, Semantic Kernel, CrewAI)
Hands-on experience with RAG, semantic search, and vector databases (e.g., Milvus, pgvector, Qdrant, Elasticsearch)
Experience designing human-in-the-loop workflows and safety controls for autonomous systems
Strong architecture and design skills with ability to influence technical direction and roadmap
Proven ability to solve complex problems with data-driven approaches
Experience fine-tuning or deploying open-source LLMs (Llama, Mistral, Phi) is a plus
Experience with data warehouse/lakehouse platforms (e.g., Snowflake, Databricks, Parquet, Delta, Iceberg)
10+ years of professional platform development or general development experience
8+ years of experience with architecture and design
6+ years of experience building and deploying machine learning systems in production
6+ years of experience in open-source frameworks
4+ years of experience with AWS, GCP, Azure, or another cloud service
2+ years of experience with LLMs, agentic AI frameworks, or multi-agent systems
Bachelor's degree in Computer Science, Information Systems, or equivalent education or work experience

Benefits

Comprehensive Total Rewards program that offers personalized coverage tailor-made for you and your family’s overall well-being.
Financial benefits including market-competitive compensation; a 401(k) savings plan vested from day one that offers a 6% match; performance and recognition-based incentives; and tuition assistance.
Access to additional benefits like mental healthcare as well as fertility and adoption assistance.
Supports flexibility: we provide workplace flexibility as well as our GEICO Flex program, which offers the ability to work from anywhere in the US for up to four weeks per year.

Company

GEICO (Government Employees Insurance Company) has been providing affordable auto insurance since 1936. It is a subsidiary of Berkshire Hathaway.

H1B Sponsorship

GEICO has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (128)
2024 (277)
2023 (338)
2022 (212)
2021 (148)
2020 (205)

Funding

Current Stage: Late Stage
Total Funding: unknown
Acquired: 1996-01-01

Leadership Team

Todd Combs
Chairman, President, and Chief Executive Officer
Clayton Johnson
Sr. Director of Product Management
Company data provided by Crunchbase