Software Engineer - AI - Innovation Engineering jobs in United States

Costco Wholesale · 11 hours ago

Software Engineer - AI - Innovation Engineering

Costco Wholesale is the third-largest retailer in the world, known for its employee-centric culture and commitment to community service. The Software Engineer - AI will focus on the development and maintenance of the enterprise AI Platform as a Service, contributing to building scalable and modular AI solutions for the organization.
E-Commerce, Retail, Sports, Ticketing, Wholesale
Comp. & Benefits
No H1B

Responsibilities

Develops the conceptual systems architecture design and the supporting technologies needed to enable new and/or enhanced functionality within a given product/application, applying principles that promote availability, reusability, interoperability, and security into the design framework
Identifies deficiencies within a product/application’s code base and identifies opportunities to improve overall code quality
Collaborates with team members (e.g., Systems Architects, Systems Analysts) to define project specifications and release documentation for all phases of the product development cycle from product definition to design, through implementation
Conducts peer code reviews for the software changes made by other Engineers within a team
Maintains and evolves core AI platform services, such as shared memory banks and session management. This includes managing the central reasoning engine
Architects and optimizes modular agentic framework templates using standardized orchestration SDKs to manage complex, stateful workflows
Creates and maintains a library of standardized capability servers using the model context protocol, enabling agents to securely orchestrate tasks across enterprise data platforms, customer relationship management systems, and distributed microservices
Integrates the AI Platform as a Service with existing enterprise infrastructure. This includes developing code for authentication systems, logging pipelines, and CI/CD workflows
Manages and optimizes the platform’s semantic retrieval architecture, ensuring high-speed, low-latency access to grounded enterprise knowledge through a unified discovery and retrieval engine
Engineers and optimizes discovery agents that leverage semantic retrieval and neural search platforms to ingest, validate, and cite evidence from unified multimodal lakehouses
Establishes Agent Identity (IAM) protocols to ensure every autonomous action is authenticated, authorized, and logged under a secure service principal
Implements and scales automated AI performance benchmarking systems to continuously monitor mission success rates and proactively identify regressions in reasoning, safety guardrails, or autonomous decision-making
Serves as the primary responder for systemic platform anomalies, utilizing distributed traceability and aggregated telemetry to diagnose and resolve bottlenecks within complex multi-agent reasoning chains
Maintains internal documentation and SDKs that empower other software teams to onboard their use cases onto the AI Platform as a Service
Ensures the longevity, scalability, and quality of our systems through continuous improvement, comprehensive documentation, meticulous profiling, and significant performance enhancements

Qualifications

API design, Microservices, AI Platform development, Asynchronous orchestration, Semantic retrieval, Infrastructure as Code, CI/CD, Container orchestration, Distributed systems, Communication, Problem-solving, Self-motivated, Detail-oriented

Required

5+ years of experience in back-end software development with a focus on API design and microservices
5+ years of experience with API development, with an emphasis on security and performance
5+ years of experience with microservice-based debugging and performance testing
5+ years of experience developing within an agile methodology
1+ years of hands-on experience building with LLM APIs, function calling, or orchestration frameworks
Expert-level experience architecting and maintaining containerized autonomous workloads across elastic container orchestration and scalable serverless runtimes, with a focus on high-availability event-driven architectures
Extensive experience managing high-scale semantic indices and engineered multimodal data pipelines that unify structured relational repositories with unstructured knowledge for agentic grounding
Expert-level proficiency in asynchronous orchestration frameworks utilizing type-safe schema validation and high-performance API gateways. Complementary mastery of statically typed systems for engineering highly concurrent, low-latency agentic back-end infrastructure
Expert mastery of stateful graph orchestration and multi-agent coordination frameworks, with a proven ability to design complex cyclic reasoning loops and automated task delegation systems
Proficiency in architecting semantic retrieval layers, attribute-aware discovery, and stateful persistence systems to provide high-fidelity long-term context for autonomous agents
Deep understanding of MCP, A2A, REST/gRPC APIs, OAuth 2.0 security, and function calling mechanics
Experience with Infrastructure as Code and CI/CD for prompt engineering and model deployment
Experience leading technical workstreams, translating business problems into AI-native architectures
Strong verbal and written communication skills, with the ability to communicate with both technical and business audiences
Ability to work under pressure in a crisis with a strong sense of urgency
Responsible, conscientious, organized, self-motivated, and able to work with limited supervision
Detail-oriented, with strong problem-solving skills and the ability to analyze potential future issues
Able to support off-hours work as required, including weekends, holidays, and 24/7 on-call responsibilities on a rotational basis

Preferred

Bachelor's degree in Computer Science, Software Engineering, or a related technical field
Master's degree or PhD with a focus on Distributed Systems, AI Orchestration, or Machine Learning
Google Cloud Professional Data Engineer, Google Cloud Professional Cloud Architect, or any Agentic AI Specialty Certification focusing on Multi-Agent Systems and Autonomous Reasoning
3+ years of experience with distributed cache technologies
Experience with deploying and configuring Cloud Platform resources
Experience working in a retail ecommerce environment
Proficient in Google Workspace applications, including Sheets, Docs, Slides, and Gmail

Benefits

Paid time off
Health benefits - medical/dental/vision/hearing aid/pharmacy/behavioral health/employee assistance
Health care reimbursement account
Dependent care assistance plan
Short-term disability and long-term disability insurance
AD&D insurance
Life insurance
401(k)
Stock purchase plan for eligible employees

Company

Costco Wholesale

Costco Wholesale is a multibillion-dollar global retailer with warehouse club operations in 14 countries.

Funding

Current Stage: Public Company
Total Funding: unknown
IPO: 1985-12-05

Leadership Team

Ron Vachris
President & COO
Russ Miller
Senior Executive Vice President, COO - Warehouse Operations - U.S. & Mexico
Company data provided by Crunchbase