Anthropic · 13 hours ago
Software Engineer, Inference Deployment
Anthropic is a public benefit corporation focused on creating reliable and beneficial AI systems. The Software Engineer on the Launch Engineering team will design and build the deployment infrastructure for inference code, ensuring efficient and uninterrupted service across various hardware platforms.
Artificial Intelligence (AI), Foundational AI, Generative AI, Information Technology, Machine Learning
Responsibilities
Own deployment orchestration that continuously moves validated inference builds into production across GPU, TPU, and Trainium fleets, unattended under normal conditions
Improve capacity-aware deployment scheduling to maximize deployment throughput against constrained accelerator budgets and variable fleet sizes
Extend deployment observability — dashboards and tooling that answer "what code is running in production," "where is my commit," and "what validation passed for this deploy"
Drive down cycle time from code merge to production with pipeline architectures that minimize serial dependencies and maximize parallelism
Optimize fleet rollout strategies for large-scale deployments across thousands of GPU, TPU, and Trainium chips, minimizing disruption to serving capacity
Evolve self-service model onboarding so that new models can be added to the continuous deployment pipeline without Launch Engineering involvement
Partner across the Inference organization with teams owning validation, autoscaling, and model routing to integrate deployment automation with their systems
Qualifications
Required
5+ years of experience building deployment, release, or delivery infrastructure at scale
Strong software engineering skills with experience designing systems that manage complex state machines and multi-stage pipelines
Experience with deployment systems where resource constraints shape the design — whether that's fleet capacity, network bandwidth, hardware availability, or coordinated rollout windows
A track record of building automation that measurably improves deployment velocity and reliability
Proficiency with Kubernetes-based deployments, rolling update mechanics, and container orchestration
Comfort working across the stack — from backend services and databases to CLI tools and web UIs
Strong communication skills and the ability to work closely with oncall engineers, model teams, and infrastructure partners
Education requirements: We require at least a Bachelor's degree in a related field or equivalent experience
Preferred
Experience with ML inference or training infrastructure deployment, particularly across multiple accelerator types (GPU, TPU, Trainium)
Background in capacity planning or resource-constrained scheduling (e.g., bin-packing, fleet management, job scheduling with hardware affinity)
Experience with progressive delivery in systems with long validation cycles: canary/soak testing, blue-green deployments, traffic shifting, automated rollback
Experience at companies with large-scale release engineering challenges (mobile release trains, monorepo deployments, multi-datacenter rollouts)
Experience with Python and/or Rust in production systems
Benefits
Optional equity donation matching
Generous vacation and parental leave
Flexible working hours
Company
Anthropic
Anthropic is an AI research company that focuses on the safety and alignment of AI systems with human values.
H1B Sponsorship
Anthropic has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (105)
2024 (13)
2023 (3)
2022 (4)
2021 (1)
Funding
Current Stage
Late Stage
Total Funding
$33.74B
Key Investors
Fidelity, ICONIQ Capital, Lightspeed Venture Partners, Google
2025-09-02 · Series F · $13B
2025-05-16 · Debt Financing · $2.5B
2025-03-03 · Series E · $3.5B
Company data provided by Crunchbase