Software Engineer, Inference Deployment jobs in United States

Anthropic · 13 hours ago

Software Engineer, Inference Deployment

Anthropic is a public benefit corporation focused on creating reliable and beneficial AI systems. The Software Engineer on the Launch Engineering team will design and build the deployment infrastructure for inference code, ensuring efficient and uninterrupted service across various hardware platforms.
Artificial Intelligence (AI), Foundational AI, Generative AI, Information Technology, Machine Learning
H1B Sponsored

Responsibilities

Own deployment orchestration that continuously moves validated inference builds into production across GPU, TPU, and Trainium fleets, unattended under normal conditions
Improve capacity-aware deployment scheduling to maximize deployment throughput against constrained accelerator budgets and variable fleet sizes
Extend deployment observability — dashboards and tooling that answer "what code is running in production," "where is my commit," and "what validation passed for this deploy"
Drive down cycle time from code merge to production with pipeline architectures that minimize serial dependencies and maximize parallelism
Optimize fleet rollout strategies for large-scale deployments across thousands of GPU, TPU, and Trainium chips, minimizing disruption to serving capacity
Evolve self-service model onboarding so that new models can be added to the continuous deployment pipeline without Launch Engineering involvement
Partner across the Inference organization with teams owning validation, autoscaling, and model routing to integrate deployment automation with their systems

Qualifications

Deployment infrastructure, Kubernetes, Automation, Resource management, Python, Communication skills

Required

5+ years of experience building deployment, release, or delivery infrastructure at scale
Strong software engineering skills with experience designing systems that manage complex state machines and multi-stage pipelines
Experience with deployment systems where resource constraints shape the design — whether that's fleet capacity, network bandwidth, hardware availability, or coordinated rollout windows
A track record of building automation that measurably improves deployment velocity and reliability
Proficiency with Kubernetes-based deployments, rolling update mechanics, and container orchestration
Comfort working across the stack — from backend services and databases to CLI tools and web UIs
Strong communication skills and the ability to work closely with oncall engineers, model teams, and infrastructure partners
Education requirements: We require at least a Bachelor's degree in a related field or equivalent experience

Preferred

Experience with ML inference or training infrastructure deployment, particularly across multiple accelerator types (GPU, TPU, Trainium)
Background in capacity planning or resource-constrained scheduling (e.g., bin-packing, fleet management, job scheduling with hardware affinity)
Experience with progressive delivery in systems with long validation cycles: canary/soak testing, blue-green deployments, traffic shifting, automated rollback
Experience at companies with large-scale release engineering challenges (mobile release trains, monorepo deployments, multi-datacenter rollouts)
Experience with Python and/or Rust in production systems

Benefits

Optional equity donation matching
Generous vacation and parental leave
Flexible working hours

Company

Anthropic

Anthropic is an AI research company that focuses on the safety and alignment of AI systems with human values.

H1B Sponsorship

Anthropic has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data from the US Department of Labor)
[Chart: Distribution of Different Job Fields Receiving Sponsorship]
Trends of Total Sponsorships
2025 (105)
2024 (13)
2023 (3)
2022 (4)
2021 (1)

Funding

Current Stage
Late Stage
Total Funding
$33.74B
Key Investors
Fidelity, ICONIQ Capital, Lightspeed Venture Partners, Google
2025-09-02 · Series F · $13B
2025-05-16 · Debt Financing · $2.5B
2025-03-03 · Series E · $3.5B

Leadership Team

Dario Amodei
Co-Founder and CEO
Daniela Amodei
President and Co-Founder
Company data provided by Crunchbase