
Cerebras · 2 days ago

Deployment Engineer, AI Inference

Cerebras Systems builds the world's largest AI chip, focusing on providing industry-leading training and inference speeds for machine learning applications. The Deployment Engineer will be responsible for deploying and operating AI inference clusters, ensuring reliable and efficient deployment of workloads across their global infrastructure.

Artificial Intelligence (AI) · Computer Hardware · Semiconductor · Software
Growth Opportunities

Responsibilities

Deploy AI inference replicas and cluster software across multiple datacenters
Operate across heterogeneous datacenter environments undergoing rapid 10x growth
Maximize capacity allocation and optimize replica placement using constraint-solver algorithms
Operate bare-metal inference infrastructure while supporting transition to K8S-based platform
Develop and extend telemetry, observability and alerting solutions to ensure deployment reliability at scale
Develop and extend a fully automated deployment pipeline to support fast software updates and capacity reallocation at scale
Translate technical and customer needs into actionable requirements for the Dev Infra, Cluster, Platform and Core teams
Stay up to date with the latest advancements in AI compute infrastructure and related technologies

Qualifications

Python · Linux systems · Docker · Kubernetes · Telemetry · On-prem compute infrastructure · Observability · Spine-leaf networking · Ownership mindset · Fast-paced environment

Required

2-5 years of experience operating on-prem compute infrastructure (ideally in Machine Learning or High-Performance Computing), or developing and managing complex AWS infrastructure for hybrid deployments
Strong proficiency in Python for automation, orchestration, and deployment tooling
Solid understanding of Linux-based systems and command-line tools
Extensive knowledge of Docker containers and container orchestration platforms such as Kubernetes (K8s)
Familiarity with spine-leaf (Clos) networking architecture
Proficiency with telemetry and observability stacks such as Prometheus, InfluxDB and Grafana
Strong ownership mindset and accountability for complex deployments
Ability to work effectively in a fast-paced environment

Company

Cerebras

Cerebras Systems delivers the world's fastest AI inference. We are powering the future of generative AI.

Funding

Current Stage: Late Stage
Total Funding: $1.82B
Key Investors: Alpha Wave Ventures · Vy Capital · Coatue

2025-12-03 · Secondary Market
2025-09-30 · Series G · $1.1B
2024-09-27 · Series Unknown

Leadership Team

Andrew Feldman
Founder and CEO
Bob Komin
Chief Financial Officer
Company data provided by Crunchbase