Cerebras · 2 days ago
Deployment Engineer, AI Inference
Cerebras Systems builds the world's largest AI chip, focusing on providing industry-leading training and inference speeds for machine learning applications. The Deployment Engineer will be responsible for deploying and operating AI inference clusters, ensuring reliable and efficient deployment of workloads across the company's global infrastructure.
Artificial Intelligence (AI), Computer, Hardware, Semiconductor, Software
Responsibilities
Deploy AI inference replicas and cluster software across multiple datacenters
Operate across heterogeneous datacenter environments undergoing rapid 10x growth
Maximize capacity allocation and optimize replica placement using constraint-solver algorithms
Operate bare-metal inference infrastructure while supporting transition to K8S-based platform
Develop and extend telemetry, observability and alerting solutions to ensure deployment reliability at scale
Develop and extend a fully automated deployment pipeline to support fast software updates and capacity reallocation at scale
Translate technical and customer needs into actionable requirements for the Dev Infra, Cluster, Platform and Core teams
Stay up to date with the latest advancements in AI compute infrastructure and related technologies
Qualifications
Required
2-5 years of experience in operating on-prem compute infrastructure (ideally in Machine Learning or High-Performance Compute) or developing and managing complex AWS plane infrastructure for hybrid deployments
Strong proficiency in Python for automation, orchestration, and deployment tooling
Solid understanding of Linux-based systems and command-line tools
Extensive knowledge of Docker containers and container orchestration platforms like K8S
Familiarity with spine-leaf (Clos) networking architecture
Proficiency with telemetry and observability stacks such as Prometheus, InfluxDB and Grafana
Strong ownership mindset and accountability for complex deployments
Ability to work effectively in a fast-paced environment
Company
Cerebras
Cerebras Systems provides the world's fastest AI inference. We are powering the future of generative AI.
Funding
Current Stage: Late Stage
Total Funding: $1.82B
Key Investors: Alpha Wave Ventures, Vy Capital, Coatue
2025-12-03 · Secondary Market
2025-09-30 · Series G · $1.1B
2024-09-27 · Series Unknown
Company data provided by crunchbase