Cerebras · 1 month ago

Deployment Engineer, AI Inference

Cerebras Systems builds the world's largest AI chip, providing unparalleled AI compute power. They are seeking a highly skilled Deployment Engineer to build and operate cutting-edge inference clusters, ensuring reliable and efficient deployment of AI workloads across their global infrastructure.

AI Infrastructure · Artificial Intelligence (AI) · Computer Hardware · Semiconductor · Software
Growth Opportunities
H1B Sponsor Likely

Responsibilities

Deploy AI inference replicas and cluster software across multiple datacenters
Operate across heterogeneous datacenter environments undergoing rapid 10x growth
Maximize capacity allocation and optimize replica placement using constraint-solver algorithms (see the placement sketch after this list)
Operate bare-metal inference infrastructure while supporting transition to K8S-based platform
Develop and extend telemetry, observability and alerting solutions to ensure deployment reliability at scale
Develop and extend a fully automated deployment pipeline to support fast software updates and capacity reallocation at scale
Translate technical and customer needs into actionable requirements for the Dev Infra, Cluster, Platform and Core teams
Stay up to date with the latest advancements in AI compute infrastructure and related technologies
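
As a rough illustration of what replica placement under capacity constraints involves, here is a minimal Python sketch. It is not Cerebras code: it uses a greedy heuristic in place of a real constraint solver, and every name in it (Datacenter, place_replicas, the site names and capacities) is a hypothetical placeholder.

from dataclasses import dataclass, field


@dataclass
class Datacenter:
    name: str
    capacity: int                      # replica slots available at this site
    replicas: list = field(default_factory=list)

    def free(self) -> int:
        return self.capacity - len(self.replicas)


def place_replicas(replicas, sites):
    """Greedy placement: each replica goes to the site with the most free slots."""
    placement = {}
    for r in replicas:
        best = max(sites, key=lambda s: s.free())
        if best.free() <= 0:
            raise RuntimeError(f"no capacity left for replica {r}")
        best.replicas.append(r)
        placement[r] = best.name
    return placement


if __name__ == "__main__":
    sites = [Datacenter("dc-east", 3), Datacenter("dc-west", 2)]
    print(place_replicas(["r1", "r2", "r3", "r4"], sites))

A production placer would also weigh factors such as network locality, hardware generation, and workload priorities, which is where constraint solvers pay off over a simple greedy rule.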

Qualifications

Python · Linux systems · Docker · Kubernetes · Telemetry · AWS infrastructure · Observability · Networking architecture · Automation · Orchestration · Fast-paced environment · Ownership mindset

Required

2-5 years of experience operating on-prem compute infrastructure (ideally in Machine Learning or High-Performance Computing), or developing and managing complex AWS control-plane infrastructure for hybrid deployments
Strong proficiency in Python for automation, orchestration, and deployment tooling
Solid understanding of Linux-based systems and command-line tools
Extensive knowledge of Docker containers and container orchestration platforms like K8S
Familiarity with spine-leaf (Clos) networking architecture
Proficiency with telemetry and observability stacks such as Prometheus, InfluxDB, and Grafana (see the instrumentation sketch after this list)
Strong ownership mindset and accountability for complex deployments
Ability to work effectively in a fast-paced environment
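
For the observability requirement above, here is a minimal Python sketch of exposing a health metric that Prometheus can scrape and Grafana can chart, using the prometheus_client library. The metric name, port, and update logic are illustrative assumptions, not details from the posting.

import random
import time

from prometheus_client import Gauge, start_http_server

# Hypothetical metric; a real deployment would export many such series per replica.
healthy_replicas = Gauge(
    "inference_replicas_healthy",
    "Number of inference replicas currently passing health checks",
)

if __name__ == "__main__":
    start_http_server(9100)            # serve /metrics on :9100 for Prometheus to scrape
    while True:
        # Placeholder: a real exporter would query the cluster's health API here.
        healthy_replicas.set(random.randint(0, 8))
        time.sleep(15)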

Company

Cerebras

Cerebras Systems delivers the world's fastest AI inference. We are powering the future of generative AI.

H1B Sponsorship

Cerebras has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (31)
2024 (16)
2023 (18)
2022 (17)
2021 (34)
2020 (23)

Funding

Current Stage: Late Stage
Total Funding: $1.82B
Key Investors: Alpha Wave Ventures, Vy Capital, Coatue

2025-12-03 · Secondary Market
2025-09-30 · Series G · $1.1B
2024-09-27 · Series Unknown

Leadership Team

Andrew Feldman
CEO & Founder
Bob Komin
Chief Financial Officer
Company data provided by Crunchbase