Inferact
Member of Technical Staff, Cloud Orchestration
Inferact is focused on growing vLLM into the world's AI inference engine. They are seeking a cloud orchestration engineer to build the operational backbone for vLLM, ensuring reliable performance at scale through effective cluster management, deployment automation, and production monitoring.
Computer Software
Responsibilities
Design the systems for cluster management, deployment automation, and production monitoring
Ensure that vLLM deployments are observable, debuggable, and recoverable
Turn operational complexity into infrastructure that just works
Qualifications
Required
Bachelor's degree or equivalent experience in computer science, engineering, or similar
Strong experience with Kubernetes and container orchestration at scale
Proficiency in Python, Rust, or Go, and with infrastructure-as-code tools (e.g., Terraform, Helm)
Experience managing GPU clusters and debugging hardware issues
Understanding of CI/CD pipelines and automation frameworks
Ability to work across cloud platforms (AWS, GCP, Azure) and on-premises infrastructure
Preferred
Experience with ML-specific orchestration tools (Ray, Slurm)
Knowledge of GPU scheduling, multi-tenancy, and resource optimization
Familiarity with vLLM deployment patterns and configuration
Track record of improving operational reliability for ML systems
Experience deploying inference systems on large-scale GPU clusters
Benefits
Generous health, dental, and vision benefits
401(k) company match
Company
Inferact
Inferact is a startup founded by creators and core maintainers of vLLM, the most popular open-source LLM inference engine.
Funding
Current Stage
Early Stage