Member of Technical Staff, Infrastructure Engineer jobs in United States

Odyssey · 1 month ago

Member of Technical Staff, Infrastructure Engineer

Odyssey is an AI lab pioneering general-purpose world models that will power the next generation of applications. They are seeking an Infrastructure Engineer to develop and operate their low-latency model inference platform, scale core data processing infrastructure, and design GPU-based training clusters for deep learning.

Artificial Intelligence (AI) · Graphic Design · Virtual Reality

Responsibilities

Develop and operate our low-latency model inference platform, ensuring high availability, scaling, and efficient resource utilization for Odyssey’s products
Engineer and scale our core data processing infrastructure (e.g., Flyte, Ray with k8s) to handle petabyte-scale datasets
Design, build, and maintain our large-scale, GPU-based training clusters for deep learning, focusing on high throughput and reliability
Automate infrastructure provisioning, configuration, monitoring, and alerting using Infrastructure as Code (IaC) principles
Drive performance tuning, cost optimization, and reliability improvements across the entire stack
Collaborate closely with researchers and product developers to understand their requirements, optimize their workflows, and improve platform usability

Qualifications

Python · Kubernetes · Infrastructure as Code · GPU-based training clusters · Distributed systems · Containerization · Collaboration · Communication skills

Required

Strong programming skills (e.g., Python, Go, or similar) and a solid understanding of software engineering best practices
Deep, hands-on experience with containerization (e.g., Docker), container orchestration (Kubernetes) and Infrastructure as Code (Terraform)
Proven experience building and managing large-scale, distributed systems with GPU computational workloads (e.g., compute platforms, data pipelines, or high-availability services)
Experience designing infrastructure for ML workloads where performance, parallelism, and data movement are critical
A collaborative mindset and excellent communication skills, with a passion for building developer-friendly platforms
Motivated by building for the frontier: you want to shape the compute and infrastructure foundation of a lab redefining how people create and interact with media

Company

Odyssey

Odyssey is an AI platform that enables storytellers to create cinematic video content and editable 3D scenes using visual AI technology.

Funding

Current Stage: Early Stage
Total Funding: $27M
Key Investors: EQT Ventures, Google Ventures
2024-11-13 · Series A · $18M
2024-07-12 · Seed · $9M

Leadership Team

Oliver Cameron
Co-Founder & CEO
Jeff Hawke
Co-Founder & CTO
Company data provided by Crunchbase