Lambda · 4 weeks ago
Forward Deployed Engineer (Site Reliability / Infrastructure)
Lambda, The Superintelligence Cloud, is a leader in AI cloud infrastructure serving tens of thousands of customers. They are seeking a Forward Deployed Engineer to embed directly with a strategic customer, serving as the technical bridge between Lambda and their team while delivering impactful solutions and optimizing infrastructure for AI/ML workloads.
AI Infrastructure · Artificial Intelligence (AI) · Cloud Computing · GPU · Machine Learning
Responsibilities
Embed on-site with a named strategic customer, becoming an extension of their team
Act as the primary technical liaison between Lambda and the customer organization
Navigate ambiguous requirements to identify root problems and define clear technical solutions
Drive alignment across internal Lambda teams and customer stakeholders
Scope, sequence, and build full-stack solutions that deliver measurable business value
Design and implement infrastructure optimizations for AI/ML workloads at scale
Debug complex distributed systems issues across the infrastructure stack
Ship iteratively and learn fast, adjusting approach based on customer feedback and results
Identify reusable patterns from customer engagements that can scale across Lambda's customer base
Surface field intelligence that influences Lambda's product roadmap
Document and share learnings to elevate the capabilities of the broader team
Represent Lambda with executive presence in high-stakes customer interactions
Qualifications
Required
6+ years of experience in an SRE, software engineering, or similar role, with deep knowledge of running Linux clusters and systems
Strong programming skills in Go and Python; experience with GitOps (e.g., ArgoCD), Helm, and Kubernetes operators
Proven experience operating Kubernetes clusters in production environments (on-prem, EKS, GKE, or similar)
Hands-on experience with AI/ML workload management tools (Volcano, Kubeflow, or similar)
Ability to work independently with limited direction or as part of a team
Familiarity with observability tools such as Prometheus, Grafana, and Fluent Bit, as well as CI/CD pipelines
Proven experience provisioning Kubernetes using tools such as kubeadm, Cluster API, or similar
Excellent communication skills with the ability to translate technical complexity for diverse audiences
Executive presence and ability to represent Lambda in customer-facing situations
Comfort operating in ambiguous environments with competing priorities
Strong bias for action and shipping iteratively
Preferred
Deep Kubernetes expertise: CRDs, CSI, CNI, and experience writing Kubernetes operators
Exposure to HPC clusters, AI/ML workloads, or large-scale GPU clusters
Hybrid or multi-cloud Kubernetes environment experience
Contributions to CNCF projects or Kubernetes SIGs
Benefits
Health, dental, and vision coverage for you and your dependents
Wellness and commuter stipends for select roles
401(k) plan with 2% company match (USA employees)
Flexible paid time off plan that we all actually use
Company
Lambda
Lambda is a cloud-based platform that provides high-performance GPU hardware and cloud infrastructure for AI model training and inference.
H1B Sponsorship
Lambda has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by US Department of Labor)
Trends of Total Sponsorships
2025 (16)
2024 (1)
2023 (3)
2022 (2)
2021 (2)
2020 (3)
Funding
Current Stage: Late Stage
Total Funding: $3.19B
Key Investors: TWG Global, JP Morgan, Macquarie Group
2025-11-18 · Series E · $1.5B
2025-08-19 · Debt Financing · $275M
2025-02-19 · Series D · $480M
Company data provided by Crunchbase