Lambda · 11 hours ago

Senior Data Center Operations Engineer - Los Angeles, CA

Vernon, CA

Full-time

Onsite

Mid, Senior Level

$128K/yr - $170K/yr

3+ years exp

Lambda, The Superintelligence Cloud, is a leader in AI cloud infrastructure serving tens of thousands of customers. They are seeking a Senior Data Center Operations Engineer to ensure proper configuration and management of data center infrastructure, troubleshoot advanced systems, and collaborate with various teams to support large-scale deployments.

AI Infrastructure · Artificial Intelligence (AI) · Cloud Computing · Data Center · GPU · Machine Learning

Comp. & Benefits

H1B Sponsor Likely

Responsibilities

Ensure new server, storage and network infrastructure is properly racked, labeled, cabled, and configured

Troubleshoot hardware and software issues in some of the world’s most advanced GPU and Networking systems

Document and update data center layout and network topology in DCIM software

Work with supply chain & manufacturing teams to ensure timely deployment of systems and project plans for large-scale deployments

Manage a parts depot inventory and track equipment through the delivery-store-stage-deploy-handoff process in each of our data centers

Partner with HW Support teams to ensure that data center hardware incidents requiring higher-level troubleshooting are resolved and reported on, and that solutions are disseminated to the larger operations organization

Work with the RMA team to ensure faulty parts are returned and replacements are ordered

Follow installation standards and documentation for placement, labeling, and cabling to drive consistency and discoverability across all data centers

Improve installation standards, MOPs, and runbooks

Act as a technical escalation point for DC infrastructure issues

Participate in an on-call rotation, serving as an escalation point for data center incidents

Qualifications

Data center infrastructure · Power distribution systems · Environmental monitoring · DCIM software · Network fundamentals · Cable management · Linux administration · High Performance Compute · Project management · Soft skills

Required

Have strong experience with critical infrastructure systems supporting data centers, such as power distribution, air flow management, environmental monitoring, capacity planning, DCIM software, structured cabling, and cable management

Are familiar with carrier DIA circuit testing and turn-ups, understanding LOAs, and fiber testing and troubleshooting

Have a solid understanding of cable, fiber, and optics and their different use cases

Solid understanding of single-phase and three-phase power theory, including PDU balancing and why it is important

Baseline network fundamentals (CCNA preferred but not required)

Knowledge of cold aisle and hot aisle containment

Solid understanding of server hardware and boot process (PXE, DHCP, & TFTP)

Work with product management, support, and other teams to align operational capabilities with company goals

Translate business priorities into technical and operational requirements

Support cross-functional projects where infrastructure plays a critical role

Are action-oriented and willing to train junior staff on best practices

Are willing to travel to bring up new data center locations as needed

Preferred

Have 3+ years experience with critical infrastructure systems supporting data centers, such as power distribution, air flow management, environmental monitoring, capacity planning, DCIM software, structured cabling, and cable management

Experience with or knowledge of network topology and configurations and 400 Gb/s InfiniBand architectures

Experience with project management

Have 3+ years working with and reporting from ticketing systems such as ServiceNow, Jira, and Zendesk

Experience with Linux administration

Experience with High Performance Compute GPU systems (air- or water-cooled), especially NVIDIA NVL72

Experience with troubleshooting the following network layers, technologies, and system protocols: TCP/IP, UDP/IP, BGP, OSPF, SNMP, SSL, HTTP, FTP, SSH, Syslog, DHCP, DNS, RDP, NetBIOS, IP routing, Ethernet, switched Ethernet, 802.11x, NFS, and VLANs

Benefits

Generous cash & equity compensation

Health, dental, and vision coverage for you and your dependents

Wellness and commuter stipends for select roles

401k Plan with 2% company match (USA employees)

Flexible paid time off plan that we all actually use

Company

Lambda

Lambda is a cloud-based platform that provides high-performance GPU hardware and cloud infrastructure for AI model training and inference.

Founded in 2012

San Jose, California, USA

501-1000 employees

https://lambda.ai

H1B Sponsorship

Lambda has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for reference. (Data powered by the US Department of Labor)

Distribution of Different Job Fields Receiving Sponsorship

Trends of Total Sponsorships

2025 (16)

2024 (1)

2023 (3)

2022 (2)

2021 (2)

2020 (3)

Funding

Current Stage

Late Stage

Total Funding

$3.19B

Key Investors

TWG Global · JP Morgan · Macquarie Group

2025-11-18 · Series E · $1.5B

2025-08-19 · Debt Financing · $275M

2025-02-19 · Series D · $480M

Leadership Team

Stephen Balaban

Co-founder, CEO

Michael Balaban

Co-founder, CTO

Recent News

SiliconANGLE

AI cloud provider Lambda reportedly raising $350M round

2026-01-11

Business Wire

Lambda Appoints Leonard Speiser as Chief Operating Officer

2026-01-09

Techmeme

Source: Lambda, which rents access to AI chips and is backed by Nvidia, is in talks to raise $350M+ led by Mubadala Capital, ahead of an IPO planned for H2 2026 (The Information)

2026-01-09

Company data provided by Crunchbase