Lambda · 1 month ago
Data Center Operations Systems Engineer - Atlanta
Lambda is a leader in AI cloud infrastructure serving a diverse range of customers, from AI researchers to large enterprises. The Data Center Operations Systems Engineer will ensure optimal performance and uptime of AI-IaaS infrastructure by managing end-to-end data center operations, including deployment and operational efficiency.
AI Infrastructure · Artificial Intelligence (AI) · Cloud Computing · Data Center · GPU · Machine Learning
Responsibilities
Ensure new server, storage and network infrastructure is properly racked, labeled, cabled, and configured
Document data center layout and network topology in DCIM software
Work with supply chain & manufacturing teams to ensure timely deployment of systems and project plans for large-scale deployments
Participate in data center capacity and roadmap planning with sales and customer success teams to allocate floorspace
Assess current and future state data center requirements based on growth plans and technology trends
Manage a parts depot inventory and track equipment through the delivery-store-stage-deploy-handoff process in each of our data centers
Work closely with HW Support team to ensure data center infrastructure-related support tickets are resolved
Work with RMA team to ensure faulty parts are returned and replacements are ordered
Create installation standards and documentation for placement, labeling, and cabling to drive consistency and discoverability across all data centers
Serve as a subject-matter expert on data center deployments as part of sales engagement for large-scale deployments in our data centers and at customer sites
Qualifications
Required
Have experience with critical infrastructure systems supporting data centers, such as power distribution, air flow management, environmental monitoring, capacity planning, DCIM software, structured cabling, and cable management
Have strong Linux administration experience
Have experience in setting up networking appliances (Ethernet and InfiniBand) across multiple data center locations
You are action-oriented and have a strong willingness to learn
You are willing to travel for the bring-up of new data center locations
Preferred
Experience with troubleshooting the following network layers, technologies, and system protocols: TCP/IP, UDP/IP, BGP, OSPF, SNMP, SSL, HTTP, FTP, SSH, Syslog, DHCP, DNS, RDP, NetBIOS, IP routing, Ethernet, switched Ethernet, 802.11x, NFS, and VLANs
Experience with working in large-scale distributed data center environments
Experience working with auditors to meet all compliance requirements (ISO/SOC)
Benefits
Health, dental, and vision coverage for you and your dependents
Wellness and commuter stipends for select roles
401k Plan with 2% company match (USA employees)
Flexible paid time off plan that we all actually use
Company
Lambda
Lambda is a cloud-based platform that provides high-performance GPU hardware and cloud infrastructure for AI model training and inference.
H1B Sponsorship
Lambda has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (16)
2024 (1)
2023 (3)
2022 (2)
2021 (2)
2020 (3)
Funding
Current Stage: Late Stage
Total Funding: $3.19B
Key Investors: TWG Global, JP Morgan, Macquarie Group
2025-11-18 · Series E · $1.5B
2025-08-19 · Debt Financing · $275M
2025-02-19 · Series D · $480M
Company data provided by Crunchbase