xAI · 2 weeks ago
AI/HPC Network Development Engineer - Networking
xAI is on a mission to create AI systems that can accurately understand the universe and aid humanity. They are seeking an AI/HPC Network Development Engineer to optimize network performance and availability for their training models and customer inference queries.
Artificial Intelligence (AI) · Foundational AI · Generative AI · Information Technology · Machine Learning
Responsibilities
Develop at hyper scale while optimizing performance and availability
Spend most of your days deep inside NCCL, building metric dashboards and tweaking configurations to ensure no performance is left on the table
Help design the next iteration of our back-end and front-end networks that will allow us to seamlessly build out new GPU infrastructure with little to no engineering assistance
Participate in a team on-call rotation and help on other scaling and maintenance efforts
Qualifications
Required
A minimum of 10 years designing and operating large-scale networks with 5 years in the Ethernet AI/HPC space
Deep understanding of congestion control on Ethernet, with InfiniBand an added bonus
Deep understanding of AI training and inference workloads and how they operate on the network. As part of this, you are able to use and debug NCCL and potentially commit to the library
Expertise in creating a portfolio of metrics for performance and operations to optimize the fleet for training and inference traffic
Experience with Python to automate away repetitive tasks and facilitate your daily job working with and analyzing large sets of data
Benefits
Equity
Comprehensive medical, vision, and dental coverage
Access to a 401(k) retirement plan
Short & long-term disability insurance
Life insurance
Various other discounts and perks
Company
xAI
xAI is an artificial intelligence startup that develops AI solutions and tools to enhance reasoning and search capabilities.
H1B Sponsorship
xAI has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data powered by US Department of Labor)
[Chart: Distribution of Different Job Fields Receiving Sponsorship]
Trends of Total Sponsorships: 2025 (1)
Funding
Current Stage: Late Stage
Total Funding: $42.73B
Key Investors: Neptune Digital Assets, SpaceX, Morgan Stanley
2026-01-06 · Series E · $20B
2025-12-11 · Secondary Market · $0.3M
2025-07-13 · Corporate Round · $5.32B
Company data provided by Crunchbase