xAI · 2 months ago
Software Engineer - Reliability, bare metal
xAI is a company dedicated to creating AI systems that understand the universe and aid humanity. They are looking for a Software Engineer to join their SuperComputing team to ensure the reliability and performance of their high-performance computing infrastructure, collaborating with cross-functional teams to support AI research.
Artificial Intelligence (AI), Foundational AI, Generative AI, Information Technology, Machine Learning
Responsibilities
Design, implement, and maintain robust, scalable infrastructure for supercomputing environments
Monitor and optimize system performance, ensuring high availability and minimal downtime
Develop automation tools and scripts to streamline operations and improve system reliability
Troubleshoot complex issues across distributed systems, networks, and storage solutions
Collaborate with AI researchers and engineers to support compute-intensive workloads
Implement security best practices to protect sensitive data and infrastructure
Contribute to capacity planning and disaster recovery strategies
Participate in an on-call rotation to ensure 24/7 system reliability
Qualifications
Required
Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience)
3+ years of experience in site reliability engineering, DevOps, or systems engineering
Proficiency in Linux system administration and scripting (e.g., Python, Bash)
Experience with containerization (e.g., Docker, Kubernetes) and cloud platforms (e.g., AWS, GCP, Azure)
Strong understanding of networking, distributed systems, and storage technologies
Familiarity with HPC environments, GPU clusters, or large-scale data processing
Excellent problem-solving skills and ability to work in a fast-paced, dynamic environment
Strong communication skills and a collaborative mindset
Preferred
Experience with Infrastructure as Code (e.g., Terraform, Ansible) or monitoring tools (e.g., Prometheus, Grafana)
Benefits
Equity
Comprehensive medical, vision, and dental coverage
Access to a 401(k) retirement plan
Short- and long-term disability insurance
Life insurance
Various other discounts and perks
Company
xAI
xAI is an artificial intelligence startup that develops AI solutions and tools to enhance reasoning and search capabilities.
H1B Sponsorship
xAI has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by the US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship (chart; legend: represents job field similar to this job)
Trends of Total Sponsorships: 2025 (1)
Funding
Current Stage: Late Stage
Total Funding: $42.73B
Key Investors: Neptune Digital Assets, SpaceX, Morgan Stanley
2026-01-06 · Series E · $20B
2025-12-11 · Secondary Market · $0.3M
2025-07-13 · Corporate Round · $5.32B
Company data provided by Crunchbase