Microsoft · 1 week ago
Member of Technical Staff, Pre-Training Infrastructure - MAI Superintelligence Team
Microsoft AI is looking for a Member of Technical Staff, Pre-Training Infrastructure, to help build the next wave of capabilities for its personalized AI assistant, Copilot. The role involves designing and optimizing distributed training infrastructure for large-scale GPU clusters and contributing to the development of AI models powering innovative products.
Agentic AI, Application Performance Management, Artificial Intelligence (AI), Business Development, DevOps, Information Services, Information Technology, Management Information Systems, Network Security, Software
Responsibilities
Design, implement, test, and optimize distributed training infrastructure in Python and C++ for large-scale GPU clusters
Profile, benchmark, and debug performance bottlenecks across compute, memory, networking, and storage subsystems
Optimize collective communication libraries (e.g., NCCL) for emerging NVLink and InfiniBand topologies
Collaborate with hardware teams to optimize for next-generation accelerators (NVIDIA, AMD, and beyond)
Gather data and insights to develop the pretraining compute roadmap
Care deeply about conversational AI and its deployment
Actively contribute to the development of AI models powering our innovative products
Find solutions to overcome roadblocks and deliver your work to users quickly and iteratively
Enjoy working in a fast-paced, design-driven product development cycle
Embody our Culture and Values
Qualifications
Required
Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
Preferred
Master's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR Bachelor's Degree in Computer Science or related technical field AND 12+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
Experience in distributed computing and large-scale systems
Experience with GPU programming (CUDA, NCCL) and frameworks such as PyTorch
Proven ability to profile, benchmark, and optimize performance-critical systems
Experience in leading technical projects and supporting architectural decisions with data
Experience building infrastructure for large-scale machine learning or generative AI workloads
Experience in networking (InfiniBand, NVLink), storage systems, or distributed training parallelisms
Track record of contributing to high-performance computing or large-scale AI infrastructure projects
Company
Microsoft
Microsoft is a software corporation that develops, manufactures, licenses, supports, and sells a range of software products and services.
H1B Sponsorship
Microsoft has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by US Department of Labor)
Trends of Total Sponsorships
2025 (9192)
2024 (9343)
2023 (7677)
2022 (11403)
2021 (7210)
2020 (7852)
Funding
Current Stage: Public Company
Total Funding: $1M
Key Investors: Technology Venture Investors
2022-12-09: Post-IPO Equity
1986-03-13: IPO
1981-09-01: Series Unknown · $1M
Company data provided by Crunchbase