Microsoft · 2 weeks ago
Senior ML Software Engineer - Quantization & Numerics
Microsoft is a leading technology company focused on empowering individuals and organizations to achieve more. They are seeking a Senior ML Software Engineer to design and develop quantization and numerics kernels for efficient deployment of LLM inference and training in Azure production environments, collaborating with various teams to optimize performance and ensure system efficiency.
Agentic AI · Application Performance Management · Artificial Intelligence (AI) · Business Development · DevOps · Information Services · Information Technology · Management Information Systems · Network Security · Software
Responsibilities
Design and develop novel quantization and numerics kernels to enable efficient deployment of LLM inference and training in Microsoft’s Azure production environments
Drive proof-of-concept efforts in software development and model optimization tooling to streamline deployment of quantized models
Analyze performance bottlenecks in quantized state-of-the-art LLM architectures and drive performance improvements
Prototype and evaluate emerging low-precision data formats through proof-of-concept implementations on a novel hardware accelerator SDK
Co-design model architectures optimized for low-precision deployment in close collaboration with companywide AI/ML teams
Work cross-functionally with data scientists and ML researchers/engineers across organizations to align on model accuracy and performance goals
Partner with hardware architecture and AI software framework teams to ensure end-to-end system efficiency
Qualifications
Required
Bachelor's Degree in Computer Science, Electrical or Computer Engineering, or related field AND 4+ years of industry experience in high-performance ML systems, GPU kernel development, or ML runtime/infrastructure development
OR Master's Degree in Computer Science, Electrical or Computer Engineering, or related field AND 3+ years of industry experience in high-performance ML systems, GPU kernel development, or ML runtime/infrastructure development
OR Doctorate in Computer Science, Electrical or Computer Engineering, or related field AND 1+ year(s) of industry experience in high-performance ML systems, GPU kernel development, or ML runtime/infrastructure development
Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screening: Microsoft Cloud Background Check. This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.
Preferred
Demonstrated experience delivering production-grade software in areas such as model compression, low-precision numerics (FP8, INT8/4, NVFP4, MX formats, etc.), low-level kernel development, and performance optimization
Proficiency with modern deep learning frameworks, including PyTorch, TensorFlow, TensorRT, and ONNX Runtime
Expertise in GPU/NPU kernel development using CUDA, Triton, ROCm, or comparable frameworks, and in fast model bring-up on a new stack
Strong understanding of Transformer and LLM architectures, with hands-on experience in optimization techniques such as quantization, pruning, tensor/parameter sharding, model parallelism, KV-cache optimization, and FlashAttention
Practical experience with large-scale model evaluation, including benchmarking state-of-the-art LLMs and fine-tuning (SFT or RL) large models
Solid programming skills in Python, C, and C++
Excellent communication abilities and a proven capacity to collaborate effectively in hybrid, team-oriented environments
Hands-on experience implementing and optimizing low-level linear algebra routines, including custom BLAS kernels, would be a plus
Deep knowledge of mixed-precision arithmetic units, including numerical formats and microarchitecture, is highly desirable
Company
Microsoft
Microsoft is a software corporation that develops, manufactures, licenses, supports, and sells a range of software products and services.
H1B Sponsorship
Microsoft has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data powered by the US Department of Labor)
[Chart: Distribution of Different Job Fields Receiving Sponsorship]
Trends of Total Sponsorships
2025 (9192)
2024 (9343)
2023 (7677)
2022 (11403)
2021 (7210)
2020 (7852)
Funding
Current Stage: Public Company
Total Funding: $1M
Key Investors: Technology Venture Investors
2022-12-09 · Post-IPO Equity
1986-03-13 · IPO
1981-09-01 · Series Unknown · $1M
Recent News
MarketScreener
2026-01-06
Company data provided by Crunchbase