Compiler Architect jobs in United States

d-Matrix · 1 week ago

Compiler Architect

d-Matrix is focused on unleashing the potential of generative AI to transform technology. They are seeking a hands-on Software Compiler Architect to drive the design and implementation of a scalable MLIR-based compiler framework optimized for deploying large-scale AI models in cloud environments.

Artificial Intelligence (AI) · Cloud Infrastructure · Data Center · Semiconductor

H1B Sponsor Likely

Responsibilities

Architect the MLIR-based compiler for cloud inference workloads, focusing on efficient mapping of large-scale AI models (e.g., LLMs, Transformers, Torch-MLIR) onto distributed compute and memory hierarchies
Lead the development of compiler passes for model partitioning, operator fusion, tensor layout optimization, memory tiling, and latency-aware scheduling
Design support for hybrid offline/online compilation and deployment flows with runtime-aware mapping, allowing for adaptive resource utilization and load balancing in cloud scenarios
Define compiler abstractions that interoperate efficiently with runtime systems, orchestration layers, and cloud deployment frameworks
Drive scalability, reproducibility, and performance through well-designed IR transformations and distributed execution strategies
Mentor and guide a team of compiler engineers to deliver high-performance inference-optimized software stacks

Qualifications

MLIR · LLVM · Model optimization · Cloud infrastructure · AI frameworks · Compiler design · Heterogeneous compute · Leadership · Communication

Required

BS (15+ yrs), MS (12+ yrs), or PhD (10+ yrs) in Computer Science or Electrical Engineering, including 12+ years of experience in front-end compiler and systems software development, with a focus on ML inference
Deep experience in designing or leading compiler efforts using MLIR, LLVM, Torch-MLIR, or similar frameworks
Strong understanding of model optimization for inference: quantization, fusion, tensor layout transformation, memory hierarchy utilization, and scheduling
Expertise in deploying ML models to heterogeneous compute environments, with specific attention to latency, throughput, and resource scaling in cloud systems
Proven track record working with AI frameworks (e.g., PyTorch, TensorFlow), ONNX, and hardware backends
Experience with cloud infrastructure, including resource provisioning, distributed execution, and profiling tools

Preferred

Experience targeting inference accelerators (AI ASICs, FPGAs, GPUs) in cloud-scale deployments
Knowledge of cloud deployment orchestration (e.g., Kubernetes, containerized AI workloads)
Strong leadership skills with experience mentoring teams and collaborating with large-scale software and hardware organizations
Excellent written and verbal communication; capable of presenting complex compiler architectures and trade-offs to both technical and executive stakeholders

Company

d-Matrix

d-Matrix is a platform that enables data centers to handle large-scale generative AI inference with high throughput and low latency.

H1B Sponsorship

d-Matrix has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
[Chart: Distribution of job fields receiving sponsorship, highlighting fields similar to this job]
Trends of Total Sponsorships
2025 (20)
2024 (15)
2023 (8)
2022 (7)

Funding

Current Stage: Growth Stage
Total Funding: $429M
Key Investors: Temasek Holdings, TSVC

2025-11-12 · Series C · $275M
2023-09-06 · Series B · $110M
2022-04-20 · Series A · $44M

Leadership Team

Peter Buckingham
Senior Vice President, Software Engineering
Company data provided by Crunchbase