
d-Matrix · 9 hours ago

Senior Staff Machine Learning Engineer - Frameworks

d-Matrix is a pioneering company specializing in data center AI inferencing solutions, focused on unleashing the potential of generative AI. They are seeking a Senior Staff Machine Learning Engineer - Frameworks to design, build, and optimize machine learning deployment pipelines for large-scale models, enhancing the efficiency and scalability of generative AI applications.
Artificial Intelligence (AI) · Semiconductor · Cloud Computing · AI Infrastructure · Cloud Infrastructure · Data Center
H1B Sponsor Likely

Responsibilities

Design, build, and optimize machine learning deployment pipelines for large-scale models
Implement and enhance model inference frameworks
Develop automated workflows for model development, experimentation, and deployment
Collaborate with research, architecture, and engineering teams to improve model performance and efficiency
Work with distributed computing frameworks (e.g., PyTorch/XLA, JAX, TensorFlow, Ray) to optimize model parallelism and deployment
Implement scalable KV caching and memory-efficient inference techniques for transformer-based models (see the KV-cache sketch after this list)
Monitor and optimize infrastructure performance across the custom hardware hierarchy (cards, servers, and racks) powered by d-Matrix's custom AI chips
Ensure best practices in ML model versioning, evaluation, and monitoring
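The KV-caching responsibility above refers to reusing previously computed attention keys and values during autoregressive decoding instead of recomputing the full prefix at every step. Below is a minimal sketch in PyTorch; the class name, shapes, and buffer layout are illustrative assumptions and are not tied to d-Matrix's actual inference stack.

```python
import torch
import torch.nn.functional as F

class KVCache:
    """Minimal per-layer key/value cache for autoregressive decoding (illustrative).

    Preallocates a fixed-length buffer so each decode step appends one (or a few)
    tokens' keys/values. Shapes assume (batch, heads, seq, head_dim).
    """

    def __init__(self, batch, heads, max_seq, head_dim, dtype=torch.float32, device="cpu"):
        self.k = torch.zeros(batch, heads, max_seq, head_dim, dtype=dtype, device=device)
        self.v = torch.zeros(batch, heads, max_seq, head_dim, dtype=dtype, device=device)
        self.len = 0  # number of positions currently cached

    def append(self, k_new, v_new):
        # k_new / v_new: (batch, heads, new_tokens, head_dim)
        t = k_new.shape[2]
        self.k[:, :, self.len:self.len + t] = k_new
        self.v[:, :, self.len:self.len + t] = v_new
        self.len += t
        # Return views over the valid prefix for attention
        return self.k[:, :, :self.len], self.v[:, :, :self.len]


# Usage: attend the newest query token against all cached keys/values.
cache = KVCache(batch=1, heads=8, max_seq=2048, head_dim=64)
q = torch.randn(1, 8, 1, 64)
k, v = cache.append(torch.randn(1, 8, 1, 64), torch.randn(1, 8, 1, 64))
out = F.scaled_dot_product_attention(q, k, v)
```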

Qualifications

Python · Machine Learning frameworks · Model optimization · Distributed computing frameworks · Quantization techniques · Software engineering best practices · Problem-solving skills · Collaboration · Fast-paced environment

Required

BS in Computer Science and 7+ years of experience, with strong programming skills in Python and hands-on use of ML frameworks such as PyTorch, TensorFlow, or JAX
Hands-on experience with model optimization, quantization, and inference acceleration
Deep understanding of transformer architectures, attention mechanisms, and distributed inference (tensor parallel, pipeline parallel, sequence parallel)
Knowledge of quantization (INT8, BF16, FP16) and memory-efficient inference techniques (see the quantization sketch after this list)
Solid grasp of software engineering best practices, including CI/CD, containerization (Docker, Kubernetes), and MLOps
Strong problem-solving skills and ability to work in a fast-paced, iterative development environment
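As an illustration of the quantization item above, the sketch below uses stock PyTorch post-training dynamic quantization (INT8 weights) and a BF16 cast. It is a generic example under assumed layer sizes, not the toolchain used on d-Matrix hardware.

```python
import torch
import torch.nn as nn

# A small stand-in model; any module containing nn.Linear layers works the same way.
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 512)).eval()

# Post-training dynamic quantization: weights stored as INT8, activations
# quantized on the fly at inference time (returns a quantized copy).
int8_model = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

# Reduced-precision floating point (BF16) is a simpler cast and is often the
# first step toward memory-efficient inference on supporting hardware.
bf16_model = model.to(torch.bfloat16)

with torch.no_grad():
    x = torch.randn(1, 512)
    y_int8 = int8_model(x)
    y_bf16 = bf16_model(x.to(torch.bfloat16))
```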

Preferred

Experience working with cloud-based ML pipelines (AWS, GCP, or Azure)
Experience with LLM fine-tuning, LoRA, PEFT, and KV cache optimizations (see the LoRA sketch after this list)
Contributions to open-source ML projects or research publications
Experience with low-level optimizations using CUDA, Triton, or XLA
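For the LoRA item above, here is a minimal hand-rolled low-rank adapter around a frozen linear layer. In practice libraries such as peft are commonly used instead; this self-contained sketch (class name and hyperparameters are illustrative assumptions) only shows the core idea of training low-rank factors while the base weights stay frozen.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Minimal LoRA adapter around a frozen linear layer (illustrative only).

    The base weight stays frozen; training updates only the low-rank factors,
    so y = base(x) + (alpha / r) * B(A(x)).
    """

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # freeze the pretrained weights
        self.lora_a = nn.Linear(base.in_features, r, bias=False)
        self.lora_b = nn.Linear(r, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)   # start as a no-op adapter
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))


# Wrap an existing projection layer; only the LoRA factors receive gradients.
layer = LoRALinear(nn.Linear(1024, 1024))
out = layer(torch.randn(2, 1024))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
```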

Benefits

Competitive compensation, benefits, and opportunities for career growth

Company

d-Matrix

d-Matrix is a platform that enables data centers to handle large-scale generative AI inference with high throughput and low latency.

H1B Sponsorship

d-Matrix has a track record of sponsoring H1B visas. Note that this does not guarantee sponsorship for this specific role; the information below is provided for reference. (Data powered by the US Department of Labor)
Distribution of different job fields receiving sponsorship (chart; fields similar to this job are highlighted)
Trends of total sponsorships: 2025 (20) · 2024 (15) · 2023 (8) · 2022 (7)

Funding

Current Stage
Growth Stage
Total Funding
$429M
Key Investors
Bullhound Capital, Temasek Holdings, Triatomic Capital, M12 - Microsoft's Venture Fund, Playground Global, SK Hynix
2025-11-12 · Series C · $275M
2023-09-06 · Series B · $110M
2022-04-20 · Series A · $44M

Leadership Team

Peter Buckingham
Senior Vice President, Software Engineering
Company data provided by Crunchbase