d-Matrix · 20 hours ago
Machine Learning Intern - Dynamic KV-Cache Modeling for Efficient LLM Inference
d-Matrix is focused on unleashing the potential of generative AI to transform technology. The company is seeking a motivated and innovative Machine Learning Intern to develop a dynamic Key-Value (KV) cache solution for Large Language Model (LLM) inference, aimed at improving memory utilization and execution efficiency on d-Matrix hardware.
AI Infrastructure · Artificial Intelligence (AI) · Cloud Infrastructure · Data Center · Semiconductor
Responsibilities
Research and analyze existing KV-Cache implementations used in LLM inference, particularly those that store past key-values as lists of PyTorch tensors
Investigate “Paged Attention” mechanisms that leverage dedicated CUDA data structures to optimize memory for variable sequence lengths
Design and implement a torch-native dynamic KV-Cache model that can be integrated seamlessly within PyTorch
Model KV-Cache behavior within the PyTorch compute graph to improve compatibility with torch.compile and facilitate the export of the compute graph
Conduct experiments to evaluate memory utilization and inference efficiency on d-Matrix hardware
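For readers unfamiliar with the paged approach mentioned above, the block-table idea can be sketched in a few lines. This is a minimal illustration in plain Python, not d-Matrix's design and not the CUDA implementation the posting refers to; the class name `PagedKVCache` and its methods are hypothetical, and a real cache would store key/value tensors in the blocks rather than only tracking slot indices.

```python
class PagedKVCache:
    """Toy block table (hypothetical): maps each sequence to fixed-size blocks.

    Only the bookkeeping is modeled; no key/value tensors are stored.
    """

    def __init__(self, num_blocks, block_size):
        self.num_blocks = num_blocks
        self.block_size = block_size
        self.free_blocks = list(range(num_blocks))  # ids of unused blocks
        self.block_table = {}  # seq_id -> list of block ids, in order
        self.seq_lens = {}     # seq_id -> number of tokens stored

    def append_token(self, seq_id):
        """Reserve a slot for one new token and return (block_id, offset)."""
        length = self.seq_lens.get(seq_id, 0)
        blocks = self.block_table.setdefault(seq_id, [])
        # A new block is needed only when the last one is full (or none exists).
        if length % self.block_size == 0:
            if not self.free_blocks:
                raise MemoryError("no free KV-cache blocks")
            blocks.append(self.free_blocks.pop())
        self.seq_lens[seq_id] = length + 1
        return blocks[-1], length % self.block_size

    def free(self, seq_id):
        """Return all blocks of a finished sequence to the free pool."""
        self.free_blocks.extend(self.block_table.pop(seq_id, []))
        self.seq_lens.pop(seq_id, None)

    def blocks_in_use(self):
        return self.num_blocks - len(self.free_blocks)


if __name__ == "__main__":
    cache = PagedKVCache(num_blocks=4, block_size=2)
    for _ in range(3):
        print(cache.append_token("request-a"))
    print("blocks in use:", cache.blocks_in_use())
```

Because memory is reserved one block at a time, a sequence of three tokens with a block size of two occupies two blocks instead of a buffer sized for the maximum sequence length, which is the memory saving for variable-length sequences that the responsibilities describe.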
Qualifications
Required
Currently pursuing a degree in Computer Science, Electrical Engineering, Machine Learning, or a related field
Familiarity with PyTorch and deep learning concepts, particularly regarding model optimization and memory management
Understanding of CUDA programming and hardware-accelerated computation (experience with CUDA is a plus)
Strong programming skills in Python, with experience in PyTorch
Analytical mindset with the ability to approach problems creatively
Preferred
Experience with deep learning model inference optimization
Knowledge of data structures used in machine learning for memory and compute efficiency
Experience with hardware-specific optimization, especially on custom hardware such as d-Matrix's, is an advantage
Company
d-Matrix
d-Matrix is a platform that enables data centers to handle large-scale generative AI inference with high throughput and low latency.
H1B Sponsorship
d-Matrix has a track record of offering H1B sponsorships. Please note that this does not
guarantee sponsorship for this specific role. Additional information is presented below for your
reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (20)
2024 (15)
2023 (8)
2022 (7)
Funding
Current Stage: Growth Stage
Total Funding: $429M
Key Investors: Temasek Holdings, TSVC
2025-11-12 · Series C · $275M
2023-09-06 · Series B · $110M
2022-04-20 · Series A · $44M
Company data provided by Crunchbase