Netpreme · 3 weeks ago
Member of Technical Staff, ML Systems
Netpreme is seeking a motivated LLM Systems Engineer to explore new and unconventional inference systems based on emerging hardware. This role combines engineering and research, focusing on prototyping algorithms for inference hardware and guiding the hardware team on product definition.
Computer Software
Responsibilities
Prototype and optimize emerging ML inference systems
Develop novel memory models for expandable vRAM
Write efficient GPU kernels for data movement
Perform design-space exploration, implementation, and benchmarking of inference engines, both in simulations and on real hardware
Qualifications
Required
MS or PhD in computer systems, ideally with a focus on LLM inference and/or distributed systems
Prior experience contributing to core LLM inference infrastructure (vLLM, SGLang, TensorRT, etc.)
Prior experience in accelerator programming (e.g. CUDA, JAX/Pallas, ROCm)
Preferred
Advanced computer architecture and performance engineering skills are a big plus
Benefits
Comprehensive benefits including health, dental, vision, and life insurance.
Relocation assistance and visa sponsorship.
Perks include a daily lunch stipend, 401k match, and more.
Company
Netpreme
Empowering AI Systems with Supreme Networking
Funding
Current Stage
Early Stage
Company data provided by Crunchbase