Member of Technical Staff, ML Systems jobs in United States

Netpreme · 3 weeks ago

Member of Technical Staff, ML Systems

Netpreme is seeking a motivated LLM Systems Engineer to explore new and unconventional inference systems based on emerging hardware. This role combines engineering and research, focusing on prototyping algorithms for inference hardware and guiding the hardware team on product definition.

Computer Software
H1B Sponsored

Responsibilities

Prototype and optimize emerging ML inference systems
Develop novel memory models for expandable vRAM
Write efficient GPU kernels for data movement
Perform design-space exploration, implementation, and benchmarking of inference engines, both in simulations and on real hardware

Qualifications

LLM inference systems
Accelerator programming
Computer systems
Performance engineering
Collaborative work

Required

MS or PhD in computer systems, ideally with a focus on LLM inference and/or distributed systems
Prior experience contributing to core LLM inference infrastructure (vLLM, SGLang, TensorRT, etc.)
Prior experience in accelerator programming (e.g. CUDA, JAX/Pallas, ROCm)

Preferred

Advanced computer architecture and performance engineering skills are a big plus

Benefits

Comprehensive benefits including health, dental, vision, and life insurance.
Relocation assistance and visa sponsorship.
Perks include a daily lunch stipend, 401k match, and more.

Company

Netpreme

Empowering AI Systems with Supreme Networking

Funding

Current Stage
Early Stage
Company data provided by Crunchbase