AI Performance Engineer jobs in United States

Parasail · 2 months ago

AI Performance Engineer

Parasail is redefining AI infrastructure by enabling seamless deployment across a distributed network of GPUs. The AI Performance Engineer plays a crucial role in scheduling, executing, and managing AI workloads on distributed compute systems, ensuring that generative AI operates efficiently at enterprise scale while driving continuous improvements in cost, performance, and sustainability.

Artificial Intelligence (AI) · Network Hardware · Open Source

Responsibilities

Add support for new LLMs, working across the stack from low-level GPU kernels to Kubernetes-based deployments
Contribute to cutting-edge open-source LLM engines such as vLLM or SGLang to extend their capabilities and performance (e.g. use Python technologies to improve API servers or request schedulers)
Operate closer to the hardware, focusing on building and integrating solutions to boost performance and hardware utilization. For example, improve attention backends like FlashAttention or FlashInfer by contributing to their development and optimization, or by integrating their solutions into vLLM
Improve LLM performance using advanced algorithmic solutions such as speculative decoding, quantization, or other state-of-the-art techniques. Understand the impact of such techniques on model quality

Qualifications

GPU computing · CUDA · Performance optimization · Python · C++ · Kubernetes · Open-source contributions · Production-oriented mindset · Curiosity about AI

Required

Expertise in GPU computing, including platforms and frameworks such as CUDA, ROCm, XLA, PyTorch, JAX, etc.
Background in performance analysis and optimization of AI/HPC workloads (e.g. profiling or theoretical analysis of FLOPs and bandwidth)
Experience in writing GPU kernels using technologies like CUDA, CUTLASS, Triton
Strength in Python and C++
Demonstrated contributions to open-source projects. Contributions to inference engines such as vLLM are a strong plus
A production-oriented mindset emphasizing robust, scalable code suitable for enterprise-grade applications
A relentless curiosity about cutting-edge AI technologies combined with a passion for solving complex problems

Company

Parasail

Parasail is an AI deployment networking company that provides AI products to builders and innovators for secure compute.

Funding

Current Stage: Early Stage
Total Funding: $10M
2025-11-24 · Series A
2025-04-02 · Seed · $10M
Company data provided by Crunchbase