Vercept · 4 months ago
Backend Engineer – Inference Optimization
Vercept is a high-energy, impact-driven team known for its academic excellence and transformative AI research. The company is seeking a Backend Engineer – Inference Optimization to design and optimize inference pipelines for large-scale models, working closely with researchers and infrastructure engineers to improve AI performance.
Responsibilities
Own the design and optimization of inference pipelines for large-scale models
Work closely with researchers and infrastructure engineers to identify bottlenecks
Implement advanced techniques like quantization and KV caching
Deploy high-performance serving systems in production
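The responsibilities above mention KV caching. As context, here is a minimal sketch of the idea (illustrative only; the class and function names are assumptions, not Vercept's implementation): during autoregressive decoding, each new token's attention keys and values are appended to a cache so the prefix is never recomputed.

```python
import numpy as np

class KVCache:
    """Grows by one (key, value) row per decoded token."""

    def __init__(self, head_dim):
        self.keys = np.empty((0, head_dim))    # (seq_len, head_dim)
        self.values = np.empty((0, head_dim))

    def append(self, k, v):
        # k, v: (1, head_dim) for the newly decoded token only
        self.keys = np.vstack([self.keys, k])
        self.values = np.vstack([self.values, v])

def attend(query, cache):
    # query: (1, head_dim); attends over every cached position
    scores = query @ cache.keys.T / np.sqrt(query.shape[-1])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ cache.values  # (1, head_dim)

rng = np.random.default_rng(0)
cache = KVCache(head_dim=4)
for _ in range(3):
    cache.append(rng.standard_normal((1, 4)), rng.standard_normal((1, 4)))
    out = attend(rng.standard_normal((1, 4)), cache)
```

The cost per decoding step stays proportional to the current sequence length rather than quadratic in it, which is why KV caching is a baseline optimization in serving systems.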
Qualifications
Required
Deep experience optimizing model inference pipelines, including model quantization and KV caching
Proficiency in backend systems and high-performance programming (Python, C++, or Rust)
Familiarity with distributed serving, GPU acceleration, and large-scale systems
Ability to debug complex performance issues across model, runtime, and hardware layers
Comfort working in fast-moving environments with ambitious technical goals
Preferred
Hands-on experience with vLLM or similar inference frameworks
Background in GPU kernel optimization (CUDA, Triton, ROCm)
Experience scaling inference across multi-node or heterogeneous clusters
Prior work in model compilation (e.g., TensorRT, TVM, ONNX Runtime)
Hands-on experience with model quantization
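Model quantization appears in both the required and preferred qualifications. A minimal sketch of one common variant, symmetric per-tensor int8 weight quantization (a generic technique, not any specific framework's API; function names here are hypothetical):

```python
import numpy as np

def quantize_int8(w):
    """Map float weights onto int8 with a single symmetric scale."""
    max_abs = np.abs(w).max()
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover an approximation of the original weights
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# Round-trip error is bounded by half the quantization step (scale / 2)
max_err = np.abs(w - w_hat).max()
```

Storing weights as int8 cuts memory bandwidth roughly 4x versus fp32, which is typically the point of weight-only quantization in inference serving.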
Benefits
Health benefits
A 401(k) plan
Meaningful equity
Company
Vercept
Vercept is a software development company.
Funding
Current Stage
Early Stage
Total Funding
$16M
2025-06-04 · Seed · $16M
Company data provided by Crunchbase