Large Model Inference Acceleration Engineer jobs in United States

OCBridge · 1 day ago

Large Model Inference Acceleration Engineer

OCBridge is an AI platform engineering team building large-scale end-to-end AI production pipelines. They are seeking an experienced AI model optimization engineer specializing in large model inference acceleration to optimize inference performance and scalability for generative and foundation models across diverse hardware environments.
Consulting · Human Resources · Recruiting
H1B Sponsor Likely

Responsibilities

Design and optimize large model inference pipelines for low-latency and high-throughput production deployments
Apply high-performance optimization techniques across diverse hardware architectures
Benchmark and profile deep learning models to identify performance bottlenecks
Optimize compute, memory, and kernel performance for large model inference
Work on distributed inference and acceleration strategies
Collaborate with infrastructure and production engineering teams to integrate optimized models into production systems

Qualifications

AI model optimization · CUDA programming · Python · C++ · ML compilers · Inference acceleration frameworks · Parallel computing · Transformer architectures · Performance debugging · Mandarin · English

Required

Master's or PhD in Computer Science, Electrical Engineering, AI, or related field
Strong software engineering skills in Python and C++
Strong CUDA programming experience
5+ years of experience in AI model inference optimization or acceleration
Experience with ML compilers and performance optimization techniques
Experience with parallel computing, graph fusion, and kernel optimization
Hands-on experience with inference acceleration frameworks such as TensorRT, Triton, or CUTLASS
Solid understanding of transformer and diffusion model architectures
Strong system-level performance debugging skills
Professional working proficiency in Mandarin and English required for cross-regional technical collaboration

Preferred

Experience optimizing large generative or multimodal models in production
Experience with distributed inference systems
Experience with hardware-aware model optimization
Experience working closely with AI infrastructure or ML systems teams

Company

OCBridge

OCBridge is a leader in AI-powered recruitment, delivering talent with unmatched speed, accuracy, and scale.

H1B Sponsorship

OCBridge has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Total Sponsorships by Year
2025: 1
2024: 3
2023: 6
2022: 2
2021: 2
2020: 2

Funding

Current Stage: Growth Stage

Leadership Team

Kirby Deng
Founder and CEO
Company data provided by Crunchbase