
NIO · 23 hours ago

AI Infrastructure Engineer

NIO is a pioneer and a leading company in the premium smart electric vehicle market. They are seeking a Senior AI Infrastructure Engineer to design, implement, and deliver production-grade software for high-performance AI inference systems. The role involves shaping the architecture and performance of AI capabilities for their AIOS platform, focusing on large-scale deployment and optimizations.
Electric Vehicle · Transportation · Electronics · Automotive · Autonomous Vehicles
H1B Sponsor Likely
Hiring Manager
Tom Q. Zhang

Responsibilities

Design and implement high-performance, scalable inference systems for LLMs and VLMs across cloud, edge, and edge-cloud hybrid platforms
Develop and optimize custom kernels and operators for specific hardware accelerators (GPU, NPU, DSP, etc.), improving throughput, latency, and memory efficiency
Integrate advanced optimization techniques such as KV-cache management, tensor/model parallelism, quantization, and memory-efficient execution into production inference systems
Partner with system and hardware teams to ensure tight hardware-software integration and optimal performance across diverse compute environments
Translate architectural requirements into robust, maintainable, production-ready software that meets performance, safety, and reliability standards
Define and drive the evolution roadmap for LLM/VLM inference in the AIOS stack, ensuring scalability and adaptability to new workloads
Stay ahead of industry trends and competitor solutions, applying best practices from both AI and large-scale systems engineering

Qualifications

AI inference systems · LLM/VLM model internals · GPU/NPU programming · C/C++ programming · Performance engineering · Deep learning frameworks · Computer architecture · Communication · Collaboration skills

Required

5+ years of hands-on software development experience in building and optimizing AI inference systems at scale
Direct experience in LLM/VLM model internals, including Transformer-based architectures, inference bottlenecks, and optimization techniques
Strong expertise in performance engineering: kernel development, parallelism strategies, memory optimization, and distributed inference systems
Proficiency with GPU/NPU programming (CUDA or vendor-specific SDKs), compiler toolchains, and deep learning frameworks (PyTorch or TensorFlow)
Strong programming skills in C/C++, with a track record of delivering high-performance, production-grade software
Solid foundation in computer architecture, systems programming (CPU/GPU pipelines, memory hierarchy, scheduling), and embedded systems
BS/MS in Computer Science, Computer Engineering, or a related technical field
Excellent communication and collaboration skills, with the ability to work effectively with cross-functional teams

Preferred

Master's or PhD degree in Computer Science, Electrical/Computer Engineering, or a related field, plus 5 years of industry experience
Experience building inference serving systems for large models, including batching, scheduling, caching, and load balancing
Expertise in hardware-aware model optimization (e.g., kernel fusion, mixed precision, quantization, pruning)
Familiarity with edge and embedded AI, including real-time constraints and limited-resource optimization
Contributions to widely used AI frameworks, libraries, or performance-critical software (open source or proprietary)

Benefits

CIGNA EPO, HSA, and Kaiser HMO medical plans with $0 for Employee Only Coverage.
Dental (including orthodontic coverage) and vision plan.
Company Paid HSA (Health Savings Account) Contribution when enrolled in the High Deductible CIGNA medical plan
Healthcare and Dependent Care Flexible Spending Accounts (FSA)
401(k) with Brokerage Link option
Company paid Basic Life, AD&D, short-term and long-term disability insurance
Employee Assistance Program
Sick and Vacation time
13 Paid Holidays a year
Paid Parental Leave for first 8 weeks at full pay (eligible after 90 days of employment with NIO)
Paid Disability Leave for first 6 weeks at full pay (eligible after 90 days of employment with NIO)
Voluntary benefits including: Voluntary Life and AD&D options for you, your spouse/domestic partner and dependent child(ren), pet insurance
Mobile Cell Phone Credit
HealthJoy mobile benefits app supporting you and your dependents with benefit and billing questions on the go
Free lunch and snacks
Onsite gym
Employee discounts and perks program

Company

NIO is an automotive company that designs and develops electric autonomous vehicles.

H1B Sponsorship

NIO has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (10)
2024 (7)
2023 (32)
2022 (31)
2021 (18)
2020 (25)

Funding

Current Stage
Public Company
Total Funding
$11.89B
Key Investors
Contemporary Amperex Technology · Hefei Jianxiang Investment · CYVN Holdings
2025-09-10 · Post-IPO Equity · $1.16B
2025-03-27 · Post-IPO Equity · $518.3M
2025-03-18 · Post-IPO Equity · $345.89M

Leadership Team

Shaoqing Ren
VP, Autonomous Driving Development
Company data provided by Crunchbase