Perplexity · 2 weeks ago
AI Inference Engineer (San Francisco)
Perplexity is seeking an AI Inference Engineer to join their growing team. The role involves working on large-scale deployment of machine learning models for real-time inference, developing APIs, and improving system reliability and observability.
Artificial Intelligence (AI) · Chatbot · Machine Learning · Natural Language Processing · Search Engine
Responsibilities
Develop APIs for AI inference that will be used by both internal and external customers
Benchmark and address bottlenecks throughout our inference stack
Improve the reliability and observability of our systems and respond to system outages
Explore novel research and implement LLM inference optimizations
Qualifications
Required
Experience with ML systems and deep learning frameworks (e.g. PyTorch, TensorFlow, ONNX)
Familiarity with common LLM architectures and inference optimization techniques (e.g. continuous batching, quantization)
Understanding of GPU architectures or experience with GPU kernel programming using CUDA
Company
Perplexity
Perplexity is an AI-powered answer engine designed to provide accurate, real-time responses to user queries.
H1B Sponsorship
Perplexity has a track record of offering H1B sponsorships. Please note that this does not
guarantee sponsorship for this specific role. Additional information is presented below for your
reference. (Data powered by US Department of Labor)
Trends of Total Sponsorships
2025 (12)
2024 (7)
2023 (2)
Funding
Current Stage
Late Stage
Total Funding
$1.48B
Key Investors
Cristiano Ronaldo, NuVentures, Accel
2025-12-05 · Undisclosed
2025-09-10 · Series Unknown · $200M
2025-08-15 · Secondary Market
Company data provided by Crunchbase