Perplexity · 2 weeks ago

AI Inference Engineer (San Francisco)

San Francisco, CA

Full-time

Onsite

Mid Level

$200K/yr - $350K/yr

Perplexity is seeking an AI Inference Engineer to join their growing team. The role involves working on large-scale deployment of machine learning models for real-time inference, developing APIs, and improving system reliability and observability.

Artificial Intelligence (AI), Chatbot, Machine Learning, Natural Language Processing, Search Engine

H1B Sponsor Likely

Responsibilities

Develop APIs for AI inference that will be used by both internal and external customers

Benchmark and address bottlenecks throughout our inference stack

Improve the reliability and observability of our systems and respond to system outages

Explore novel research and implement LLM inference optimizations

Qualifications

Python, PyTorch, CUDA, Rust, C++, Kubernetes, TensorFlow, ONNX, LLM architectures, Inference optimization

Required

Experience with ML systems and deep learning frameworks (e.g. PyTorch, TensorFlow, ONNX)

Familiarity with common LLM architectures and inference optimization techniques (e.g. continuous batching, quantization)

Understanding of GPU architectures or experience with GPU kernel programming using CUDA

Company

Perplexity

Perplexity is an AI-powered answer engine designed to provide accurate, real-time responses to user queries.

Founded in 2022

San Francisco, California, USA

201-500 employees

https://www.perplexity.ai

H1B Sponsorship

Perplexity has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by the US Department of Labor)

Distribution of Different Job Fields Receiving Sponsorship

Trends of Total Sponsorships

2025 (12)

2024 (7)

2023 (2)

Funding

Current Stage

Late Stage

Total Funding

$1.48B

Key Investors

Cristiano Ronaldo, NuVentures, Accel

2025-12-05 · Undisclosed

2025-09-10 · Series Unknown · $200M

2025-08-15 · Secondary Market

Leadership Team

Aravind Srinivas

Cofounder, President, CEO

Denis Yarats

Co-Founder & CTO

Recent News

SiliconANGLE

Google debuts Universal Commerce Protocol to streamline agentic shopping automation

2026-01-12

Business Insider

The AI industry is getting into politics. Here are the key super PACs to watch in 2026.

2026-01-11

Droid Life

Screenshots Confirm Rebirth of Bixby, Powered by Perplexity AI

2026-01-11

Company data provided by Crunchbase