Solution Architect - AI Inference Specialist jobs in United States

FriendliAI · 1 week ago

Solution Architect - AI Inference Specialist

FriendliAI is seeking a Forward Deployed Engineer to assist enterprises in deploying, scaling, and operating generative and agentic AI workloads. The role involves collaborating with customers to solve AI inference challenges and implementing production-grade applications using FriendliAI's infrastructure.

Artificial Intelligence (AI) · Generative AI · Information Technology · Internet · SaaS · Software
Hiring Manager
Woojin Lee

Responsibilities

Design and implement large-scale deployment architectures for LLM and multimodal inference
Deploy and manage containerized workloads across Kubernetes clusters
Diagnose production issues, such as performance bottlenecks, and implement fixes or workarounds as needed
Collaborate with customers’ DevOps teams to integrate FriendliAI’s infrastructure into their CI/CD workflows
Develop scripts, Helm charts, and Terraform modules that simplify repeated deployments
Contribute field insights to shape our platform reliability, observability, and scaling strategies
Lead workshops, technical sessions, or webinars to help customers master infrastructure best practices

Qualifications

Kubernetes · Docker · Terraform · AI model serving · Cloud infrastructure · DevOps · Distributed systems · Performance tuning · Debugging · Networking · Problem-solving

Required

3+ years of experience in cloud infrastructure, DevOps, or reliability engineering
Bachelor's or Master's degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent
Proficiency with Kubernetes, Docker, Terraform, and Helm
Strong foundation in distributed systems, networking, and performance tuning
Experience with GPU-based computing and generative AI model serving workloads
Strong technical background in backend systems or AI tooling
Experience operating workloads on AWS, GCP, or OCI
Excellent problem-solving and debugging skills in real-world environments

Preferred

Experience deploying large models (LLMs, diffusion models) on GPUs or clusters
Familiarity with inference frameworks (Triton, vLLM, TensorRT, DeepSpeed-Inference)
Familiarity with observability stacks (Prometheus, Grafana, Loki, ELK, OTEL)
Understanding of networking security and compliance frameworks (e.g., SOC 2)
Experience supporting on-prem or hybrid-cloud deployments

Benefits

A front-row seat to the generative AI infrastructure revolution
Competitive compensation and benefits package
Daily lunch and dinner provided; unlimited snacks and beverages
Health check-up and top-tier hardware support
Flexible working hours and a highly collaborative environment
Startup equity, health insurance, and other benefits

Company

FriendliAI

FriendliAI is an AI infrastructure company that enables deployment, scaling, and monitoring of large language and multimodal models.

Funding

Current Stage
Early Stage
Total Funding
$26.75M
Key Investors
Capstone Partners
2025-08-28 · Seed · $20M
2021-12-15 · Seed · $6.75M

Leadership Team

Byung-Gon Chun
Chief Executive Officer
Gyeong-In Yu
Chief Technology Officer
Company data provided by Crunchbase