Forward Deployed Engineer - Cloud Infrastructure & Inference jobs in United States

FriendliAI · 13 hours ago

Forward Deployed Engineer - Cloud Infrastructure & Inference

FriendliAI is building the next-generation AI inference platform that accelerates the deployment of large language and multimodal models. They are seeking a Forward Deployed Engineer to assist enterprises in deploying, scaling, and operating AI workloads, working directly with customers to implement production-grade applications.

Artificial Intelligence (AI) · Generative AI · Information Technology · Internet · SaaS · Software
Hiring Manager
Woojin Lee

Responsibilities

Design and implement large-scale deployment architectures for LLM and multimodal inference
Deploy and manage containerized workloads across Kubernetes clusters
Diagnose production issues, such as performance bottlenecks, and implement fixes or workarounds as needed
Collaborate with customers’ DevOps teams to integrate FriendliAI’s infrastructure into their CI/CD workflows
Develop scripts, Helm charts, and Terraform modules that simplify repeated deployments
Contribute field insights to shape our platform reliability, observability, and scaling strategies
Lead workshops, technical sessions, or webinars to help customers master infrastructure best practices

Qualifications

Kubernetes · Docker · Terraform · Cloud infrastructure · DevOps · AI tooling · Distributed systems · Performance tuning · Debugging · Networking · Problem-solving · Collaboration

Required

3+ years of experience in cloud infrastructure, DevOps, or reliability engineering
Bachelor's or Master's degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent
Proficiency with Kubernetes, Docker, Terraform, and Helm
Strong foundation in distributed systems, networking, and performance tuning
Familiarity with GPU-based computing and model serving workloads
Strong technical background in backend systems or AI tooling
Experience operating workloads on AWS, GCP, or OCI
Excellent problem-solving and debugging skills in real-world environments

Preferred

Experience deploying large models (LLMs, diffusion models) on GPUs or clusters
Familiarity with inference frameworks (Triton, vLLM, TensorRT, DeepSpeed-Inference)
Familiarity with observability stacks (Prometheus, Grafana, Loki, ELK, OTEL)
Understanding of networking security and compliance frameworks (e.g., SOC 2)
Experience supporting on-prem or hybrid-cloud deployments

Benefits

A front-row seat to the generative AI infrastructure revolution
Competitive compensation and benefits package
Daily lunch and dinner provided; unlimited snacks and beverages
Health check-up and top-tier hardware support
Flexible working hours and a highly collaborative environment

Company

FriendliAI

FriendliAI is an AI infrastructure company that enables deployment, scaling, and monitoring of large language and multimodal models.

Funding

Current Stage
Early Stage
Total Funding
$26.75M
Key Investors
Capstone Partners
2025-08-28 · Seed · $20M
2021-12-15 · Seed · $6.75M

Leadership Team

Byung-Gon Chun
Chief Executive Officer
Gyeong-In Yu
Chief Technology Officer
Company data provided by Crunchbase