Black Forest Labs · 2 months ago

Member of Technical Staff - Model Serving / API Backend Engineer

Black Forest Labs is a pioneering generative AI company, known for its FLUX models that power creative tools and products worldwide. It is seeking a Model Serving / API Backend Engineer to bridge the gap between research breakthroughs and production reality, focusing on optimizing model inference and building reliable APIs for machine learning models.

Industry: Computer Software

Responsibilities

Develops and maintains robust APIs for serving machine learning models at scale, because reliability matters when millions depend on your endpoints (a minimal serving sketch follows this list)
Transforms research models into production-ready demos and MVPs that showcase capabilities without pretending research prototypes are production systems
Optimizes model inference for improved performance and scalability using whatever techniques work—batching, quantization, custom kernels, compiler optimizations
Implements and manages user preference data acquisition systems that help us understand what actually works in production
Ensures high availability and reliability of model serving infrastructure—because downtime means users can't create
Collaborates with ML researchers to rapidly prototype and deploy new models, moving from research checkpoint to API endpoint faster than seems reasonable
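
To make the first responsibility concrete, here is a minimal sketch of a model-serving endpoint, assuming FastAPI and Pydantic are available; StubModel, the /v1/generate route, and the response shape are illustrative stand-ins, not Black Forest Labs' actual API.

```python
# Minimal model-serving endpoint sketch (assumptions: FastAPI + Pydantic;
# StubModel stands in for a real image-generation checkpoint).
import asyncio
from contextlib import asynccontextmanager

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel


class StubModel:
    """Placeholder for a real image-generation model."""

    async def generate(self, prompt: str, seed: int | None = None) -> str:
        await asyncio.sleep(0.05)  # stand-in for GPU inference latency
        return f"https://example.invalid/images/{abs(hash((prompt, seed)))}.png"


class GenerateRequest(BaseModel):
    prompt: str
    seed: int | None = None


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Load weights once at startup so requests never pay initialization cost.
    app.state.model = StubModel()
    yield


app = FastAPI(lifespan=lifespan)


@app.post("/v1/generate")
async def generate(req: GenerateRequest):
    if not req.prompt.strip():
        raise HTTPException(status_code=422, detail="prompt must not be empty")
    # A production service would offload this call to a worker pool or task
    # queue instead of running inference inline on the event loop.
    return {"url": await app.state.model.generate(req.prompt, req.seed)}
```

Loading the model in the lifespan hook keeps per-request latency free of initialization cost; the inline inference call is only for brevity.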

Qualifications

Python, RESTful API development, Model inference optimization, Containerization technologies, Cloud platforms, Rapid ML prototyping, Distributed task queues, Monitoring, Observability, Frontend development frameworks, MLOps practices, Database systems, A/B testing, Security best practices, Real-time inference systems, CI/CD pipelines, ML inference optimizations

Required

Strong proficiency in Python and its ecosystem for machine learning, data analysis, and web development
Extensive experience with RESTful API development and deployment for ML tasks—you've built APIs that real products depend on
Familiarity with containerization and orchestration technologies (Docker, Kubernetes) for deploying ML services at scale
Knowledge of cloud platforms (AWS, GCP, or Azure) for deploying and scaling ML services in production
Proven track record in rapid ML model prototyping using tools like Streamlit or Gradio—because demos matter for showing what's possible
Experience with distributed task queues and scalable model serving architectures that handle variable load
Understanding of monitoring, logging, and observability best practices for ML systems—because you can't fix what you can't see
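
As one way to read the monitoring and observability requirement, here is a minimal sketch that wraps a generic inference callable with a request counter and a latency histogram, assuming the prometheus_client library; the metric names and the timed_inference helper are hypothetical, not part of any specific stack.

```python
# Minimal observability sketch: count requests by outcome and record latency,
# exposed on a /metrics endpoint for Prometheus to scrape.
import time

from prometheus_client import Counter, Histogram, start_http_server

INFERENCE_REQUESTS = Counter(
    "inference_requests_total",
    "Inference requests by outcome",
    ["outcome"],
)
INFERENCE_LATENCY = Histogram(
    "inference_latency_seconds",
    "End-to-end inference latency",
    buckets=(0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0),
)


def timed_inference(run_inference, payload):
    """Wrap any inference callable so every request is counted and timed."""
    start = time.perf_counter()
    try:
        result = run_inference(payload)
        INFERENCE_REQUESTS.labels(outcome="ok").inc()
        return result
    except Exception:
        INFERENCE_REQUESTS.labels(outcome="error").inc()
        raise
    finally:
        INFERENCE_LATENCY.observe(time.perf_counter() - start)


if __name__ == "__main__":
    start_http_server(9100)  # exposes /metrics for Prometheus to scrape
    while True:
        timed_inference(lambda p: p, {"prompt": "smoke test"})
        time.sleep(1.0)
```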

Preferred

Have experience with frontend development frameworks (Vue.js, Angular, React) for building compelling demos
Bring familiarity with MLOps practices and tools
Know database systems and data streaming technologies
Have experience with A/B testing and feature flagging in production environments
Understand security best practices for API development and ML model serving
Have built real-time inference systems with low-latency optimizations
Know CI/CD pipelines and automated testing for ML systems
Bring expertise in ML inference optimizations, including reducing initialization time and memory requirements, implementing dynamic batching, using reduced precision and weight quantization, applying TensorRT optimizations, performing layer fusion and model compilation, and writing custom CUDA kernels for performance (see the sketch below)
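
To illustrate two of the optimizations named in the last item, reduced precision and dynamic batching, here is a minimal sketch assuming PyTorch; the small nn.Sequential network is a toy stand-in for a real generative model, and the batching is deliberately simplified with no queueing or timeout logic.

```python
# Reduced-precision inference plus simple dynamic batching (PyTorch sketch).
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

# Toy stand-in for a real model; weights are cast to half precision once at load.
model = nn.Sequential(nn.Linear(512, 2048), nn.GELU(), nn.Linear(2048, 512))
model = model.to(device=device, dtype=dtype).eval()


@torch.inference_mode()
def run_batch(requests: list[torch.Tensor]) -> list[torch.Tensor]:
    """Dynamic batching: stack whatever requests have arrived, run one forward
    pass, then split the output back into per-request results."""
    batch = torch.stack(requests).to(device=device, dtype=dtype)
    out = model(batch)
    return list(out.float().cpu().unbind(dim=0))


if __name__ == "__main__":
    pending = [torch.randn(512) for _ in range(7)]  # simulated queued requests
    results = run_batch(pending)
    print(len(results), results[0].shape)  # 7 torch.Size([512])
```

In a real serving path the batch would be drained from a request queue under a latency budget, and TensorRT, layer fusion, or custom CUDA kernels would replace the plain eager forward pass.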

Company

Black Forest Labs

We’re the leading frontier AI research lab, continuously building the most advanced technology that shapes the visual understanding of the world.

Funding

Current Stage
Early Stage
Company data provided by Crunchbase