Black Forest Labs · 2 months ago

Member of Technical Staff - Model Serving / API Backend Engineer

Black Forest Labs is a pioneering generative AI company, known for its FLUX models that power creative tools and products worldwide. It is seeking a Model Serving / API Backend Engineer to bridge the gap between research breakthroughs and production reality, focusing on optimizing model inference and building reliable APIs for machine learning models.

Industry: Computer Software

Responsibilities

Develops and maintains robust APIs for serving machine learning models at scale, because reliability matters when millions depend on your endpoints (a minimal serving sketch follows this list)
Transforms research models into production-ready demos and MVPs that showcase capabilities without pretending research prototypes are production systems
Optimizes model inference for improved performance and scalability using whatever techniques work—batching, quantization, custom kernels, compiler optimizations
Implements and manages user preference data acquisition systems that help us understand what actually works in production
Ensures high availability and reliability of model serving infrastructure—because downtime means users can't create
Collaborates with ML researchers to rapidly prototype and deploy new models, moving from research checkpoint to API endpoint faster than seems reasonable
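
To make the first responsibility concrete, here is a minimal sketch of a model-serving endpoint, assuming FastAPI and Pydantic are available; StubModel, the /v1/generate route, and the response shape are illustrative stand-ins, not Black Forest Labs' actual API.

```python
# Minimal model-serving endpoint sketch (assumptions: FastAPI + Pydantic;
# StubModel stands in for a real image-generation checkpoint).
import asyncio
from contextlib import asynccontextmanager

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel


class StubModel:
    """Placeholder for a real image-generation model."""

    async def generate(self, prompt: str, seed: int | None = None) -> str:
        await asyncio.sleep(0.05)  # stand-in for GPU inference latency
        return f"https://example.invalid/images/{abs(hash((prompt, seed)))}.png"


class GenerateRequest(BaseModel):
    prompt: str
    seed: int | None = None


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Load weights once at startup so requests never pay initialization cost.
    app.state.model = StubModel()
    yield


app = FastAPI(lifespan=lifespan)


@app.post("/v1/generate")
async def generate(req: GenerateRequest):
    if not req.prompt.strip():
        raise HTTPException(status_code=422, detail="prompt must not be empty")
    # A production service would offload this call to a worker pool or task
    # queue instead of running inference inline on the event loop.
    return {"url": await app.state.model.generate(req.prompt, req.seed)}
```

Loading the model in the lifespan hook keeps per-request latency free of initialization cost; the inline inference call is only for brevity.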

Qualifications

Python, RESTful API development, Model inference optimization, Containerization technologies, Cloud platforms, Rapid ML prototyping, Distributed task queues, Monitoring, Observability, Frontend development frameworks, MLOps practices, Database systems, A/B testing, Security best practices, Real-time inference systems, CI/CD pipelines, ML inference optimizations

Required

Strong proficiency in Python and its ecosystem for machine learning, data analysis, and web development
Extensive experience with RESTful API development and deployment for ML tasks—you've built APIs that real products depend on
Familiarity with containerization and orchestration technologies (Docker, Kubernetes) for deploying ML services at scale
Knowledge of cloud platforms (AWS, GCP, or Azure) for deploying and scaling ML services in production
Proven track record in rapid ML model prototyping using tools like Streamlit or Gradio—because demos matter for showing what's possible
Experience with distributed task queues and scalable model serving architectures that handle variable load
Understanding of monitoring, logging, and observability best practices for ML systems—because you can't fix what you can't see
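
As one way to read the monitoring and observability requirement, here is a minimal sketch that wraps a generic inference callable with a request counter and a latency histogram, assuming the prometheus_client library; the metric names and the timed_inference helper are hypothetical, not part of any specific stack.

```python
# Minimal observability sketch: count requests by outcome and record latency,
# exposed on a /metrics endpoint for Prometheus to scrape.
import time

from prometheus_client import Counter, Histogram, start_http_server

INFERENCE_REQUESTS = Counter(
    "inference_requests_total",
    "Inference requests by outcome",
    ["outcome"],
)
INFERENCE_LATENCY = Histogram(
    "inference_latency_seconds",
    "End-to-end inference latency",
    buckets=(0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0),
)


def timed_inference(run_inference, payload):
    """Wrap any inference callable so every request is counted and timed."""
    start = time.perf_counter()
    try:
        result = run_inference(payload)
        INFERENCE_REQUESTS.labels(outcome="ok").inc()
        return result
    except Exception:
        INFERENCE_REQUESTS.labels(outcome="error").inc()
        raise
    finally:
        INFERENCE_LATENCY.observe(time.perf_counter() - start)


if __name__ == "__main__":
    start_http_server(9100)  # exposes /metrics for Prometheus to scrape
    while True:
        timed_inference(lambda p: p, {"prompt": "smoke test"})
        time.sleep(1.0)
```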

Preferred

Have experience with frontend development frameworks (Vue.js, Angular, React) for building compelling demos
Bring familiarity with MLOps practices and tools
Know database systems and data streaming technologies
Have experience with A/B testing and feature flagging in production environments
Understand security best practices for API development and ML model serving
Have built real-time inference systems with low-latency optimizations
Know CI/CD pipelines and automated testing for ML systems
Bring expertise in ML inference optimizations, including reducing initialization time and memory requirements, implementing dynamic batching, using reduced precision and weight quantization, applying TensorRT optimizations, performing layer fusion and model compilation, and writing custom CUDA kernels for performance (see the sketch below)
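
To illustrate two of the optimizations named in the last item, reduced precision and dynamic batching, here is a minimal sketch assuming PyTorch; the small nn.Sequential network is a toy stand-in for a real generative model, and the batching is deliberately simplified with no queueing or timeout logic.

```python
# Reduced-precision inference plus simple dynamic batching (PyTorch sketch).
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

# Toy stand-in for a real model; weights are cast to half precision once at load.
model = nn.Sequential(nn.Linear(512, 2048), nn.GELU(), nn.Linear(2048, 512))
model = model.to(device=device, dtype=dtype).eval()


@torch.inference_mode()
def run_batch(requests: list[torch.Tensor]) -> list[torch.Tensor]:
    """Dynamic batching: stack whatever requests have arrived, run one forward
    pass, then split the output back into per-request results."""
    batch = torch.stack(requests).to(device=device, dtype=dtype)
    out = model(batch)
    return list(out.float().cpu().unbind(dim=0))


if __name__ == "__main__":
    pending = [torch.randn(512) for _ in range(7)]  # simulated queued requests
    results = run_batch(pending)
    print(len(results), results[0].shape)  # 7 torch.Size([512])
```

In a real serving path the batch would be drained from a request queue under a latency budget, and TensorRT, layer fusion, or custom CUDA kernels would replace the plain eager forward pass.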

Company

Black Forest Labs

We’re the leading frontier AI research lab, continuously building the most advanced technology that shapes the visual understanding of the world.

Funding

Current Stage
Early Stage
Company data provided by Crunchbase