Member of Technical Staff - Large scale data infrastructure jobs in United States

Black Forest Labs · 1 month ago

Member of Technical Staff - Large scale data infrastructure

Black Forest Labs is pioneering advancements in generative AI with their FLUX models, which are utilized across various industries. They are seeking a Member of Technical Staff to develop and maintain scalable infrastructure for managing massive-scale image and video datasets, optimizing data retrieval, and ensuring efficient data management processes.

Computer Software

Responsibilities

Develops and maintains scalable infrastructure to store and retrieve massive-scale image and video datasets—the kind where 'large' means billions of assets, not millions
Optimizes data retrieval so that every training run can fully utilize all GPUs
Builds tooling to efficiently manage datasets
Manages and coordinates data transfers from licensing partners
Ensures object storage is used as efficiently as possible

Qualifications

Python
Cloud object storage
Data loaders for ML
AWS
GCP
Azure
Slurm/HPC
Large scale image data
Large scale video data
Data manipulation

Preferred

Strong proficiency in Python and experience with various file systems for data-intensive manipulation and analysis
Experience building reliable and scalable data loaders for machine learning applications
Deep knowledge of cloud object storage and the challenges that come with it
Hands-on familiarity with cloud object storage such as S3 and Azure Blob Storage, cloud platforms (AWS, GCP, or Azure) and Slurm/HPC environments for distributed data processing
Experience creating and managing petabyte-scale storage infrastructure
Experience working with large-scale image and video data

Company

Black Forest Labs

We’re the leading frontier AI research lab, continuously building the most advanced technology that shapes the visual understanding of the world.

Funding

Current Stage
Early Stage
Company data provided by Crunchbase