MeshyAI
Junior Research Infrastructure Engineer
MeshyAI is the leading 3D generative AI company on a mission to unleash 3D creativity. The company is seeking a product-minded Junior Research Infrastructure Engineer to design, build, and operate the distributed data systems that support AI model training, while also developing intuitive internal tools for researchers.
Responsibilities
Participate in the design and implementation of distributed task orchestration systems using Temporal or Celery
Architect pipelines across cloud object storage (S3, GCS), data lakes, and metadata catalogs
Implement partitioning, sharding, and caching strategies to ensure data processing pipelines are resilient, highly available, and consistent
Design, implement, and maintain distributed ingestion pipelines for structured and unstructured data (images, 3D/2D assets, binaries)
Build scalable ETL/ELT workflows to transform, validate, and enrich datasets for AI/ML model training and analytics
Support preprocessing of unstructured assets (e.g., images, 3D/2D models, video) for training pipelines, including format conversion, normalization, augmentation, and metadata extraction
Implement validation and quality checks to ensure datasets meet ML training requirements
Collaborate with ML researchers to quickly adapt pipelines to evolving pretraining and evaluation needs
Use infrastructure-as-code (Terraform, Kubernetes, etc.) to manage scalable and reproducible environments
Manage Databricks assets (jobs, pipelines, configurations) using Databricks Asset Bundles (DABs) and build rigorous CI/CD pipelines with GitHub Actions
Focus on maximizing cluster utilization (CPU/Memory) and optimizing EC2 instance allocation to aggressively reduce compute costs
Take ownership of the platform's interface by building data explorers and management consoles in React or Next.js
Actively listen to researchers and data scientists to iterate on UI/UX based on their feedback
Simplify complex CLI operations into intuitive GUI interactions to boost overall developer experience (DevEx)
Qualifications
Required
2+ years of experience in software engineering, backend development, or distributed systems
Strong programming skills in Python; Scala, Java, or C++ a plus
Familiarity with distributed frameworks (Spark, Dask, Ray) and cloud platforms (AWS/GCP/Azure)
Experience with workflow orchestration tools (Temporal, Celery, or Airflow)
Proficiency with Infrastructure as Code (Terraform) and CI/CD tools (GitHub Actions)
Experience building web applications or internal tools using React or Next.js
A 'product-first' mindset: an interest in how users interact with infrastructure and a desire to build clean, functional interfaces
Preferred
Experience handling large-scale unstructured datasets (images, video, binaries, or 3D/2D assets)
Familiarity with AI/ML training data pipelines, including dataset versioning, augmentation, and sharding
Exposure to computer graphics or 3D/2D data processing
Experience using Kubernetes (K8s) for distributed workloads and cluster orchestration
Experience with data lakehouse platforms, specifically Databricks and Databricks Asset Bundles (DABs)
Familiarity with GPU-accelerated computing and HPC clusters
Experience with 3D/2D asset processing (geometry transformations, rendering pipelines)
Located in or near one of our employee hubs: Bay Area, CA; Seattle, WA
Benefits
Stock options available for core team members.
401(k) plan for employees.
Comprehensive health, dental, and vision insurance.
The latest and best office equipment.