Data Scientist jobs in United States

Sciforium · 1 day ago

Data Scientist

Sciforium is an AI infrastructure company developing next-generation multimodal AI models and a proprietary, high-efficiency serving platform. The Data Scientist will design, develop, and refine AI models while bridging the gap between theoretical research and production-grade performance.

Artificial Intelligence (AI)

Responsibilities

Model Architecture Design: Develop and experiment with novel architectures for LLMs and generative AI, focusing on maximizing performance-per-watt and training throughput
Large-Scale Training Execution: Lead end-to-end training runs of foundation models, monitoring loss curves, stability, and convergence across massive multi-node clusters
Optimization & Scaling Laws: Apply scaling laws to predict model performance and optimize hyperparameters, tokenization strategies, and objective functions for trillion-parameter regimes
Data Engineering & Curation: Build and maintain sophisticated data pipelines that handle petabyte-scale pre-training datasets, ensuring high-quality signal through advanced filtering and deduplication
Algorithmic Profiling: Collaborate with the Training Engineering team to profile how specific model layers (e.g., Attention mechanisms, MoE layers) interact with GPU/accelerator memory and interconnects
Evaluation & Benchmarking: Design robust evaluation frameworks to measure model capability across reasoning, coding, and creative tasks, ensuring alignment with safety and performance standards
Cross-Functional Collaboration: Partner with Infrastructure and Kernel engineers to co-design features that improve training efficiency and model FLOPs utilization (MFU)

Qualifications

Data Science, Machine Learning, Deep Learning, Python, Distributed Training, PyTorch, JAX, Spark, Ray, Linear Algebra, Calculus, Optimization

Required

5+ years of industry experience in Data Science, Machine Learning Research, or a closely related field, with a strong emphasis on deep learning
Bachelor's or Master's degree in Computer Science, Statistics, Mathematics, or another quantitative discipline
Expert-level Python skills with deep proficiency in PyTorch or JAX
Demonstrated experience training and deploying large-scale models (e.g., LLMs, diffusion models) in distributed production environments
Deep understanding of distributed training paradigms, including data parallelism, pipeline parallelism, and tensor parallelism
Strong mathematical foundation in linear algebra, calculus, and optimization, particularly as applied to neural network training and convergence
Experience working with data-at-scale tooling, such as Spark, Ray, or high-throughput data loading frameworks

Preferred

PhD in a relevant field, with publications at top-tier conferences (e.g., NeurIPS, ICML, ICLR)
Hands-on experience with Mixture-of-Experts (MoE) architectures, including routing and load-balancing challenges
Familiarity with RLHF workflows, including PPO and DPO fine-tuning pipelines
Knowledge of model quantization techniques (e.g., FP8, INT8, AWQ) and their impact on training stability and inference performance
Contributions to open-source ML libraries or involvement in high-profile LLM releases

Benefits

Medical, dental, and vision insurance
401(k) plan
Daily lunch, snacks, and beverages
Flexible time off
Competitive salary and equity

Company

Sciforium

Sciforium builds the next generation of AI models with unprecedented efficiency, privacy, and versatility.

Funding

Current Stage: Early Stage
Total Funding: $15.9M
2025-10-27 · Seed · $12M
2024-06-01 · Pre-Seed · $3.9M
Company data provided by Crunchbase