Wafer · 3 months ago
Member of Technical Staff (Winter Intern)
Herdora is focused on building the future of inference, GPU optimization, and AI infrastructure. The team is seeking a Winter Intern to build scalable infrastructure for AI model training and inference and to lead technical decisions and architecture choices.
Generative AI · Machine Learning · Software · Software Engineering
Responsibilities
Build scalable infrastructure for AI model training and inference
Lead technical decisions and architecture choices
Qualifications
Required
Deep understanding of GPU architectures, CUDA programming, and parallel computing patterns
Proficiency in PyTorch, TensorFlow, or JAX, particularly for GPU-accelerated workloads
Strong grounding in large language models (training, fine-tuning, prompting, evaluation)
Proficiency in C++ and Python, and optionally Rust or Go, for building tooling around CUDA
Preferred
Publications or open-source contributions in inference, GPU computing, or ML/AI for code are a plus
Hands-on experience with large-scale experiments, benchmarking, and performance tuning (see the illustrative sketch below)
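The benchmarking and performance-tuning work mentioned above usually starts with careful GPU timing. Below is a minimal sketch of such a measurement, assuming PyTorch on a CUDA-capable machine; the fp16 matmul workload, the sizes, and the benchmark_matmul name are arbitrary illustrative choices.

```python
# Minimal GPU micro-benchmark sketch (illustrative; assumes PyTorch + a CUDA device).
import torch

def benchmark_matmul(n: int = 4096, warmup: int = 5, iters: int = 20) -> float:
    """Time an n x n fp16 matmul on the GPU and return mean latency in milliseconds."""
    device = torch.device("cuda")
    a = torch.randn(n, n, device=device, dtype=torch.float16)
    b = torch.randn(n, n, device=device, dtype=torch.float16)

    # Warm up so one-time costs (kernel selection, caches) do not skew the timing.
    for _ in range(warmup):
        torch.matmul(a, b)
    torch.cuda.synchronize()

    # CUDA events measure device-side time, avoiding host-side clock skew.
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    for _ in range(iters):
        torch.matmul(a, b)
    end.record()
    torch.cuda.synchronize()
    return start.elapsed_time(end) / iters

if __name__ == "__main__":
    print(f"mean matmul latency: {benchmark_matmul():.3f} ms")
```

Warmup iterations plus device-side CUDA events (rather than host timers) are the standard way to keep kernel-launch and one-time setup overhead out of the reported latency.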
Company
Wafer
Latency-Optimized Inference. Custom-Built for Your Stack.
H-1B Sponsorship
Wafer has a track record of offering H-1B sponsorship. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Chart: Distribution of Different Job Fields Receiving Sponsorship (marker indicates job fields similar to this one)
Chart: Trends of Total Sponsorships (2020: 1 sponsorship)
Funding
Current Stage: Early Stage
Total Funding: $0.8M
Key Investors: Y Combinator
Pre Seed (2025-07-17): $0.8M
Company data provided by crunchbase