Wafer · 3 months ago
Member of Technical Staff (Winter Intern)
Herdora is focused on building the future of inference, GPU optimization, and AI infrastructure. The team is seeking a Winter Intern to build scalable infrastructure for AI model training and inference and to lead technical decisions and architecture choices.
Generative AI · Machine Learning · Software · Software Engineering
Responsibilities
Build scalable infrastructure for AI model training and inference
Lead technical decisions and architecture choices
Qualifications
Required
Deep understanding of GPU architectures, CUDA programming, and parallel computing patterns
Proficiency in PyTorch, TensorFlow, or JAX, particularly for GPU-accelerated workloads
Strong grounding in large language models (training, fine-tuning, prompting, evaluation)
Proficiency in C++ and Python, and optionally Rust or Go, for building tooling around CUDA
Preferred
Publications or open-source contributions in inference, GPU computing, or ML/AI for code are a plus
Hands-on experience with large-scale experiments, benchmarking, and performance tuning (see the illustrative sketch below)
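The benchmarking and performance-tuning work mentioned above usually starts with careful GPU timing. Below is a minimal sketch of such a measurement, assuming PyTorch on a CUDA-capable machine; the fp16 matmul workload, the sizes, and the benchmark_matmul name are arbitrary illustrative choices.

```python
# Minimal GPU micro-benchmark sketch (illustrative; assumes PyTorch + a CUDA device).
import torch

def benchmark_matmul(n: int = 4096, warmup: int = 5, iters: int = 20) -> float:
    """Time an n x n fp16 matmul on the GPU and return mean latency in milliseconds."""
    device = torch.device("cuda")
    a = torch.randn(n, n, device=device, dtype=torch.float16)
    b = torch.randn(n, n, device=device, dtype=torch.float16)

    # Warm up so one-time costs (kernel selection, caches) do not skew the timing.
    for _ in range(warmup):
        torch.matmul(a, b)
    torch.cuda.synchronize()

    # CUDA events measure device-side time, avoiding host-side clock skew.
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    for _ in range(iters):
        torch.matmul(a, b)
    end.record()
    torch.cuda.synchronize()
    return start.elapsed_time(end) / iters

if __name__ == "__main__":
    print(f"mean matmul latency: {benchmark_matmul():.3f} ms")
```

Warmup iterations plus device-side CUDA events (rather than host timers) are the standard way to keep kernel-launch and one-time setup overhead out of the reported latency.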
Company
Wafer
Latency-Optimized Inference. Custom-Built for Your Stack.
H-1B Sponsorship
Wafer has a track record of offering H-1B sponsorship. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Chart: Distribution of Different Job Fields Receiving Sponsorship (marker indicates job fields similar to this one)
Chart: Trends of Total Sponsorships (2020: 1 sponsorship)
Funding
Current Stage: Early Stage
Total Funding: $0.8M
Key Investors: Y Combinator
Pre Seed (2025-07-17): $0.8M
Company data provided by crunchbase