Weights & Biases · 8 hours ago
AI Engineer - Gen AI/SWE - Weights & Biases
Weights & Biases is part of CoreWeave, the AI Hyperscaler™, which aims to empower developers with tools and infrastructure for AI. The AI Engineer role involves designing, implementing, and evaluating LLM applications and agents, focusing on application rather than novel research, and ensuring responsible deployment and reproducibility.
Artificial Intelligence (AI), Data Visualization, Developer Tools, Generative AI, Machine Learning
Responsibilities
Ship end-to-end GenAI workflows (prompting → RAG → tools/agents → eval → serve) with reproducible repos, W&B Reports, and dashboards others can run
Build agentic systems (tool use, function calling, multi-step planners) with MCP servers/clients and secure tool/resource integrations
Design evaluation harnesses (RAG/agent evals, golden sets, regression tests, telemetry) and drive continuous improvement via offline + online metrics
Build in public: Publish engineering artifacts (code, docs, talks, tutorials) and engage with OSS and customer engineers; turn repeated patterns into reusable templates
Partner with product/solutions to launch LLM-powered features with clear latency/cost/SLO targets and safety/guardrail checks
Run growth experiments that track how the artifacts you build drive usage of the Weights & Biases suite of products
Qualifications
Required
Software engineering: 6+ years building production systems; strong Python or TypeScript + system design, testing, CI/CD, observability
GenAI apps: shipped LLM-powered features (tools/agents/function calling), with measurable impact (latency/cost/reliability)
Agentic patterns: implemented planners/executors, tool orchestration, sandboxing, and failure taxonomies; familiarity with agent infra concerns
RAG: pragmatic mastery of chunking, embeddings, vector/hybrid search, rerankers; experience with vector DBs/search indices and retrieval policy design
Evaluation: designed LLM/RAG/agent evals (offline golden sets, counterfactuals, user studies, guardrail tests); stats literacy (variance, CIs, power)
Serving & productization: comfortable with queueing, caching, streaming, and cost controls; can debug latency at model, retrieval, and network layers
Public signal: 2+ substantial OSS repos/blog posts/talks/videos with adoption (stars, forks, downloads, views) and reproducible artifacts
Preferred
Experience building with AI SDKs / agent frameworks (e.g., TypeScript/Python SDKs, planning libraries) and shipping developer-facing examples
Production agent security/sandboxing, red-teaming, and policy/PII enforcement
Operated eval platforms or built judge models/heuristics; experience leading metrics reviews with product/UX
Customer-facing enablement: templates or reference implementations adopted by external teams at scale
Benefits
Medical, dental, and vision insurance - 100% paid for by CoreWeave
Company-paid Life Insurance
Voluntary supplemental life insurance
Short and long-term disability insurance
Flexible Spending Account
Health Savings Account
Tuition Reimbursement
Ability to Participate in Employee Stock Purchase Program (ESPP)
Mental Wellness Benefits through Spring Health
Family-Forming support provided by Carrot
Paid Parental Leave
Flexible, full-service childcare support with Kinside
401(k) with a generous employer match
Flexible PTO
Catered lunch each day in our office and data center locations
A casual work environment
A work culture focused on innovative disruption
Company
Weights & Biases
Weights & Biases is a developer-first MLOps platform that builds machine learning performance visualization tools.
Funding
Current Stage: Growth Stage
Total Funding: $250M
Key Investors: NVIDIA, Insight Partners, Coatue
2025-03-04: Acquired
2023-09-01: Secondary Market
2023-08-09: Series Unknown · $50M
Company data provided by Crunchbase