CoreWeave · 22 hours ago
Senior Systems Engineer, OS Automation
CoreWeave is The Essential Cloud for AI™, providing a platform that enables innovators to build and scale AI with confidence. The Senior Systems Engineer will stabilize and scale Linux OS and kernel build pipelines and move them toward AI-native infrastructure by developing intelligent workflows that can autonomously address issues.
AI Infrastructure · Artificial Intelligence (AI) · Cloud Computing · Cloud Infrastructure · Information Technology · Machine Learning
Responsibilities
Pipeline Architecture: Design, maintain, and automate reproducible OS image build pipelines for our massive fleet of GPU-accelerated servers
Kernel Distribution: Collaborate with kernel engineers to package, validate, and distribute custom Linux builds across Intel, AMD, and ARM architectures
Dependency Management: Build tooling to manage dependencies, versioning, and release workflows, ensuring hermetic builds
Telemetry & Metrics: Standardize the collection of build metrics to create a baseline for future AI modeling
"Smart" CI/CD & Auto-Remediation: Architect AI agents that ingest and analyze build logs in real-time. Develop systems that auto-triage errors, categorize failure patterns, and generate context-aware fix suggestions for engineering teams
Predictive Regression Modeling: Design ML workflows that utilize historical performance data to detect kernel and OS regressions (latency, throughput, stability) in staging environments before they impact production
Dynamic Kernel Tuning: Implement closed-loop feedback systems that analyze real-time system metrics and automatically suggest or apply sysctl parameter optimizations for specific customer workloads
Next-Gen ChatOps: Engineer LLM-driven interfaces for Slack/internal tools, enabling stakeholders to query build statuses, request log summaries, or provision resources using natural language commands
Qualifications
Required
4+ years of professional experience in Linux Systems Engineering, Release Engineering, or DevOps
Deep knowledge of Linux internals (boot process, kernel modules, networking stack)
Experience with package management (Debian/Ubuntu) and build systems
Strong proficiency in Python (essential for the AI integration aspects of this role)
Demonstrable experience integrating API-based AI models (OpenAI, Anthropic, or local open-source models) into software workflows
Understanding of RAG (Retrieval-Augmented Generation) architectures for querying technical documentation or logs
Experience building event-driven automation (e.g., using webhooks to trigger analysis agents)
Familiarity with data structures required for vector search or time-series analysis
Preferred
Experience with Kubeflow or MLflow
Background in High-Performance Computing (HPC)
Experience fine-tuning small language models (SLMs) for code or log analysis tasks
Benefits
Medical, dental, and vision insurance - 100% paid for by CoreWeave
Company-paid Life Insurance
Voluntary supplemental life insurance
Short- and long-term disability insurance
Flexible Spending Account
Health Savings Account
Tuition Reimbursement
Ability to participate in the Employee Stock Purchase Program (ESPP)
Mental Wellness Benefits through Spring Health
Family-Forming support provided by Carrot
Paid Parental Leave
Flexible, full-service childcare support with Kinside
401(k) with a generous employer match
Flexible PTO
Catered lunch each day in our office and data center locations
A casual work environment
A work culture focused on innovative disruption
Company
CoreWeave
CoreWeave is a cloud-based AI infrastructure company offering GPU cloud services to simplify AI and machine learning workloads.
Funding
Current Stage: Public Company
Total Funding: $23.37B
Key Investors: Jane Street Capital, Stack Capital, Coatue
2025-12-08 · Post-IPO Debt · $2.54B
2025-11-12 · Post-IPO Debt · $1B
2025-08-20 · Post-IPO Secondary
Company data provided by Crunchbase