Voltage Park · 3 weeks ago
Infrastructure Operations Engineer
Voltage Park is your enterprise AI factory, offering scalable compute power and bare metal AI infrastructure. We are seeking a highly skilled Infrastructure Operations Engineer to ensure the stability and performance of our compute, storage, and platform infrastructure, supporting AI/ML training and HPC workloads at scale.
AI Infrastructure · Cloud Computing · Machine Learning
Responsibilities
At the direction of the Manager of Infrastructure Operations, design, build, and roll out new platforms and patterns to minimize incidents and enable customer-facing and internal features
Deploy updates and improvements to support both Voltage Park’s internal and end customer use cases
Collaborate with colleagues in the Infrastructure Engineering, Network Operations, Customer Success, and Software and Platform Development teams
Participate in the on-call rotation, which is evenly distributed across all team members in a primary/secondary pattern: you serve as primary, then move to the secondary position
Qualifications
Required
8+ years working with Linux as a server / hosting platform, extra points for Ubuntu experience
5+ years experience with AWS
2+ years experience with Kubernetes and strong container fundamentals
2+ years experience with Terraform and Ansible
2+ years with network-attached storage management (via NFS, Ceph, or other protocols). Extra points for experience with VAST storage systems
Experience working in a Slack-first, asynchronous remote work environment
Experience with monitoring systems (Prometheus, ELK stack)
Familiarity with the GitOps workflow
Software development experience using Python, Go, Bash, or other languages for automation and for connecting systems and APIs
Deep networking fundamentals, extra points for experience with datacenter-level networks, 400Gb Ethernet, and InfiniBand
Experience building and delivering complex systems
Effective at navigating tradeoffs between design, risk, cost, and outcomes
Comfortable with navigating ambiguity
Strong written and oral communication
Preferred
Experience with bare metal hardware troubleshooting and provisioning, extra points for working with Dell hardware
Experience with GPU servers, either in bare metal form or under virtualization
Deep experience with network switches, routers, and firewalls, particularly SONiC switches, Palo Alto firewalls, and Juniper Networks equipment
Experience with VAST storage systems
Company
Voltage Park
Voltage Park provides infrastructure for machine learning.
Funding
Current Stage: Growth Stage
Total Funding: $500M
2023-10-30 · Undisclosed · $500M
Company data provided by Crunchbase