TensorWave · 4 months ago
Kubernetes Platform Engineer
TensorWave is leading the charge in AI compute, building a versatile cloud platform driving the next generation of AI innovation. As a Kubernetes Platform Engineer, you will maintain the stability and reliability of bare-metal Kubernetes infrastructure while collaborating with senior engineers on troubleshooting and operations across multi-tenant workloads.
AI Infrastructure · Artificial Intelligence (AI) · Cloud Computing · Cloud Infrastructure · Generative AI · IaaS
Responsibilities
Own and troubleshoot operational issues within Kubernetes environments
Maintain and monitor core services (e.g., Cilium, HAProxy, Prometheus)
Ensure uptime, performance, and reliability of multi-tenant clusters
Assist with Ingress/Egress connectivity and network debugging
Support internal and customer teams in secure, isolated VPC environments
Collaborate with senior engineers on automation and cluster lifecycle improvements
Qualifications
Required
2–4 years of experience in DevOps, SRE, or Linux infrastructure roles
1+ years of hands-on experience with Kubernetes in production
Familiarity with networking, CNI plugins, and core Linux troubleshooting
Strong infrastructure-as-code mindset using tools like Helm, Terraform, or Ansible
Solid experience with monitoring and logging tools (e.g., Prometheus, Grafana, Loki)
Understanding of secure infrastructure design principles and least-privilege access
Comfortable working in a team-oriented, fast-paced operational environment
Preferred
Experience with RKE2, Rancher, or similar platforms
Experience troubleshooting or supporting AI or GPU-based workloads
Familiarity with HAProxy, Cilium, or other Kubernetes ingress/networking tools
Benefits
Stock Options
100% paid Medical, Dental, and Vision insurance
Life and Voluntary Supplemental Insurance
Short Term Disability Insurance
Flexible Spending Account
401(k)
Flexible PTO
Paid Holidays
Parental Leave
Mental Health Benefits through Spring Health
Company
TensorWave
TensorWave is an AMD GPU-exclusive cloud that supports training and inference at scale.
Funding
Current Stage: Growth Stage
Total Funding: $146.71M
Key Investors: Nexus Venture Partners, FundNV
2025-05-14 · Series A · $100M
2024-10-08 · Seed · $43M
2024-04-23 · Seed · $0.89M
Company data provided by Crunchbase