TensorWave · 11 hours ago
Storage Engineer
TensorWave is dedicated to building seamless and reliable AI infrastructure. They are seeking a Storage Engineer to design and operate NFS-based storage systems, ensuring high performance and scalability under heavy workloads.
AI Infrastructure · Artificial Intelligence (AI) · Cloud Computing · Cloud Infrastructure · Generative AI · IaaS
Responsibilities
Design, deploy, and operate NFS-based storage systems for production workloads
Own and operate VAST Data and WEKA clusters in production environments
Architect storage for high-throughput, low-latency shared file access
Tune and optimize NFS performance (mount options, client behavior, server-side tuning)
Manage capacity planning, scaling, and rebalancing for VAST and WEKA systems
Diagnose and resolve storage performance issues (latency spikes, metadata bottlenecks, throughput drops)
Design and test failure and recovery scenarios (node failures, network issues, disk loss)
Lead upgrades, expansions, and maintenance with minimal or zero downtime
Partner with infrastructure and application teams to ensure workloads are well-matched to storage behavior
Document operational runbooks and establish best practices for shared file storage
Qualifications
Required
Strong hands-on experience with NFS in production environments
Direct experience operating VAST Data and/or WEKA systems
Deep understanding of distributed file systems and shared storage architectures
Strong knowledge of storage performance fundamentals (latency, throughput, metadata operations)
Experience troubleshooting complex storage and networking interactions
Solid Linux systems knowledge, especially around filesystem and I/O behavior
Ability to reason about failure domains, recovery paths, and data integrity
Preferred
Experience supporting AI/ML, HPC, or data-intensive workloads
Familiarity with RDMA, high-speed networking, or NVMe-based storage
Experience with Kubernetes workloads backed by a shared file system
Experience with multi-rack or multi-site storage deployments
Infrastructure-as-Code or automation experience
Benefits
100% paid Medical, Dental, and Vision insurance
Life and Voluntary Supplemental Insurance
Short Term Disability Insurance
Flexible Spending Account
401(k)
Flexible PTO
Paid Holidays
Parental Leave
Mental Health Benefits through Spring Health
Company
TensorWave
TensorWave is an AMD GPU-exclusive cloud that supports training and inference at scale.
Funding
Current Stage: Growth Stage
Total Funding: $146.71M
Key Investors: Nexus Venture Partners, FundNV
2025-05-14 · Series A · $100M
2024-10-08 · Seed · $43M
2024-04-23 · Seed · $0.89M
Company data provided by Crunchbase