Hydra Host · 2 months ago

Storage Engineer at Hydra Host

Miami, United States

Full-time

Onsite

Senior Level, Lead/Staff

$200K/yr - $300K/yr

8+ years exp

Hydra Host is a Founders Fund-backed NVIDIA cloud partner building infrastructure for AI at scale. They are seeking a Storage Engineer to lead the architecture, development, and deployment of their next-generation AI/HPC storage platform, focusing on designing and building a production-grade storage system to support bare-metal GPU clusters.

Artificial Intelligence (AI) · Cloud Infrastructure · Developer APIs · Web Hosting

Responsibilities

Define, architect, and implement Hydra Host’s first production storage platform tailored for bare-metal GPU clusters and AI/HPC workloads

Lead all technical decisions around storage stack design, from hardware infrastructure to parallel file system orchestration and performance tuning

Select, build, and maintain storage solutions spanning both block (NVMe, SAN, Ceph, etc.) and object storage (S3-compatible, custom, or Ceph Object Gateway) layers

Design for high-throughput, low-latency access, supporting large datasets, rapid checkpointing, and parallel access for distributed AI training workloads

Integrate and optimize parallel file systems such as Lustre, BeeGFS, Spectrum Scale, WekaIO, or CephFS, ensuring maximum performance and fault tolerance

Ensure compatibility across Hydra’s diverse GPU/OEM ecosystem, accounting for unique firmware, BMC/Redfish APIs, and hardware configurations

Develop automation, observability, and management tooling for storage, focusing on reliability, scalability, and efficiency

Act as both builder and architect: stay deeply hands-on in deployment, troubleshooting, and optimization while guiding the long-term storage roadmap

Collaborate cross-functionally with GPU, HPC, and platform engineering teams to integrate storage with compute and network layers

Interface with customers and product leadership to define feature priorities, performance benchmarks, and future enhancements

Qualifications

High-performance storage systems · Block storage NVMe · Block storage SAN · Object storage S3 · Object storage Ceph · Parallel file systems · Linux systems engineering · Automation · Scripting · AI/ML data pipelines · Problem-solving skills · Communication skills · Technical leadership

Required

8+ years of progressive, hands-on experience designing and implementing high-performance storage systems for compute clusters in HPC, AI, or bare-metal cloud environments

Proven track record building storage infrastructure from scratch, not just operating existing systems

Deep expertise in block storage (NVMe, SAN, Ceph, distributed block systems) and object storage (S3, MinIO, Ceph Object Gateway, etc.)

Strong background in parallel file systems (WekaIO, BeeGFS, Lustre, Spectrum Scale, or similar) supporting GPU or AI cluster workloads

Solid foundation in Linux systems engineering, automation, and scripting for distributed environments

Familiarity with BMC, Redfish APIs, and OEM server firmware for bare-metal management

Deep understanding of AI/ML data pipelines: model checkpointing, data locality, and multi-tiered storage optimization

Excellent problem-solving, debugging, and communication skills, with the ability to translate technical decisions into clear architectural direction

Preferred

Experience building storage solutions for large-scale GPU or HPC infrastructure

History of technical leadership or mentorship, growing teams or owning a product roadmap

Experience evaluating and managing vendor relationships and negotiating storage hardware/software contracts

Contributions to open-source HPC or storage projects (Ceph, Lustre, BeeGFS, etc.)

Familiarity with confidential computing, secure data handling, or high-availability architectures

Company

Hydra Host

Hydra offers a bare-metal GPU platform, connecting businesses to a variety of independent but standardized AI Factory Franchises.

Founded in 2021

Miami, Florida, USA

11-50 employees

https://www.hydrahost.com

Funding

Current Stage

Early Stage

Total Funding

$10M

Key Investors

Flume Ventures · Founders Fund

2025-02-10 · Seed

2024-09-12 · Seed

2022-04-06 · Seed · $10M

Leadership Team

Aaron Ginn

CEO & Co-Founder

Garrett Johnson

Co-Founder & COO

Recent News

PR Newswire

Ornn Compute Exchange and Hydra Host Partner to Financialize Compute

2025-10-23

Benzinga.com

Nvidia B300 Chips Order Touted By Nayib Bukele's Bitcoin Office — Why Does El Salvador Want These Powerful AI Processors?

2025-07-02

NVIDIA CORPORATION

NVIDIA DGX Cloud Lepton Connects Europe’s Developers to Global NVIDIA Compute Ecosystem

2025-06-11

Company data provided by Crunchbase