DDN · 2 months ago
Staff Software Engineer – Infinia L4
DataDirect Networks (DDN) is a global leader in AI and multi-cloud data management at scale, renowned for powering demanding AI data centers. The Staff Software Engineer – Infinia L3 will be responsible for addressing complex issues in enterprise environments, collaborating with various teams to improve product resiliency and enhance customer support experiences.
AI Infrastructure, Artificial Intelligence (AI), Big Data, Data Storage, Enterprise Software, Predictive Analytics
Responsibilities
Own critical customer case escalations end-to-end, including deep root cause analysis and mitigation strategies
Act as one of the technical escalation points for Infinia incidents — especially in production-impacting scenarios
Lead war rooms, live incident bridges, and cross-functional response efforts with other engineering, QA, and Field teams
Utilize AI-powered debugging, log analysis, and system pattern recognition tools to accelerate resolution
Become a subject-matter expert on Infinia internals: metadata handling, storage fabric interfaces, performance tuning, AI integration, etc.
Reproduce complex customer issues and propose product improvements or workarounds
Author and maintain detailed runbooks, performance tuning guides, and RCA documentation
Feed real-world support insights back into the development cycle to improve reliability and diagnostics
Partner with Field CTOs, Solutions Architects, and Sales Engineers to ensure customer success
Translate technical issues into executive-ready summaries and business impact statements
Participate in post-mortems and executive briefings for strategic accounts
Drive adoption of observability, automation, and self-healing support mechanisms using AI/ML tools
Deliver training to customer support and field engineering teams
Qualifications
Required
8+ years in enterprise storage, distributed systems, or cloud infrastructure support/engineering
Deep understanding of storage protocols and file system interfaces (S3, POSIX, NFS), storage performance, and Linux kernel internals
Proven debugging skills at system/protocol/app levels (e.g., strace, tcpdump, perf)
Hands-on experience with troubleshooting on Linux
Exposure to RDMA, NVMe-oF, or high-performance networking stacks
Exceptional communication and executive reporting skills
Experience using AI tools (e.g., log pattern analysis, LLM-based summarisation, automated RCA tooling) to accelerate diagnostics and reduce MTTR
Preferred
Experience with DDN, VAST, Weka, or similar scale-out file systems
Strong scripting/coding ability in Python, Bash, or Go
Familiarity with observability platforms: Prometheus, Grafana, ELK, OpenTelemetry
Knowledge of replication, consistency models, and data integrity mechanisms
Exposure to Sovereign AI, LLM training environments, or autonomous system data architectures
Company
DDN
DDN (DataDirect Networks) is the world’s leading AI and data intelligence company, empowering organizations to maximize the value of their data with end-to-end HPC and AI-focused solutions.
H1B Sponsorship
DDN has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by US Department of Labor)
Trends of Total Sponsorships
2025 (5)
2024 (3)
2023 (2)
2022 (4)
2021 (3)
Funding
Current Stage: Late Stage
Total Funding: $309.9M
Key Investors: Blackstone Group
2025-01-09 · Private Equity · $300M
2002-01-02 · Series A · $9.9M
Company data provided by Crunchbase