Huntech USA LLC · 3 days ago
Principal Platform RAS Software Systems Engineer
Huntech USA LLC is seeking a Principal Platform RAS Software Systems Engineer to create and deploy robust reliability, availability, and serviceability (RAS) and manageability solutions for cutting-edge compute platforms based on custom CPUs. The role involves providing technical leadership, collaborating with cross-functional teams, and ensuring that RAS solutions are integrated while system health is maintained.
Responsibilities
Help design, integrate, and validate at‑scale RAS features for ARM‑based compute platforms and develop manageability solutions to monitor and maintain system health
Actively engage with the ARM and OCP community to stay updated on the latest developments and ensure alignment of the architected solution with community direction
Collaborate with cross‑functional teams, including SoC, CPU HW, HLOS, and BIOS software teams, to ensure seamless adoption and integration of the RAS solution by OEMs and hyperscalers
Provide technical leadership and oversight to various HW and SW teams, ensuring compliance with required ARM specifications
Collaborate with customers to guide and support the development of custom software solutions leveraging custom CPUs
Prepare and present clear and comprehensive technical documentation and reports for stakeholders, including engineering teams, senior management, customers, and suppliers
Partner with internal teams, marketing, end‑customers, OEMs, and suppliers to create software roadmaps and detailed requirement documentation
Qualifications
Required
Master's degree in Computer Science/Engineering, Electrical Engineering, or a related field
10+ years of experience in designing software and firmware for various compute environments
Strong expertise in modern operating systems, ARM64 architectures, hypervisors, software reliability and manageability, and software development methodologies
Deep proficiency in the Linux kernel, RAS, system manageability, DDR, PCIe, I2C, SPI, and MDIO
15+ years of experience in software development and design for commercially deployed compute platforms
In‑depth knowledge of ARM architectures for various compute environments and relevant BSA, BBR, and manageability specifications
Deep understanding of the ARM RAS specification, ARM CPU RAS extensions, and the related software component specifications (SDEI, APEI, UEFI CPER)
Proven success in architecting and delivering solutions for a commercially deployed compute environment
Practical experience with in‑lab debugging tools
Strong technical documentation skills and excellent written and verbal communication