Hardware Systems Engineer, NPI AI jobs in United States

Meta · 11 hours ago

Hardware Systems Engineer, NPI AI

Meta builds technologies that help people connect, find communities, and grow businesses. They are seeking a Hardware Systems Engineer to drive end-to-end system validation and lead the deployment of cutting-edge hardware systems in large-scale data center applications.

Computer Software
Comp. & Benefits

Responsibilities

Drive and execute end-to-end system validation strategy (hardware and software), with a focus on various AI/HPC hardware systems in datacenter applications
Lead the bring-up, validation, and deployment of cutting-edge hardware systems at large scale, with active hands-on participation
Explore new use cases with customer teams and identify related test methodologies/test cases accordingly
Investigate and troubleshoot complex failures potentially related to hardware systems with cross-functional teams, which may involve different stacks such as silicon, firmware, and software
Triage failures and continue root-causing them while driving project development work forward
Identify gaps and opportunities to improve test process and test methodologies across the NPI space
Guide automation efforts and data analysis for NPI projects through engagement with related cross-functional teams
Communicate project progress and assessments to related internal and external teams

Qualifications

AI Silicon
HPC architecture
Silicon troubleshooting
System validation
Firmware validation
Test specifications
Linux proficiency
Debugging tools
Continuous integration
Communication skills
Team collaboration

Required

8+ years of experience in hands-on SW, FW or HW engineering to build any of the following products (AI Silicon, GPUs, TPUs, Autonomous cars, AI servers)
Experience in one or more domains such as: ASIC development (silicon design, bring-up, characterization, validation), board-level debug, firmware validation, system validation
Experience leading silicon or system troubleshooting and debugging
Experience in developing test specifications, procedures, and debug guides for test solutions

Preferred

Proficiency in High-Performance Computing (HPC) or AI system architecture at rack level and at scale
5+ years of experience with one or more of the following modules/domains: PCIe, NVLink, Networking, Flash, Memory, CPU, GPU, TPU, DRAM (DDR4/5 or HBM), AI silicon/AI accelerators
Hands-on experience in software, firmware, and hardware engineering to develop systems/products for datacenter applications such as video processing, AI/ML, and networking
Experience defining HW/SW interface requirements for telemetry, diagnostics, and debugging
Proficiency in Linux environment and server system management
Experience with debugging tools for SoCs (e.g., JTAG, GDB, Trace32) and knowledge of common bus protocols such as I2C, SPI, USB, and PCIe
Experience in using continuous integration and version control tools for system development and testing
Experience integrating lab tools for automated workflows and managing large-scale deployments

Benefits

Bonus
Equity
Benefits

Company

Meta's mission is to build the future of human connection and the technology that makes it possible.

Funding

Current Stage
Late Stage

Leadership Team

Kathryn Glickman
Director, CEO Communications
Christine Lu
CTO Business Engineering NA
Company data provided by Crunchbase