Principal Software Developer, GPU/AI and Compute Platforms jobs in United States

Oracle · 3 months ago

Principal Software Developer, GPU/AI and Compute Platforms

Oracle is seeking a highly driven Principal Software Developer to join their Cloud Infrastructure development team, focusing on GPU and AI platforms. The role involves hardware development oversight, system integration, and collaboration with internal and external partners to enhance Oracle's Cloud AI solutions.

Data Governance · Data Management · Enterprise Software · Information Technology · SaaS · Software
H1B Sponsor Likely

Responsibilities

Review and assessment of third-party merchant silicon
Evaluation of system architecture and proposed implementation path analysis
You will participate in platform definition and analysis
Provide platform development oversight for partners
Work with in-house engineering functional experts on design and reviews
Support system integration, performance testing, debug and characterization
Support program managers on technical assessments
You will interact closely with third party GPU IC suppliers & partners as well as internal hardware, software development, quality assurance, cloud orchestration, hardware and software security experts, and Oracle manufacturing teams
You will document and specify design intent and design details where appropriate in collaboration with the appropriate engineering teams
Participate in hardware platform security evaluations
Guide internal Oracle partner teams on the support needed to scale, monitor, and successfully deploy our products to the Cloud
You will assist Oracle Cloud and Support teams in root-causing potential hardware or software bugs through firsthand lab replication and debug, remote debug, and calls with the appropriate teams supporting our deployed products
Work with Oracle manufacturing teams to ensure that Oracle hardware is secure, robustly evaluated, performing at peak capabilities and well qualified for deployment to our Cloud customers
Work directly with hardware design and development teams on architecture, implementation, development, deployment, and troubleshooting of AI hardware platforms. Collaboration is also expected with the wider Oracle engineering and operations functional groups as well as our external partners
Develop, implement, and run the day-to-day execution of AI platform development, both internally and in partnership with third-party design teams. This includes reviews of design plans, schematics, board layout, test feature definition and guidance for subsystem test, and system validation plans. Work on system and hardware integration and on system test and qualification, work with software diagnostics engineers to test functionality, and use third-party as well as approved open-source AI platform qualification test tools. Add to a roster of system characterization and performance testing capabilities and support the definition of in-service system monitoring and error reporting needs
Work closely with hardware developers, system architects, system engineers, technical leads, platform firmware developers, partners and AI chip / GPU suppliers, and storage, networking, and compute experts on product development, and then with Manufacturing and external suppliers across the new product introduction process out to production. You will also serve as one of the last levels of engineering technical support when cloud and support teams require guidance and help in resolving complex deployed product issues

Qualification

GPU hardware development · AI platform architecture · Firmware diagnostics tools · Board ECAD tools · Debugging complex issues · Scripting for tests · High-speed buses knowledge · Problem isolation · Collaboration with partners · Communication skills

Required

Technical hands-on experience with market leading GPU (or alternate AI platforms) from the hardware and platform development, test, and characterization perspectives
Good knowledge of AI / GPU platform architecture and their capabilities
A strong understanding of, and experience running, firmware and system diagnostics tools using BMC firmware, UEFI/BIOS, and Linux tools. Skilled in scripting to customize tests
Demonstrated working experience with GPU supplier test code as well as open-source AI test / characterization tools
Experience with the design and implementation of modern server platforms consisting of multiple architectures and vendors, including x86 and ARM server architectures
Experience with hardware development at the board and FPGA level
Experience with board-level ECAD tools and the ability to review hierarchical schematics, advanced multilayer board layout, cross-board interconnect, and end-to-end connectivity analysis
Strong communication skills and the ability to clearly communicate complex technical issues across engineering disciplines, as well as to clearly and succinctly articulate issues for executives
Demonstrated experience debugging and root-causing complex issues that may have a mix of hardware and software causes
Experience with early-stage bring-up and power-on, platform firmware debugging, and debugging of prototype GPU & CPU complexes and memory complexes
An ability to isolate a problem to the source and the required creativity & expertise to devise timely and robust solutions
Experience with and understanding of the latest high-speed buses and interconnects used in modern Compute and AI platforms. Familiarity with their startup connectivity and operational robustness

Preferred

Demonstrated knowledge of 'low-level' hardware component interfaces, including, but not limited to, PCIe, SPI, I2C (incl. SMBus, PMBus), LPC, and eSPI
Comfortable with the use of hardware debuggers, oscilloscopes, and advanced signal characterization measurement tools
Experience with platform-level security technologies is an advantage in the role

Company

Oracle is an integrated cloud application and platform services company that sells a range of enterprise information technology solutions.

H1B Sponsorship

Oracle has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data Powered by US Department of Labor)
Trends of Total Sponsorships
2025 (1271)
2024 (846)
2023 (995)
2022 (1192)
2021 (985)
2020 (755)

Funding

Current Stage
Public Company
Total Funding
$25.75B
Key Investors
Sequoia Capital
2025-09-24 · Post-IPO Debt · $18B
2025-02-03 · Post-IPO Debt · $7.75B
1986-03-12 · IPO

Leadership Team

Esteban Rubens
Healthcare Field CTO
Gerard Warrens
Field CTO, Business Strategy and Transformative Technologies
Company data provided by Crunchbase