Software Engineer - Data Acquisition / Web Crawling jobs in United States

xAI · 1 day ago

Software Engineer - Data Acquisition / Web Crawling

xAI is dedicated to creating AI systems that can understand the universe and aid humanity. The Software Engineer specializing in Data Acquisition and Web Crawling will build and operate large-scale systems to collect and process vast amounts of data, collaborating with various teams to meet their data needs and push the boundaries of data engineering.

Artificial Intelligence (AI) · Foundational AI · Generative AI · Information Technology · Machine Learning
Growth Opportunities
H1B Sponsor Likely

Responsibilities

Building petabyte-scale, high-throughput data processing systems managing hundreds of petabytes to exabytes of data
Designing and operating large-scale distributed systems and pipelines processing hundreds of thousands to millions of operations per second
Managing workloads across large cloud compute clusters
Pre-processing datasets for AI training
Building and operating large-scale crawlers, and gathering and communicating requirements clearly and concisely

Qualifications

Distributed systems design · Data processing systems · Compiled languages · Performance optimization · SQL/NoSQL databases · Debugging skills · Curiosity · Communication skills · Work ethic

Required

Strong engineering skills with a passion for improving different aspects of data and model performance
Strong proficiency in at least one compiled language: Rust, Go, C++, or Java
Experience working on one or more modalities other than text, with demonstrated exceptional work
Experience building bespoke data processing libraries from scratch
Experience designing and implementing distributed systems in Rust
Ability to keep up with state-of-the-art techniques for preparing AI training data
Ability to organize and meticulously track data across multiple clouds, multiple modalities, and many sources
Great debugging skills are a must
Must have deep knowledge of how the internet works, including DNS, the OSI model, crawler architectures, the challenges of operating crawlers, and headless browsers

Preferred

Experience with performance optimization of large-scale systems is preferred
Experience with SQL/NoSQL databases, especially columnar databases, is a plus

Company

xAI

xAI is an artificial intelligence startup that develops AI solutions and tools to enhance reasoning and search capabilities.

H1B Sponsorship

xAI has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data provided by the US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship (chart not shown)
Trends of Total Sponsorships: 2025 (1)

Funding

Current Stage: Late Stage
Total Funding: $42.73B
Key Investors: Neptune Digital Assets, SpaceX, Morgan Stanley
2026-01-06 · Series E · $20B
2025-12-11 · Secondary Market · $0.3M
2025-07-13 · Corporate Round · $5.32B

Leadership Team

Greg Yang, Co-Founder
Yuhuai Wu, Co-Founder
Company data provided by Crunchbase