Software Engineer - Data Acquisition / Web Crawling jobs in United States

xAI · 5 days ago

Software Engineer - Data Acquisition / Web Crawling

xAI is on a mission to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. As a Software Engineer specializing in Data Acquisition and Web Crawling, you will build and operate large-scale distributed systems to collect and process vast amounts of data, supporting the development of advanced AI models. This role requires strong engineering skills and a passion for tackling complex data challenges in a collaborative environment.

Artificial Intelligence (AI), Foundational AI, Generative AI, Information Technology, Machine Learning
Growth Opportunities
H1B Sponsor Likely

Responsibilities

Building petabyte-scale, high-throughput data processing systems managing hundreds of petabytes to exabytes of data
Designing and operating large-scale distributed systems and pipelines processing hundreds of thousands to millions of operations per second
Managing workloads across large cloud compute clusters
Pre-processing datasets for AI training
Building and operating large-scale crawlers
Gathering and communicating requirements clearly and concisely

Qualifications

Distributed systems design, Data processing systems, Compiled languages, Performance optimization, SQL/NoSQL databases, Debugging skills, Data bookkeeping, Internet knowledge, Communication skills

Required

Strong engineering skills with a passion for improving different aspects of data and model performance
Strong proficiency in at least one compiled language: Rust, Go, C++, or Java
Experience with one or more modalities other than text, with demonstrated exceptional work
Building bespoke data processing libraries from scratch
Designing and implementing distributed systems in Rust
Keeping up with state-of-the-art techniques for preparing AI training data
Organizing and meticulously bookkeeping data across multiple clouds, of multiple modalities, and from many sources
Great debugging skills are a must
Deep knowledge of how the internet works, including DNS, the OSI model, crawler architectures, the challenges of operating crawlers, and headless browsers

Preferred

Experience with performance optimization of large-scale systems is preferred
Experience with SQL/NoSQL databases, especially columnar databases, is a plus

Company

xAI

xAI is an artificial intelligence startup that develops AI solutions and tools to enhance reasoning and search capabilities.

H1B Sponsorship

xAI has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship (chart not shown)
Trends of Total Sponsorships: 2025 (1)

Funding

Current Stage: Late Stage
Total Funding: $42.73B
Key Investors: Neptune Digital Assets, SpaceX, Morgan Stanley
2026-01-06: Series E, $20B
2025-12-11: Secondary Market, $0.3M
2025-07-13: Corporate Round, $5.32B

Leadership Team

Toby Pohlen
Founding Member
Company data provided by Crunchbase