Senior Web Scraping Engineer — Labrynth jobs in United States

Infinity Constellation · 1 month ago

Senior Web Scraping Engineer — Labrynth

Infinity Constellation is a Silicon Valley startup building next-generation Hermeneutical-Agent systems that apply AI to complex regulations. The Senior Web Scraping Engineer will design and maintain large-scale data collection systems, using a range of scraping tools and LLM techniques to make data extraction more robust and improve its quality.

Artificial Intelligence (AI) · Information Technology · SaaS
H1B Sponsor Likely

Responsibilities

Design, implement, and maintain web scraping pipelines for a wide variety of websites and data sources
Build scrapers using tools and frameworks such as Selenium, Playwright, BeautifulSoup, Scrapy (and similar libraries) with a focus on reliability, performance, and maintainability
Create automated workflows for scraping and data processing:
  Containerize scraping jobs (e.g., using Docker)
  Deploy and orchestrate them in the cloud (e.g., AWS, GCP, Azure)
  Configure scheduling (e.g., run daily/weekly/hourly) and dependency management
Implement monitoring, alerting, and logging:
  Capture detailed logs for each job run
  Track job statuses and failures
  Implement notifications/alerts when a scraper breaks or a website changes
Handle anti-bot measures (proxies, captchas, rate limits) and design scrapers that are resilient to layout and structure changes
Work closely with data engineering / product / ML teams to understand data requirements and ensure data quality
Use LLMs (Large Language Models) to:
  Parse and extract structured information from messy HTML or semi-structured content
  Increase robustness of scrapers to frequent UI/DOM changes
  Prototype new scraping / extraction strategies using LLM APIs
Write clean, well-tested, and well-documented code, and contribute to best practices, code reviews, and tooling for the team
Continuously improve the scraping platform, including performance optimizations, standardization, and reusability of components

Qualifications

Python · Selenium · Web Scraping · Cloud Platforms · LLMs · ETL/ELT Pipelines · Docker · HTML/CSS/JavaScript · Problem-Solving · Communication Skills

Required

3+ years of professional experience working with web scraping or data collection at scale
Strong proficiency in Python and common scraping libraries/frameworks such as Selenium, Playwright, BeautifulSoup, Scrapy (or similar)
Solid understanding of HTML, CSS, JavaScript, HTTP, and browser behavior
Experience building automated, production-grade workflows:
  Orchestrators / schedulers (e.g., Airflow, Prefect, Dagster, or similar)
  Building ETL/ELT pipelines and integrating with databases, data warehouses, or storage (e.g., PostgreSQL, BigQuery, S3, GCS)
Hands-on experience with cloud platforms (AWS, GCP, or Azure), including:
  Deploying and running scheduled jobs
  Managing infrastructure-as-code or similar deployment processes
Strong experience with logging, monitoring, and alerting:
  Ability to design logging for scraping jobs and to debug failures from logs
  Familiarity with tools like CloudWatch, Stackdriver, ELK, Prometheus, Grafana, or similar
Experience with containers (Docker) and familiarity with CI/CD workflows
Exposure to LLMs (e.g., from OpenAI or Anthropic) for tasks like parsing, information extraction, or automation
Strong problem-solving skills and the ability to debug complex, dynamic websites
Comfortable working in a fast-paced environment, with good communication skills in English

Preferred

Experience with Kubernetes or other container orchestration systems
Experience dealing with large-scale crawling, distributed scraping, and high-concurrency systems
Familiarity with handling CAPTCHAs, rotating proxies, and headless browsers at scale
Background in data engineering
Contributions to open-source web scraping tools or frameworks

Benefits

Fully remote, flexible hours
Payment in USD (contractor/freelance basis)

Company

Infinity Constellation

Infinity Constellation is an AI-focused incubator that fully funds and supports founders to build fast-growing, service-based startups.

H1B Sponsorship

Infinity Constellation has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for reference. (Data powered by the US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship (chart not shown)
Trends of Total Sponsorships: 2025 (9)

Funding

Current Stage: Early Stage
Total Funding: $12M
Key Investors: Freestyle Capital
2025-05-27 · Seed · $12M

Leadership Team

Francis Pedraza
President & Executive Chairman, CSO, CIO & CEO
Company data provided by Crunchbase