RELX · 1 month ago

Senior Data Scientist II

RELX is a global provider of information-based analytics and decision tools for professional and business customers. The role involves leading the design and evolution of a multimodal document understanding and structured data extraction platform, focusing on advanced AI technologies to enhance productivity in the legal market.

Analytics · Business Information Systems · Consulting · Information Services · Information Technology · Insurance · Risk Management
H1B Sponsor Likely

Responsibilities

Design and iterate the multimodal document parsing pipeline: layout / structural modeling, semantic extraction, cross‑modal alignment, structural reconstruction
Build and optimize a multi‑agent collaboration mechanism: task splitting, parallel / sequential scheduling, peer review, iterative quality improvement loops
Define model selection / composition / routing strategies (dynamic dispatch by document type, structural patterns, quality signals)
Plan and execute model fine‑tuning, domain adaptation, continual learning, active learning, and data feedback loops
Establish end‑to‑end metrics: extraction accuracy, structural consistency, agent collaboration effectiveness, latency, stability, and cost
Build quality assurance and risk controls: drift & anomaly monitoring, confidence estimation, fallback strategies, alignment / compliance checks
Drive mapping and consistency between agent / model outputs and business knowledge field standards

Qualifications

Machine Learning · Deep Learning · Multimodal Models · Python · Statistical Analysis · Document Understanding · Image Processing · Problem Decomposition · Quality Assurance · Team Leadership

Required

Education: Master's degree or above in a quantitative or technical field (Statistics, Computer Science, Mathematics, Data Science, etc.)
Experience: 5+ years of hands‑on machine learning / data science experience, with proven delivery in multimodal (vision + text) or complex document understanding and practical experience orchestrating agents (or modular processing logic) in production workflows
Solid foundation in machine learning / deep learning fundamentals, multimodal representations, and cross‑modal alignment concepts
Deep understanding of core principles and common algorithms for multimodal large models: cross‑modal attention & representation alignment, vision/text embedding fusion, hierarchical & layout structure modeling, instruction & contrastive paradigms, long‑context and retrieval‑augmented mechanisms, evaluation and failure mode dissection
Familiar with classic image and signal processing methods: edge & contour detection, filtering & denoising, morphological operations, segmentation & key point feature extraction, frequency / time‑frequency analysis, image enhancement & quality assessment; understands trade‑offs and complementarity with deep features
Knowledge of multi‑agent collaboration patterns: role assignment, task routing, feedback loops, redundancy & cross‑checks
Strong in statistical analysis & experimental design: hypothesis testing, factorial design, power analysis, A/B and multivariate evaluation
Able to decompose complex problems and build metric‑driven optimization paths. Rigorous in data quality & error analysis; rapid bottleneck identification
Ability to translate research pseudo‑code into maintainable, testable Python modules with benchmarking & regression harnesses

Preferred

Designed customization / fine‑tuning of multimodal foundation models, representation learning, or structural understanding subsystems
Built an agent orchestration platform: task decomposition, iterative self‑checks, consensus or voting mechanisms
Experience solving robustness & generalization challenges in large‑scale long documents / heterogeneous layouts
Demonstrated results in cost optimization (model pruning, parameter‑efficient tuning, inference acceleration) or adaptive load scheduling
Publications / patents or open‑source contributions
Demonstrated Python systems optimization (e.g., custom Cython / CUDA kernels, vectorization replacing Python loops, latency reductions in inference pipelines)

Benefits

Health Benefits: Comprehensive, multi-carrier program for medical, dental and vision benefits
Retirement Benefits: 401(k) with match and an Employee Share Purchase Plan
Wellbeing: Wellness platform with incentives, Headspace app subscription, Employee Assistance and Time-off Programs
Short- and Long-Term Disability, Life and Accidental Death Insurance, Critical Illness, and Hospital Indemnity
Family Benefits, including bonding and family care leaves, adoption and surrogacy benefits
Health Savings, Health Care, Dependent Care and Commuter Spending Accounts
In addition to annual Paid Time Off, we offer up to two days of paid leave each to participate in Employee Resource Groups and to volunteer with your charity of choice

Company

RELX is a provider of information-based analytics for professional and business customers.

H1B Sponsorship

RELX has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
[Chart: Distribution of Different Job Fields Receiving Sponsorship; the highlighted segment represents job fields similar to this job]
Trends of Total Sponsorships
2025 (221)
2024 (187)
2023 (39)
2022 (30)
2021 (48)

Funding

Current Stage
Public Company
Total Funding
unknown
IPO: 1994-10-14

Leadership Team

Asim Fareeduddin
Head of Internal Audit & Assurance
Aurobindo Sundaram
CISO (Head of Information Assurance & Data Protection)
Company data provided by Crunchbase