Straive · 9 hours ago

Data Scientist

San Francisco Bay Area

Full-time

Onsite

Mid, Senior Level

4+ years exp

Straive is a global leader in enterprise-grade data analytics and AI solutions, committed to empowering businesses with cutting-edge technology. They are seeking a Data Scientist to focus on model reproduction, feature engineering, and performance validation, ensuring alignment with client modeling frameworks.

CRM, Outsourcing

Hiring Manager

Jeffrey Evans M Victor

Responsibilities

Rebuild and port the client's existing Python-based models to the customer’s Databricks platform

Develop, train, and validate predictive models using Python, PySpark, and ML frameworks such as scikit-learn, XGBoost, and Spark MLlib

Develop, validate, and reproduce feature engineering logic, ensuring parity with the client's models

Train, retrain, validate, and benchmark model performance using customer-provided datasets while maintaining performance parity with baseline models

Work with data engineers to define feature requirements and ensure datasets support model needs

Perform model diagnostics, bias checks, stability checks, and accuracy assessments

Prepare model documentation, validation summaries, and stakeholder-ready insights

Support scoring pipeline design and ensure reproducibility across Dev/QA/Prod

Collaborate with compliance and platform teams to ensure adherence to governance

Perform model diagnostics, hyperparameter tuning, and stability analysis

Evaluate model performance across population segments and time periods

Work with platform and engineering teams to support scoring pipeline deployment across Dev/QA/Prod

Qualifications

Python, Machine Learning, Databricks, Feature Engineering, SQL, Cloud Platforms, Statistical Analysis, Documentation, Collaboration

Required

4–6 years of experience in applied machine learning or data science

Strong hands-on experience with Python, scikit-learn, XGBoost, LightGBM, CatBoost, or similar libraries

Experience developing ML models in Databricks with Python or PySpark

Strong knowledge of feature engineering, model training workflows, and evaluation techniques

Experience working with large structured datasets (financial or transactional data preferred)

Ability to write clear documentation and communicate technical results to non-technical stakeholders

4+ years of hands-on experience developing, deploying, and maintaining machine-learning models

Advanced proficiency in Python (NumPy, pandas, scikit-learn, PyTorch or TensorFlow)

Strong statistical and mathematical foundation, including regression, classification, probability, and optimization

Experience building end-to-end ML pipelines: data ingestion, cleaning, feature engineering, modeling, evaluation, deployment

Experience working within client environments, including adapting to unfamiliar infrastructure, constraints, and security requirements

Experience with cloud platforms (AWS, Azure, or GCP) and on-prem environments

Advanced SQL ability and experience with big-data tools (Spark, Databricks, Hadoop)

Company

Straive

Glassdoor: 3.5

Straive is a global provider of technology-driven content and data services.

Founded in 2010

Nino, Iloilo, PHL

10001+ employees

https://www.straive.com/

Funding

Current Stage

Late Stage

Total Funding

unknown

2021-08-20: Acquired

Leadership Team

Ankor Rai

Chief Executive Officer

Lori Silverstein

SVP Global Sales, Data and Content Solutions

Recent News

Newsweek

AI Impact Awards 2025: Straive Helps Cities Beat the Heat as Temperatures Rise

2025-06-21

citybiz

Straive Acquires SG Analytics

2025-06-21

Canada NewsWire

Straive Acquires SG Analytics to Bolster Data Analytics & AI Operationalization Capabilities

2025-06-19

Company data provided by Crunchbase