OKAYA INFOCOM · 13 hours ago
Remote--Data Scientist Architect--Salt Lake City, UT--Full Time
OKAYA INFOCOM is seeking a Data Scientist Architect to work remotely. The role involves interpreting data transformation logic, validating feature pipelines, and collaborating with model validation teams to ensure model performance and consistency.
Responsibilities
Interpret data transformation logic and validate feature pipelines from existing Java implementations
Run Python-converted models on historical datasets and validate output metrics against Java model benchmarks
Collaborate with model validation teams to review performance and consistency, and explain any metric deviations
Design unit tests and validation scenarios to support each migrated model’s readiness for signoff
Ingest model input data from parquet files using PySpark and pandas to reproduce training and scoring workflows
Conduct EDA and spot-check row-level predictions where needed
Collaborate with the customer team to understand the logic, structure and parameters of the Java-based XGBoost models
Qualifications
Required
7-10 years of hands-on experience with Python for machine learning, especially XGBoost, scikit-learn and NumPy/pandas
Proficiency in PySpark for reading, transforming and analyzing large datasets stored in parquet
Experience validating or reverse-engineering ML models from business logic or legacy implementations
Exposure to Java-based ML libraries, or an understanding of how model internals map across languages
Hands-on experience with Python meta-modelling frameworks and libraries