Associate Data Scientist - Environmental Modeling jobs in United States

Bayer · 12 hours ago

Associate Data Scientist - Environmental Modeling

Bayer is a global company focused on science and innovation in agriculture. They are seeking an Associate Data Scientist specializing in Environmental Modeling to design and build statistical and machine learning models for crop yield testing, automate analytics workflows, and develop methodologies for integrating various data types. The role involves collaboration to provide data-driven solutions to business problems and requires a strong foundation in quantitative fields.

Biotechnology · Chemical · Health Care · Life Science · Pharmaceutical
Comp. & Benefits
H1B Sponsor Likely

Responsibilities

Design & build statistical, machine learning and deep learning models to quantify subfield-scale yield testing environments of crops
Automate analytics workflows
Develop next generation methodologies for integrative usage of genomic, phenomic & environmental data
Determine environmental correlations among testing locations & global regions
Design statistical modeling frameworks & prediction models to drive product placement recommendations and yield predictions
Collaborate to provide data-driven statistical solutions to business problems
Use object-oriented programming techniques to write Python packages to analyze high-dimensional environmental data with Gap Statistics
Develop & select unsupervised learning algorithms to analyze high-dimensional environmental data, including K-means, agglomerative hierarchical clustering, and/or Gaussian mixture models
Use statistical & machine learning packages, including TensorFlow, Pandas, Multiprocessing, Joblib, NumPy, SciPy, Scikit-Learn, Keras, PyTorch, PySpark, and/or Dask, to develop discovery and production-ready models for analysis of phenotypic and geospatial data
Adhere to and/or enforce coding best practices
Use code management tools, including GitHub, to ensure the reproducibility of data science work
Aggregate & summarize complex datasets using GCP BigQuery, Presto, Superset, and AWS Redshift
Build heat, drought, and cold stress models over global regions using high-dimensional environmental data
Automate workflows using AWS SageMaker, Google Cloud Platform, Airflow, & Docker
Perform data operations, including spatial joins, zonal statistics, & re-projection
Quantify similarity scores between different environments & use distance metrics to compare multivariate time-series environmental data related to major row crops
Visualize geospatial data, including vector & raster files, using QGIS, Google BigQuery, and/or Python libraries
Perform data quality checks using deep learning-based anomaly detection on time-series data
Design, train & optimize autoencoder neural networks that generate embeddings for multivariate time-series data

Qualification

Python · Machine Learning · Statistical Modeling · Deep Learning · Object-Oriented Programming · Data Visualization · Data Operations · Cloud Computing · Collaboration · Problem Solving · Communication

Required

Master's in Statistics, Mathematics, or closely related quantitative field
1 year of experience in:
Using object-oriented programming techniques to write Python packages to analyze high-dimensional environmental data with Gap Statistics
Developing & selecting unsupervised learning algorithms to analyze high-dimensional environmental data, including K-means, agglomerative hierarchical clustering, and/or Gaussian mixture models
Using statistical & machine learning packages, including TensorFlow, Pandas, Multiprocessing, Joblib, NumPy, SciPy, Scikit-Learn, Keras, PyTorch, PySpark, and/or Dask, to develop discovery and production-ready models for analysis of phenotypic and geospatial data
Adhering to and/or enforcing coding best practices
Using code management tools, including GitHub, to ensure the reproducibility of data science work
Aggregating & summarizing complex datasets using GCP BigQuery, Presto, Superset, and AWS Redshift
Building heat, drought, and cold stress models over global regions using high-dimensional environmental data
Automating workflows using AWS SageMaker, Google Cloud Platform, Airflow, & Docker
Performing data operations, including spatial joins, zonal statistics, & re-projection
Quantifying similarity scores between different environments & using distance metrics to compare multivariate time-series environmental data related to major row crops
Visualizing geospatial data, including vector & raster files, using QGIS, Google BigQuery, and/or Python libraries
Performing data quality checks using deep learning-based anomaly detection on time-series data
Designing, training & optimizing autoencoder neural networks that generate embeddings for multivariate time-series data

Benefits

Health care
Vision
Dental
Retirement
PTO
Sick leave

Company

Bayer is a life science company that specializes in the areas of health care and agriculture.

H1B Sponsorship

Bayer has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data from the US Department of Labor.)
Trends of Total Sponsorships
2025 (62)
2024 (71)
2023 (76)
2022 (141)
2021 (138)
2020 (117)

Funding

Current Stage: Public Company
Total Funding: $9.34B
Key Investors: Bank of America, Bill & Melinda Gates Foundation, Temasek Holdings
2025-09-26 · Post-IPO Debt · $331.5M
2024-12-06 · Post-IPO Debt · $5.29B
2022-11-08 · Grant · $12M

Leadership Team

Wolfgang Nickl · Member of the Board of Management of Bayer AG and CFO
Abi Abitorabi · VP, CGT Platform Implementation Lead
Company data provided by Crunchbase