SES AI Corp · 3 days ago
Senior Data Scientist, Natural Language Processing and Data Annotation Expert
Responsibilities
Lead the design and implementation of advanced NLP techniques and methodologies to extract intricate scientific concepts and reasoning from vast textual sources.
Lead the design and implementation of advanced NLP techniques and methodologies to extract chemical information including SMILES notations, properties, and interleaved text for multimodal language model training and chemical property predictions.
Develop and refine language-based data labeling pipelines tailored for scientific discovery, ensuring high-quality annotated datasets for training large language models and AI agents.
Collaborate closely with cross-functional teams to identify key research areas and define labeling strategies to capture nuanced scientific insights effectively.
Spearhead the development of innovative approaches for data annotation, incorporating state-of-the-art NLP algorithms to enhance accuracy and efficiency.
Provide expert guidance on data annotation best practices, ensuring consistency and quality across labeled datasets.
Conduct thorough analyses to evaluate the effectiveness of labeling pipelines and make continuous improvements to optimize performance.
Stay abreast of the latest advancements in NLP and data annotation techniques, integrating emerging methodologies to enhance our data labeling capabilities.
Qualifications
Required
Advanced degree (master's or PhD preferred) in computer science, data science, or a related field.
Extensive hands-on experience in natural language processing, with a strong emphasis on designing and implementing language-based data labeling pipelines.
Proven track record of leveraging NLP techniques to extract complex scientific concepts and reasoning from textual sources.
Familiarity with deep learning models and architectures for NLP tasks, such as transformer-based models (e.g., BERT, GPT).
Proficiency with Git and Linux-based systems, and in programming languages such as Python, R, or Java, along with expertise in relevant libraries and frameworks (e.g., PyTorch, NLTK, TensorFlow).
Exceptional problem-solving skills with meticulous attention to detail, coupled with a passion for advancing scientific discovery through data science.
Excellent communication and collaboration skills, with the ability to effectively convey complex technical concepts to diverse stakeholders.
Preferred
Experience with AI agent research, using knowledge-based Retrieval-Augmented Generation (RAG) to improve the accuracy of language generation.
Experience with cloud computing platforms and services (e.g., AWS, Azure, Google Cloud) for scalable data processing and storage.
Knowledge of data visualization techniques and tools for exploring and presenting scientific insights.
Company
SES AI Corp
SES is developing a Mine-to-Man AI software system for Li-Metal batteries.
Funding
Current Stage: Public Company
Total Funding: $600.11M
Key Investors: Honda Motor, Hyundai Motor Company, General Motors
2022-02-04 Post-IPO Equity · $275M
2022-02-04 IPO · NYSE: SES
2021-07-05 Corporate Round · $100M
Company data provided by Crunchbase