Data Scientist - NLP jobs in United States

Analytica · 1 month ago

Data Scientist - NLP

Analytica is a leading consulting and information technology solutions provider to public sector organizations. They are seeking a Data Scientist to support long-term federal client engagement projects, applying statistical programming and modeling techniques to analyze public sector problems.

Analytics, Big Data, Business Intelligence, Consulting
No H1B, Security Clearance Required, U.S. Citizen Only

Responsibilities

• Pre-processing - Demonstrate the skills and experience to collect, clean, and prepare data sets for input into a computational model using Python. Strong candidates will explain the methods they have applied using common pre-processing functions such as stop word removal, stemming, lemmatization, and tokenization.
• Feature Engineering and Attribute Evaluation - Candidates must demonstrate experience with NLP feature engineering methods such as TF-IDF, word2vec, GloVe, and FastText; with identifying the key determinants for modeling that exist in the business process and within existing data sets; and with selecting evaluation protocols (model techniques).
• Modeling - Candidates will have practiced skills and experience selecting classification modeling techniques to fit the business problem. Examples include supervised and unsupervised machine learning (ML), regression, neural networks and deep learning, and natural language processing.
• Validation - Strong candidates will describe their experience investigating, reporting, and justifying model results.
• Visualization - Experience presenting the results of modeling activities, depicting the insights realized, and explaining the relevance of those results to the organization’s business challenges.

Qualifications

Python, NLP feature engineering, Machine learning, SAS, R, Transformer architecture, Git/GitHub, Open source NLP packages, AWS cloud environment, Soft skills

Required

Master's degree required, PhD preferred, in Statistics, Mathematics, Computer Science, or a similar field
Extensive experience using SAS, R, or Python to support NLP use cases such as Document Summarization, Named Entity Recognition, Sentiment Analysis, and/or Topic Modeling
At least four years of experience developing scalable, production-ready NLP solutions using scikit-learn, Keras, TensorFlow, PyTorch, Spark NLP
Experience using Git/GitHub to version control source code
Experience leveraging transformer architecture to develop NLP models
Experience with open source NLP packages such as Gensim, spaCy, or NLTK
Experience with BERT, GPT-J, RoBERTa, T5 or other transformers
Must be a US citizen
Must be able to obtain and maintain a Public Trust security clearance

Preferred

Experience with GenAI and Prompt Engineering is a plus
Experience with Databricks and MLflow is a plus
Experience with machine translation and transcription of foreign language documents using Microsoft Azure translation services is a plus
Experience working in an AWS cloud environment and with related AWS services such as Bedrock and Textract
Experience coordinating and maintaining user stories

Benefits

Competitive compensation
Opportunities for bonuses
Employer paid health care
Training and development funds
401k match

Company

Analytica

Analytica has built a team of exceptionally dedicated analysts, associates and public sector subject matter experts.

Funding

Current Stage
Growth Stage

Leadership Team

Mariano Lopez
Founder, Managing Member & CEO
Regina Dull
Chief Financial Officer
Company data provided by Crunchbase