Company

Qualitest · 20 hours ago

#16711 - Sr. Data Scientist in Test

USA

Full-time

Remote

Senior Level

$145K/yr - $155K/yr

Consulting · Enterprise Software

Actively Hiring

Responsibilities

Develop test strategies for evaluating AI/ML models, ensuring outputs align with business requirements and user expectations.

Design and execute evaluation pipelines using frameworks like DeepEval for generative AI model testing.

Automate the evaluation of model accuracy, fluency, factuality, safety, and bias.

Create adversarial test cases to validate AI behavior under edge scenarios, such as data poisoning, jailbreaks, and prompt injections.

Assess and validate the retrieval-augmented generation (RAG) system, including retrieval accuracy and latency.

Build automated testing for conversational AI chatbots, covering dialog coherence, context retention, and response diversity.

Use metrics like BLEU, ROUGE, perplexity, and embedding similarity to evaluate generative model output.

Collaborate with data scientists, ML engineers, and developers to address bugs and performance issues.

Implement tools for test data generation that simulate real-world user inputs and edge cases.

Report model performance through dashboards and metrics-driven reporting frameworks.

Qualifications

AI/ML testing, Generative AI, NLP models, Python, DeepEval, LangSmith, LangTest, Hugging Face tools, AI testing metrics, Adversarial testing, MLflow, Weights & Biases, LLM architectures, Prompt engineering, RAG concepts, Jupyter Notebooks, Pandas, NumPy, Matplotlib

Required

Strong background in AI/ML testing, with hands-on experience in generative AI or NLP models.

Proficiency in Python and testing libraries like DeepEval, LangSmith, LangTest, or Hugging Face evaluation tools.

Knowledge of AI-specific testing metrics and adversarial testing methodologies.

Experience with model evaluation frameworks like MLflow, Weights & Biases, or custom pipelines.

Familiarity with LLM architectures (e.g., GPT, BERT) and concepts like prompt engineering and RAG.

Strong analytical mindset and problem-solving skills for identifying AI model failures.

Experience with tools like Jupyter Notebooks, Pandas, NumPy, and visualization libraries (e.g., Matplotlib).

Benefits

401(k) plan in which Qualitest matches your contributions, accelerating your savings.

Enrollment in one of our competitive healthcare benefit plans.

Qualitest will match contributions to your HSA if you choose to participate.

Qualitest Tech Academy: 3000+ training courses, mentorship programs, technical tribes, sponsored certifications, leadership programs, and much more.

Corporate Wellness Program: we pay for your gym membership and give you opportunities to earn additional vacation time for attending the gym.

Bonuses via our Client Referral and Employee Referral Programs.

Access to Qualitest Employee Perks for discounts on anything from travel to electronics.

Company

Qualitest

Glassdoor

3.5

Qualitest is the world’s leading managed services provider of AI-led quality engineering solutions.

Founded in 1997

London, England, GBR

5001-10000 employees

http://www.qualitestgroup.com/

Funding

Current Stage

Late Stage

Total Funding

unknown

2019-07-10: Acquired

Leadership Team

Chris Wilmot

CFO

Harvey Feuer

CFO

Company data provided by Crunchbase
