- We are seeking a Data Scientist with 3-4 years of experience to play a critical role in enhancing our Large Language Model (LLM) development lifecycle.

- You will be responsible for designing and building sophisticated, LLM-assisted Quality Assurance (QA) solutions.

- The primary goal is to analyze model failures, identify data gaps, and create real-time tools that guide our human data generators to produce high-impact training data.

- This role is highly analytical and technical, sitting at the critical intersection of model evaluation, data analysis, and human-in-the-loop process improvement.

Key Responsibilities

- Develop LLM-Assisted QA Solutions: Design, build, and deploy intelligent tools that assist human data generators in real time, verifying that new data aligns with identified model needs.

- Analyze Model Failures: Conduct deep-dive analyses into model failure modes to identify and categorize new loss patterns and emerging weaknesses.

- Run Studies: Systematically design and execute experiments to understand model behavior and pinpoint the root causes of errors.

- Define Data Requirements: Translate your analysis of model failures into specific, actionable data requirements that our human data generation teams can target to improve the model.

- Create Quality Rubrics: Develop, document, and maintain comprehensive quality control rubrics and evaluation metrics. These rubrics must be adaptable across a wide variety of use cases, domains, and industry sectors.

- Verify Data Generation: Build processes to validate that human-generated data effectively targets both existing and newly identified loss patterns.

- Collaborate Cross-Functionally: Work closely with ML Engineers, AI Researchers, and Data Operations teams to ensure your QA solutions and insights are seamlessly integrated into the model training and deployment pipeline.

Required Qualifications

- Experience: 3-4 years of professional experience in Data Science, Machine Learning Engineering, or a related role with a focus on NLP.

- Education: Bachelor's or Master's degree in Computer Science, Data Science, Statistics, Computational Linguistics, or a related quantitative field.

- LLM/NLP Expertise: Strong hands-on experience with Large Language Models (LLMs), NLP techniques, and the modern transformer ecosystem (transformers library, GPT-family, BERT, T5).

- Technical Skills: High proficiency in Python and standard data science/ML libraries (Pandas, NumPy, Scikit-learn, PyTorch/TensorFlow).

- Analytical Mindset: Proven ability to perform deep, rigorous analysis on complex and often unstructured data (model outputs, failure logs) to derive actionable insights.

- Strong Communication: Excellent ability to create clear, concise documentation (especially technical rubrics) and communicate complex findings to both technical and non-technical stakeholders.

Preferred Qualifications (Nice-to-Have)

- Direct experience building human-in-the-loop (HITL) systems or data annotation/QA tools.

- Experience in experimental design and A/B testing within an ML context.

- Familiarity with data-centric AI principles and practices.

- Background in MLOps (experiment tracking, model versioning, deployment).

- Experience working in a fast-paced R&D or product-driven environment.