Responsibilities:
- Own the full ML lifecycle: model design, training, evaluation, deployment
- Design production-ready ML pipelines with CI/CD, testing, monitoring, and drift detection
- Fine-tune LLMs and implement retrieval-augmented generation (RAG) pipelines
- Build agentic workflows for reasoning, planning, and decision-making
- Develop both real-time and batch inference systems using Docker, Kubernetes, and Spark
- Leverage state-of-the-art architectures: transformers, diffusion models, RLHF, and multimodal pipelines
- Collaborate with product and engineering teams to integrate AI models into business applications
- Mentor junior team members and promote MLOps, scalable architecture, and responsible AI best practices
Ideal Candidate:
- 5+ years of experience in designing, deploying, and scaling ML/DL systems in production
- Proficient in Python and deep learning frameworks such as PyTorch, TensorFlow, or JAX
- Experience with LLM fine-tuning, LoRA/QLoRA, vector search (Weaviate/PGVector), and RAG pipelines
- Familiarity with agent-based development (e.g., ReAct agents, function-calling, orchestration)
- Solid understanding of MLOps: Docker, Kubernetes, Spark, model registries, and deployment workflows
- Strong software engineering background with experience in testing, version control, and APIs
- Proven ability to balance innovation with scalable deployment
- B.S./M.S./Ph.D. in Computer Science, Data Science, or a related field
Bonus: Open-source contributions, GenAI research, or applied systems at scale
Didn’t find the job appropriate? Report this Job