Lead Data Scientist - Generative AI (5-8 yrs)
Positive Synergy
posted 1 week ago
Job Description :
Seeking a Lead Data Scientist (Generative AI) to spearhead the development of advanced AI-powered classification and matching systems on Databricks. You will contribute to flagship programs like the Diageo AI POC by building RAG pipelines, deploying agentic AI workflows, and scaling LLM-based solutions for high-precision entity matching and MDM modernization.
Key Responsibilities :
- Design and implement end-to-end AI pipelines for product classification, fuzzy matching, and deduplication using LLMs, RAG, and Databricks-native workflows.
- Develop scalable, reproducible AI solutions within Databricks notebooks and job clusters, leveraging Delta Lake, MLflow, and Unity Catalog.
- Engineer Retrieval-Augmented Generation (RAG) workflows using vector search and integrate with Python-based matching logic.
- Build agent-based automation pipelines (rule-driven + GenAI agents) for anomaly detection, compliance validation, and harmonization logic.
- Implement explainability, audit trails, and governance-first AI workflows aligned with enterprise-grade MDM needs.
- Collaborate with data engineers, BI teams, and product owners to integrate GenAI outputs into downstream systems.
- Contribute to modular system design and documentation for long-term scalability and maintainability.
Qualifications :
- Bachelor's/Master's degree in Computer Science, Artificial Intelligence, or a related field.
- 5-7 years of overall Data Science experience with 2+ years in Generative AI / LLM-based applications.
- Deep experience with the Databricks ecosystem: Delta Lake, MLflow, DBFS, Databricks Jobs & Workflows.
- Strong Python and PySpark skills, with the ability to build scalable data pipelines and AI workflows in Databricks.
- Experience with LLMs (e.g., OpenAI, LLaMA, Mistral) and frameworks like LangChain or LlamaIndex.
- Working knowledge of vector databases (e.g., FAISS, Chroma) and prompt engineering for classification/retrieval.
- Exposure to MDM platforms (e.g., Stibo STEP) and familiarity with data harmonization challenges.
- Experience with explainability frameworks (e.g., SHAP, LIME) and AI audit tooling.
Preferred Skills :
- Knowledge of agentic AI architectures and multi-agent orchestration.
- Familiarity with Azure Data Hub and enterprise data ingestion frameworks.
- Understanding of data governance, lineage, and regulatory compliance in AI systems.
Functional Areas: Other