IBM

3.9

based on 23.8k Reviews

Proud winner of ABECA 2025 - AmbitionBox Employee Choice Awards

Top Rated Mega Company - 2025

Top Rated Company for Women - 2025

Top Rated IT/ITES Company - 2025

IBM Data Scientist Interview Questions and Answers

Updated 24 Jul 2025

13 Interview questions

A Data Scientist was asked 3mo ago

Q. What advanced SQL queries were used in your project?

Ans.

Utilized advanced SQL queries for data analysis, aggregation, and reporting in various projects.

Used Common Table Expressions (CTEs) for recursive queries to analyze hierarchical data.
Implemented window functions like ROW_NUMBER() and RANK() for ranking patients based on their treatment outcomes.
Executed complex JOIN operations to merge data from multiple tables, enhancing data insights.
Applied subqueries for filt...
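
As a minimal sketch of the recursive-CTE point above, the following uses Python's built-in `sqlite3` module on an in-memory database. The `staff` table, its columns, and the names in it are hypothetical and are not taken from the interviewee's project.

```python
import sqlite3

# Hypothetical org chart: each row points at its manager (NULL for the root).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
    INSERT INTO staff VALUES
        (1, 'Asha', NULL),
        (2, 'Ben', 1),
        (3, 'Chen', 1),
        (4, 'Dev', 2);
""")

# Recursive CTE: start at the root, then repeatedly join children
# onto the rows found so far, tracking the depth of each row.
rows = conn.execute("""
    WITH RECURSIVE chain(id, name, depth) AS (
        SELECT id, name, 0 FROM staff WHERE manager_id IS NULL
        UNION ALL
        SELECT s.id, s.name, c.depth + 1
        FROM staff AS s JOIN chain AS c ON s.manager_id = c.id
    )
    SELECT name, depth FROM chain ORDER BY depth, name
""").fetchall()

# Print the hierarchy with indentation proportional to depth.
for name, depth in rows:
    print("  " * depth + name)
```

The anchor `SELECT` picks the root row and the recursive `SELECT` walks one level further down on each iteration, which is why this pattern suits hierarchical data.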

A Data Scientist was asked 3mo ago

Q. Write SQL queries for the following scenarios.

Ans.

SQL queries are essential for data manipulation and retrieval in databases, enabling complex data analysis and reporting.

SELECT Statement: Used to retrieve data from a database. Example: SELECT * FROM employees WHERE department = 'Sales';
JOIN Operations: Combine rows from two or more tables based on a related column. Example: SELECT orders.id, customers.name FROM orders JOIN customers ON orders.customer_id = custo...
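
A small runnable sketch of a JOIN combined with aggregation, again through Python's `sqlite3` module. The schema and the sample rows are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Meera'), (2, 'Omar');
    INSERT INTO orders VALUES (10, 1, 250.0), (11, 1, 100.0), (12, 2, 75.0);
""")

# Inner join plus aggregation: total order amount per customer.
totals = conn.execute("""
    SELECT customers.name, SUM(orders.amount) AS total
    FROM orders
    JOIN customers ON orders.customer_id = customers.id
    GROUP BY customers.name
    ORDER BY total DESC
""").fetchall()

print(totals)
```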

A Data Scientist was asked 4mo ago

Q. Describe the RAG approach.

Ans.

RAG (Retrieval-Augmented Generation) combines retrieval of relevant data with generative models for enhanced information synthesis.

RAG uses a two-step process: retrieval of relevant documents followed by generation of responses based on those documents.
It leverages large language models (LLMs) to generate contextually relevant answers, improving accuracy and relevance.
For example, in a customer support chatbot, RA...
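
The two-step process can be sketched in a few lines. This is a toy illustration, not a production design: bag-of-words cosine similarity stands in for a learned embedding model, the three documents are invented, and the generation step is represented only by the prompt that would be passed to an LLM.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge base; a real system would index far more text.
DOCS = [
    "Refunds are processed within five business days.",
    "Password resets are done from the account settings page.",
    "Shipping to Europe takes seven to ten days.",
]

def bow(text):
    """Lowercased bag-of-words counts, standing in for a learned embedding."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question, docs, k=1):
    """Step 1: rank documents by similarity to the question and keep the top k."""
    q = bow(question)
    return sorted(docs, key=lambda d: cosine(q, bow(d)), reverse=True)[:k]

def build_prompt(question, docs):
    """Step 2: place the retrieved text in the prompt a generator would receive."""
    context = "\n".join(retrieve(question, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

print(build_prompt("How long do refunds take?", DOCS))
```

Keeping retrieval and prompt construction as separate functions makes it possible to evaluate retrieval quality independently of the generator.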

A Data Scientist was asked 4mo ago

Q. Rate your Python skills on a scale of 0 to 5.

Ans.

I would rate myself a 4 in Python, with strong skills in data manipulation, analysis, and machine learning applications.

Data Manipulation: Proficient in using libraries like Pandas for data cleaning and transformation, e.g., merging datasets and handling missing values.
Data Visualization: Experienced with Matplotlib and Seaborn for creating insightful visualizations, such as scatter plots and heatmaps.
Machine Lear...

A Data Scientist was asked 10mo ago

Q. Write a Python code snippet.

Ans.

Python is a programming language widely used for data analysis, machine learning, and scientific computing.

Python code is written in a text editor or an integrated development environment (IDE)
Python code is executed using a Python interpreter
Python code can be used for data manipulation, visualization, and modeling

A Data Scientist was asked 10mo ago

Q. What is Python?

Ans.

Python is a high-level programming language known for its simplicity and readability.

Python is widely used for web development, data analysis, artificial intelligence, and scientific computing.
It emphasizes code readability and uses indentation for block delimiters.
Python has a large standard library and a vibrant community of developers.
Example: print('Hello, World!')
Example: import pandas as pd

A Data Scientist was asked 10mo ago

Q. What are code problems?

Ans.

Code problems refer to issues or errors in the code that need to be identified and fixed.

Code problems can include syntax errors, logical errors, or performance issues.
Examples of code problems include missing semicolons (in languages that require them), incorrect variable assignments, or inefficient algorithms.
Identifying and resolving code problems is a key skill for data scientists to ensure accurate and efficient data analysis.

A Data Scientist was asked 12mo ago

Q. Why did you choose this model over other models for training?

Ans.

Choosing the right model depends on data characteristics, problem complexity, and performance metrics.

Model performance: Some models may outperform others based on metrics like accuracy, precision, or recall. For example, Random Forest may perform better than Logistic Regression on complex datasets.
Data characteristics: The nature of the data (e.g., linear vs. non-linear relationships) influences model choice. For...

A Data Scientist was asked

Q. How do you perform unit testing?

Ans.

Unit testing is a process of testing individual units of code to ensure they function correctly.

Write test cases for each unit of code
Test inputs, outputs, and edge cases
Use testing frameworks like JUnit or pytest
Automate tests to run regularly
Ensure tests are independent, isolated, and repeatable
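
A minimal sketch of these points using `unittest` from the Python standard library (the answer above names JUnit and pytest; `unittest` is used here only so the example has no third-party dependency). The `normalize` function is a hypothetical unit under test.

```python
import unittest

def normalize(values):
    """Scale numbers to the 0-1 range; a constant list maps to all zeros."""
    if not values:
        raise ValueError("values must not be empty")
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

class NormalizeTests(unittest.TestCase):
    def test_typical_input(self):
        self.assertEqual(normalize([0, 5, 10]), [0.0, 0.5, 1.0])

    def test_constant_input_is_edge_case(self):
        self.assertEqual(normalize([3, 3]), [0.0, 0.0])

    def test_empty_input_raises(self):
        with self.assertRaises(ValueError):
            normalize([])

# Run the suite programmatically so the script does not exit the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(NormalizeTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Each test builds its own input and shares no state with the others, which keeps the tests independent and repeatable.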

A Data Scientist was asked

Q. How can you build a Question Answering (Q&A) system using Large Language Models (LLMs)?

Ans.

A Q&A system built on an LLM uses the language model to retrieve relevant information and answer questions.

Preprocess the input question and convert it into a format suitable for the LLM.
Fine-tune the LLM on a dataset of question-answer pairs.
Use the fine-tuned model to generate answers for new questions.
Evaluate the performance of the QnA system using metrics like precision, recall, and F1 score.
...
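
The evaluation step can be illustrated with a token-overlap score. This is one common way to apply precision, recall, and F1 to generated answers, shown as a sketch under the assumption that answers are compared as whitespace-separated, lowercased tokens; the example strings are invented.

```python
from collections import Counter

def token_f1(predicted, reference):
    """Precision, recall and F1 over the tokens shared by two answers."""
    pred = predicted.lower().split()
    ref = reference.lower().split()
    # Multiset intersection counts each shared token at most as often as it appears in both.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

p, r, f1 = token_f1("the capital is Paris", "Paris")
print(p, r, f1)
```

Here the prediction contains the whole reference, so recall is perfect, while the extra words lower precision and therefore F1.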

IBM Data Scientist Interview Experiences

13 interviews found

Data Scientist Interview Questions & Answers

Anonymous

posted on 24 Jul 2025

Interview experience

Excellent

Difficulty level

Hard

Process Duration

Less than 2 weeks

Result

Selected

I appeared for an interview in Jun 2025, where I was asked the following questions.

Q1. What is RAG?

Ans.

RAG (Retrieval-Augmented Generation) combines retrieval of relevant documents with a generative model, so that answers are grounded in the retrieved text.

RAG uses a two-step process: retrieval of relevant documents followed by generation of a response based on those documents.
It leverages large language models (LLMs) to generate contextually relevant answers, improving accuracy and relevance.

Answered by AI

Q2. How can you create a list that is reversed, with the first letter and the third letter of each item capitalized?

Ans.

Create a reversed list with specific letters capitalized using Python.

Use a list comprehension to iterate through the original list.
Reverse the list using slicing: list[::-1].
Capitalize the first and third letters of each string using string indexing.
Example: For 'apple', it becomes 'ApPle' after processing.
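
The steps above can be sketched as follows. The sketch assumes that "reversed" refers to the order of the list rather than to the characters of each string, and that items shorter than three characters simply have fewer letters capitalized; the sample list is invented.

```python
def cap_first_and_third(word):
    """Uppercase the characters at index 0 and 2, leaving the rest unchanged."""
    return "".join(ch.upper() if i in (0, 2) else ch for i, ch in enumerate(word))

def reversed_capitalized(items):
    """Reverse the list order, then capitalize within each item."""
    return [cap_first_and_third(w) for w in items[::-1]]

print(reversed_capitalized(["apple", "kiwi", "fig", "ox"]))
```

Using `enumerate` instead of direct indexing such as `word[2]` avoids an `IndexError` on short strings like `"ox"`.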

Answered by AI

Q3. What are the various embedding techniques that you used in your project?

Ans.

Embedding techniques transform data into numerical vectors for machine learning, enhancing model performance and interpretability.

Word2Vec: Used for natural language processing to create word embeddings based on context.
GloVe: Global Vectors for Word Representation, capturing global word co-occurrence statistics.
FastText: Extends Word2Vec by considering subword information, useful for morphologically rich languages.
BER...

Answered by AI

Q4. What are the advantages of using transformer models compared to traditional machine learning models?

Ans.

Transformers excel in handling sequential data, capturing long-range dependencies, and outperforming traditional models in various tasks.

Self-attention mechanism allows transformers to weigh the importance of different words in a sentence, improving context understanding.
Transformers can process entire sequences simultaneously, unlike traditional models that often rely on sequential processing, enhancing efficiency.
The...
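
To make the self-attention idea concrete, here is a dependency-free sketch of scaled dot-product attention. It is deliberately simplified: queries, keys, and values are all the raw input vectors, whereas a real transformer first applies learned projection matrices. The three two-dimensional "token" vectors are arbitrary.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Scaled dot-product self-attention with queries = keys = values = x."""
    d = len(x[0])
    out, weights = [], []
    for q in x:
        # Similarity of this token to every token in the sequence.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in x]
        w = softmax(scores)
        weights.append(w)
        # Output is the attention-weighted average of all value vectors.
        out.append([sum(wj * v[i] for wj, v in zip(w, x)) for i in range(d)])
    return out, weights

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, weights = self_attention(tokens)
for row in weights:
    print([round(w, 3) for w in row])
```

Every token's weights are computed against the whole sequence in one pass, which is what lets the model relate distant positions without stepping through the tokens in between.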

Answered by AI

Q5. What were the most effective strategies for addressing hallucinations?

Ans.

Effective strategies to mitigate hallucinations include data validation, model fine-tuning, and user feedback integration.

Implement data validation techniques to ensure input data quality, e.g., schema and consistency checks on source data.
Fine-tune models with domain-specific datasets to improve accuracy, such as using medical literature for healthcare applications.
Incorporate user feedback loops to continuously improve model outputs, e.g....

Answered by AI

Q6. What are PEFT techniques, and which other techniques can be used?

Ans.

PEFT (Parameter-Efficient Fine-Tuning) techniques adapt a pre-trained model by updating only a small fraction of its parameters. Other techniques include transfer learning and data augmentation.

Transfer Learning: Utilizing pre-trained models like BERT for NLP tasks.
Data Augmentation: Techniques like rotation and flipping in image datasets.
Feature Engineering: Creating new features from existing data to improve model accuracy.
Ensemble Methods: Combining multiple models to enhance predic...
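
As a rough illustration of why PEFT is parameter-efficient, the arithmetic below compares a full weight update with a low-rank update of the kind used by LoRA, one well-known PEFT method. LoRA is not named in the answer above, and the layer size and rank are hypothetical.

```python
def full_update_params(d_out, d_in):
    """Parameters touched when the whole weight matrix is fine-tuned."""
    return d_out * d_in

def lora_update_params(d_out, d_in, rank):
    """Parameters in a rank-r update B @ A, with B of shape (d_out, r) and A of shape (r, d_in)."""
    return rank * (d_out + d_in)

d_out, d_in, rank = 4096, 4096, 8  # hypothetical layer size and rank
full = full_update_params(d_out, d_in)
lora = lora_update_params(d_out, d_in, rank)
print(f"full: {full:,}  low-rank: {lora:,}  ratio: {lora / full:.4%}")
```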

Answered by AI

Data Scientist Interview Questions & Answers

Anonymous

posted on 28 Sep 2024

Interview experience

Average

Difficulty level

Moderate

Process Duration

Less than 2 weeks

Result

Selected

I was approached by the company and was interviewed in Aug 2024. There were 2 interview rounds.

Round 1 - Coding Test

*****, arjumpudi satyanarayana

Round 2 - Technical

(5 Questions)

Q1. What is the Python language?

Ans.

Python is a high-level programming language known for its simplicity and readability.

Python is widely used for web development, data analysis, artificial intelligence, and scientific computing.
It emphasizes code readability and uses indentation for block delimiters.
Python has a large standard library and a vibrant community of developers.
Example: print('Hello, World!')
Example: import pandas as pd

Answered by AI

Q2. What are code problems?

Ans.

Code problems refer to issues or errors in the code that need to be identified and fixed.

Code problems can include syntax errors, logical errors, or performance issues.
Examples of code problems include missing semicolons (in languages that require them), incorrect variable assignments, or inefficient algorithms.
Identifying and resolving code problems is a key skill for data scientists to ensure accurate and efficient data analysis.

Answered by AI

Q3. What is Python code?

Ans.

Python is a programming language widely used for data analysis, machine learning, and scientific computing.

Python code is written in a text editor or an integrated development environment (IDE)
Python code is executed using a Python interpreter
Python code can be used for data manipulation, visualization, and modeling

Answered by AI

Q4. What is the project?

Q5. What is the lnderssip

Interview Preparation Tips

Topics to prepare for IBM Data Scientist interview:

Python
Machine Learning

Interview preparation tips for other job seekers - No

Data Scientist Interview Questions & Answers

Anonymous

posted on 25 Jul 2024

Interview experience

Good

Difficulty level

Process Duration

Result

Round 1 - Technical

(3 Questions)

Q1. Machine learning basics: activation functions, linear regression, CNNs, and other fundamentals.

Q2. Project questions, plus three basic questions about the SDLC.

Q3. Why was another model not used for training?

Ans.

Choosing the right model depends on data characteristics, problem complexity, and performance metrics.

Model performance: Some models may outperform others based on metrics like accuracy, precision, or recall. For example, Random Forest may perform better than Logistic Regression on complex datasets.
Data characteristics: The nature of the data (e.g., linear vs. non-linear relationships) influences model choice. For inst...

Answered by AI

Interview Preparation Tips

Interview preparation tips for other job seekers - Prepare machine learning basics and project details well.

Data Scientist Interview Questions & Answers

carriers 2024

posted on 10 Dec 2024

Interview experience

Average

Difficulty level

Process Duration

Result

Round 1 - Coding Test

DSA, ML, SQL, stats, DL

Data Scientist Interview Questions & Answers

Anonymous

posted on 29 Mar 2025

Interview experience

Average

Difficulty level

Process Duration

Less than 2 weeks

Result

I appeared for an interview in Mar 2025, where I was asked the following questions.

Q1. Describe the RAG approach.

Ans.

RAG (Retrieval-Augmented Generation) combines retrieval of relevant data with generative models for enhanced information synthesis.

RAG uses a two-step process: retrieval of relevant documents followed by generation of responses based on those documents.
It leverages large language models (LLMs) to generate contextually relevant answers, improving accuracy and relevance.
For example, in a customer support chatbot, RAG can...

Answered by AI

Q2. How do you design a conversational flow?

Ans.

Designing conversational flow involves structuring dialogue for clarity, engagement, and user satisfaction.

Define user goals: Understand what users want to achieve, e.g., booking an appointment.
Map out conversation paths: Create flowcharts to visualize possible dialogues.
Use natural language: Ensure the bot understands and responds in a human-like manner.
Incorporate error handling: Plan for misunderstandings and provid...
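
One way to sketch these ideas is a small state machine. The appointment-booking states, prompts, and replies below are all hypothetical, and empty input stands in for any message the bot fails to understand.

```python
# Hypothetical appointment-booking flow: each state has a prompt and a successor.
FLOW = {
    "ask_date": {"prompt": "Which date would you like?", "next": "ask_time"},
    "ask_time": {"prompt": "What time works for you?", "next": "confirm"},
    "confirm": {"prompt": "Shall I book it? (yes/no)", "next": "done"},
}

def step(state, user_text):
    """Return (next_state, reply) for one user turn."""
    if state == "done":
        return "done", "Your appointment is booked."
    if not user_text.strip():
        # Error handling: stay in the same state and ask again.
        return state, "Sorry, I didn't catch that. " + FLOW[state]["prompt"]
    if state == "confirm" and user_text.strip().lower() != "yes":
        # Anything other than "yes" restarts the flow.
        return "ask_date", "No problem, let's start over. " + FLOW["ask_date"]["prompt"]
    nxt = FLOW[state]["next"]
    reply = FLOW[nxt]["prompt"] if nxt != "done" else "Your appointment is booked."
    return nxt, reply

state = "ask_date"
transcript = []
for text in ["12 March", "", "10am", "yes"]:
    state, reply = step(state, text)
    transcript.append((state, reply))
    print(f"user: {text!r} -> bot: {reply}")
```

Keeping the conversation paths in a plain dictionary mirrors the flowchart described above and makes each path easy to review and test.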

Answered by AI

Data Scientist Interview Questions & Answers

Anonymous

posted on 24 Dec 2024

Interview experience

Good

Difficulty level

Moderate

Process Duration

Less than 2 weeks

Result

No response

I applied via Company Website and was interviewed in Nov 2024. There was 1 interview round.

Round 1 - Aptitude Test

It's really easy if you know Python well.

Data Scientist Interview Questions & Answers

Anonymous

posted on 25 Feb 2024

Interview experience

Good

Difficulty level

Process Duration

Result

Round 1 - Coding Test

60-minute HackerRank test with one medium-difficulty MySQL question and one medium-to-hard Python question.

Round 2 - One-on-one

(1 Question)

Q1. Technical questions covering each topic: stats, Python, ML, DL, NLP, and the project.

Round 3 - One-on-one

(1 Question)

Q1. In-depth project discussion, a few case scenarios, and a stats question.

Data Scientist Interview Questions & Answers

Anonymous

posted on 30 Mar 2025

Interview experience

Good

Difficulty level

Hard

Process Duration

2-4 weeks

Result

Selected

I appeared for an interview before Mar 2024, where I was asked the following questions.

Q1. What advanced SQL queries were used in your project?

Ans.

Utilized advanced SQL queries for data analysis, aggregation, and reporting in various projects.

Used Common Table Expressions (CTEs) for recursive queries to analyze hierarchical data.
Implemented window functions like ROW_NUMBER() and RANK() for ranking patients based on their treatment outcomes.
Executed complex JOIN operations to merge data from multiple tables, enhancing data insights.
Applied subqueries for filtering...

Answered by AI

Q2. Could you please explain your project in detail?

Q3. Window functions in SQL

Ans.

Window functions in SQL allow for performing calculations across a set of table rows related to the current row.

Window functions operate on a set of rows defined by an OVER() clause.
They do not change the number of rows returned by a query.
Common window functions include ROW_NUMBER(), RANK(), and SUM().
Example: SELECT name, salary, RANK() OVER (ORDER BY salary DESC) AS salary_rank FROM employees;
Window functions can be partit...
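
A runnable sketch of a partitioned window function, using Python's `sqlite3` module. It assumes the bundled SQLite is version 3.25 or newer, which is when window functions were added; the `employees` rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Ana', 'Sales', 50000),
        ('Raj', 'Sales', 65000),
        ('Lee', 'Tech', 90000),
        ('Mia', 'Tech', 90000),
        ('Sam', 'Tech', 70000);
""")

# RANK() restarts in each department; tied salaries share a rank
# and the following rank is skipped.
ranked = conn.execute("""
    SELECT name, department, salary,
           RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS salary_rank
    FROM employees
    ORDER BY department, salary_rank, name
""").fetchall()

for row in ranked:
    print(row)
```

The query returns one row per employee, so the window function adds a column without collapsing rows the way `GROUP BY` would.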

Answered by AI

Q4. Probability related questions

Q5. Write some SQL queries on given situations

Ans.

SQL queries are essential for data manipulation and retrieval in databases, enabling complex data analysis and reporting.

SELECT Statement: Used to retrieve data from a database. Example: SELECT * FROM employees WHERE department = 'Sales';
JOIN Operations: Combine rows from two or more tables based on a related column. Example: SELECT orders.id, customers.name FROM orders JOIN customers ON orders.customer_id = customers....

Answered by AI

Data Scientist Interview Questions & Answers

Rohit Mishra

posted on 13 May 2024

Interview experience

Average

Difficulty level

Moderate

Process Duration

Less than 2 weeks

Result

Selected

I applied via Company Website and was interviewed in Nov 2023. There was 1 interview round.

Round 1 - Technical

(1 Question)

Q1. Can you discuss one of your projects in detail, and explain why you chose those specific models to start with?

Data Scientist Interview Questions & Answers

Anonymous

posted on 5 Feb 2024

Interview experience

Good

Difficulty level

Moderate

Process Duration

2-4 weeks

Result

Selected

I applied via Job Portal and was interviewed before Feb 2023. There was 1 interview round.

Round 1 - Technical

(3 Questions)

Q1. What are hyperparameters in random forest?

Ans.

Hyperparameters in random forest are parameters that are set before the learning process begins.

Hyperparameters control the behavior of the random forest algorithm.
They are set by the data scientist and are not learned from the data.
Examples of hyperparameters in random forest include the number of trees, the maximum depth of trees, and the number of features considered at each split.
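
To show what "set before the learning process begins" looks like in practice, the sketch below enumerates candidate configurations for the three hyperparameters just listed. The key names follow scikit-learn's `RandomForestClassifier` arguments, the candidate values are hypothetical, and no model is trained here.

```python
from itertools import product

# Candidate values chosen up front by the practitioner (hypothetical numbers).
grid = {
    "n_estimators": [100, 300],      # number of trees
    "max_depth": [None, 10],         # maximum depth of each tree
    "max_features": ["sqrt", 0.5],   # features considered at each split
}

# Every combination is one configuration that would be fixed before training starts.
configs = [dict(zip(grid, values)) for values in product(*grid.values())]

for cfg in configs:
    print(cfg)
```

Each dictionary could then be passed to a model constructor and compared on validation data, which is the usual way such settings are tuned.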

Answered by AI

Q2. How do you build a Q&A system with an LLM?

Ans.

A Q&A system built on an LLM uses the language model to retrieve relevant information and answer questions.

Preprocess the input question and convert it into a format suitable for the LLM.
Fine-tune the LLM on a dataset of question-answer pairs.
Use the fine-tuned model to generate answers for new questions.
Evaluate the performance of the QnA system using metrics like precision, recall, and F1 score.
Itera...

Answered by AI

Q3. How do you do unit testing?

Ans.

Unit testing is a process of testing individual units of code to ensure they function correctly.

Write test cases for each unit of code
Test inputs, outputs, and edge cases
Use testing frameworks like JUnit or pytest
Automate tests to run regularly
Ensure tests are independent, isolated, and repeatable

Answered by AI

IBM Interview FAQs

How many rounds are there in IBM Data Scientist interview?

The IBM interview process usually has 1-2 rounds. The most common rounds are Technical, Coding Test, and One-on-one.

How to prepare for IBM Data Scientist interview?

Go through your CV in detail and study all the technologies mentioned in your CV. Prepare at least two technologies or languages in depth if you are appearing for a technical interview at IBM. The most common topics and skills that interviewers at IBM expect are Python, Open Source, Artificial Intelligence, Machine Learning and SQL.

What are the top questions asked in IBM Data Scientist interview?

Some of the top questions asked at the IBM Data Scientist interview -

How can you create a list that is reversed, with the first letter and the third...read more
What PCA, Decision tree and computer vis...read more
What are the advantages of using transformer models compared to traditional mac...read more

How long is the IBM Data Scientist interview process?

The duration of the IBM Data Scientist interview process can vary, but it typically takes less than 2 weeks to complete.

3.9/5

based on 17 interview experiences

Difficulty level

Easy 11%

Moderate 67%

Hard 22%

Duration

Less than 2 weeks 70%

2-4 weeks 30%

IBM Data Scientist Salary

based on 950 salaries

₹9.7 L/yr - ₹31.2 L/yr

43% more than the average Data Scientist Salary in India

Data Scientist Jobs at IBM

Data Scientist-Artificial Intelligence

Bangalore / Bengaluru

3-7 Yrs

₹ 7.5-31 LPA

DATA SCIENTIST-ADVANCED ANALYTICS

Bangalore / Bengaluru

3-7 Yrs

₹ 7.5-31 LPA

Data Scientist-Artificial Intelligence

Pune

3-7 Yrs

₹ 5.1-19 LPA

IBM Salaries in India

Application Developer 12.7k salaries	₹5.3 L/yr - ₹26.8 L/yr
Software Developer 6k salaries	₹13.5 L/yr - ₹35.1 L/yr
Software Engineer 5.9k salaries	₹8.3 L/yr - ₹24.9 L/yr
Senior Software Engineer 5.5k salaries	₹13.2 L/yr - ₹31.7 L/yr
Advisory System Analyst 4.5k salaries	₹13.6 L/yr - ₹23.1 L/yr