Infosys Data Scientist Interview Questions and Answers

Updated 4 Apr 2025

14 Interview questions

🔥 Asked by recruiter 5 times
A Data Scientist was asked 4mo ago
Q. What are the core concepts of Object-Oriented Programming (OOP)?
Ans. 

The four core OOP concepts are encapsulation, inheritance, polymorphism, and abstraction; together they organize code around objects that combine data and behavior.

  • Encapsulation: Bundling data and methods in a class. Example: A class 'Car' with attributes like 'speed' and methods like 'accelerate()'.

  • Inheritance: Deriving new classes from existing ones. Example: 'ElectricCar' inherits from 'Car', adding features like 'batteryCapacity'.

  • Polymorphism: Ability to t...
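
The 'Car' and 'ElectricCar' examples above can be sketched in Python as follows. This is a minimal illustration, not part of the original answer: the `Vehicle` base class, the `describe()` method, and the kWh unit are assumptions added to show abstraction and polymorphism.

```python
from abc import ABC, abstractmethod


class Vehicle(ABC):
    """Abstraction: callers depend on this interface, not on concrete classes."""

    @abstractmethod
    def describe(self) -> str:
        ...


class Car(Vehicle):
    def __init__(self, speed: float = 0.0) -> None:
        # Encapsulation: speed lives on the object and changes only via methods.
        self._speed = speed

    @property
    def speed(self) -> float:
        return self._speed

    def accelerate(self, delta: float) -> None:
        self._speed += delta

    def describe(self) -> str:
        return f"Car at {self._speed:.0f} km/h"


class ElectricCar(Car):
    """Inheritance: reuses Car and adds a battery capacity attribute."""

    def __init__(self, battery_capacity: float, speed: float = 0.0) -> None:
        super().__init__(speed)
        self.battery_capacity = battery_capacity

    def describe(self) -> str:
        # Polymorphism: same method name, subclass-specific behavior.
        return (
            f"Electric car at {self.speed:.0f} km/h "
            f"with a {self.battery_capacity:.0f} kWh battery"
        )


car = Car()
car.accelerate(30)
ev = ElectricCar(battery_capacity=75.0, speed=10.0)
descriptions = [vehicle.describe() for vehicle in (car, ev)]
```

The final loop calls `describe()` without knowing which concrete class it holds, which is the point of polymorphism.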

A Data Scientist was asked
Q. With the XGBoost algorithm using 10-20 features, how are the splits decided, and on which feature will they be divided?
Ans. 

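XGBoost grows each tree greedily: at every node it evaluates candidate splits on every feature and keeps the one with the largest gain, that is, the largest reduction in the regularized loss.

  • For each feature, XGBoost scans candidate thresholds and computes the gain of each split from the sums of the first- and second-order gradients (gradient and Hessian) of the loss on each side.

  • The feature and threshold with the highest gain are chosen for the split; a split is kept only if its gain exceeds the minimum loss reduction (the gamma parameter).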

  • This process is repeated recursively for each node in the tree.

  • Features can be split based on numerical values or categories.

  • Example: If a feature like 'age' has t...
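
To make the split search concrete, here is a small sketch of an exact greedy search using the gain formula from the XGBoost documentation, gain = 0.5 * [G_L^2/(H_L + lambda) + G_R^2/(H_R + lambda) - G^2/(H + lambda)] - gamma. The toy data, the feature meanings ('age' plus a noise column), and the function names are assumptions for illustration; the real library also offers approximate and histogram-based search.

```python
def split_gain(g_left, h_left, g_right, h_right, lam=1.0, gamma=0.0):
    """Loss reduction of a split, from gradient (g) and Hessian (h) sums."""
    def score(g, h):
        return g * g / (h + lam)

    parent = score(g_left + g_right, h_left + h_right)
    return 0.5 * (score(g_left, h_left) + score(g_right, h_right) - parent) - gamma


def best_split(rows, grads, hess, lam=1.0, gamma=0.0):
    """Scan every feature and threshold; return (gain, feature_index, threshold)."""
    best = (0.0, None, None)
    g_total, h_total = sum(grads), sum(hess)
    for j in range(len(rows[0])):
        order = sorted(range(len(rows)), key=lambda i: rows[i][j])
        g_left = h_left = 0.0
        for pos in range(len(order) - 1):
            i, nxt = order[pos], order[pos + 1]
            g_left += grads[i]
            h_left += hess[i]
            if rows[i][j] == rows[nxt][j]:
                continue  # cannot split between equal feature values
            gain = split_gain(g_left, h_left, g_total - g_left,
                              h_total - h_left, lam, gamma)
            if gain > best[0]:
                best = (gain, j, (rows[i][j] + rows[nxt][j]) / 2)
    return best


# Toy data: column 0 is 'age', column 1 is noise (both made up).
rows = [[20, 5], [25, 1], [40, 4], [45, 2]]
targets = [0, 0, 1, 1]
# Squared-error loss with an initial prediction of 0.5:
# gradient = prediction - target, Hessian = 1.
grads = [0.5 - y for y in targets]
hess = [1.0] * len(targets)

gain, feature, threshold = best_split(rows, grads, hess)
```

On this data the search picks the 'age' column at the midpoint between 25 and 40, because that split separates the targets perfectly and the noise column does not.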

A Data Scientist was asked
Q. Explain precision and recall, and when each is used.
Ans. 

Precision and recall are metrics used in evaluating the performance of classification models.

  • Precision measures the accuracy of positive predictions, while recall measures the ability of the model to find all positive instances.

  • Precision = TP / (TP + FP)

  • Recall = TP / (TP + FN)

  • Precision is important when false positives are costly, while recall is important when false negatives are costly.

  • For example, in a spam ema...
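
The two formulas above can be checked on a small example. The labels and predictions below are made up for illustration, and the function name is an assumption.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0]
precision, recall = precision_recall(y_true, y_pred)
```

Here TP = 2, FP = 1, and FN = 2, so precision is 2/3 and recall is 1/2: most positive predictions are right, but half of the actual positives are missed.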

A Data Scientist was asked
Q. What is data imbalance?
Ans. 

Data imbalance refers to unequal distribution of classes in a dataset, where one class has significantly more samples than others.

  • Data imbalance can lead to biased models that favor the majority class.

  • It can result in poor performance for minority classes, as the model may struggle to accurately predict them.

  • Techniques like oversampling, undersampling, and using different evaluation metrics can help address data i...
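
A short sketch of why imbalance is a problem and of one common remedy. The 95/5 dataset is invented, and `balanced_class_weights` is a hypothetical helper that follows the widely used "balanced" heuristic, weight = n_samples / (n_classes * class_count).

```python
from collections import Counter

labels = [0] * 95 + [1] * 5           # 95% majority, 5% minority (made-up data)
always_majority = [0] * len(labels)   # a "model" that ignores the minority class

accuracy = sum(t == p for t, p in zip(labels, always_majority)) / len(labels)
minority_recall = (
    sum(1 for t, p in zip(labels, always_majority) if t == 1 and p == 1)
    / labels.count(1)
)


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


weights = balanced_class_weights(labels)
```

The trivial model scores 95% accuracy with zero recall on the minority class, which is why accuracy alone is misleading here. The computed weights make each minority sample count about 19 times as much as a majority sample in a weighted loss.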

A Data Scientist was asked
Q. Explain the XGBoost algorithm.
Ans. 

XGBoost is a gradient-boosted decision tree algorithm known for its speed and predictive performance on large tabular datasets.

  • XGBoost stands for eXtreme Gradient Boosting; it is an optimized, regularized implementation of gradient boosting machines.

  • It is widely used in machine learning competitions.

  • XGBoost uses a technique called boosting, where multiple weak learners are combined to create a strong learn...
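
The boosting idea can be illustrated with a deliberately simplified sketch: plain gradient boosting for squared error, using one-split regression stumps on a single feature. This is not XGBoost itself (no regularization, no second-order terms, no column handling); the data and function names are assumptions.

```python
def fit_stump(xs, residuals):
    """Least-squares regression stump: one threshold, one value per side."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, thr, left_mean, right_mean)
    return best[1], best[2], best[3]


def boost(xs, ys, n_rounds=20, learning_rate=0.3):
    """Each round fits a weak learner to the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        thr, left_val, right_val = fit_stump(xs, residuals)
        stumps.append((thr, left_val, right_val))
        preds = [p + learning_rate * (left_val if x <= thr else right_val)
                 for x, p in zip(xs, preds)]
    return base, stumps, preds


def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.0, 1.2, 3.1, 2.9, 6.0, 6.2]
base, stumps, preds = boost(xs, ys)
baseline_error = mse(ys, [base] * len(ys))
boosted_error = mse(ys, preds)
```

Each stump on its own is a weak learner, but because every round corrects the residuals left by the previous rounds, the combined prediction has a much lower error than the constant baseline.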

What are the roles & responsibilities of a Data Scientist at Infosys?

Machine Learning Development

  • Anchor ML development track in client projects
  • AI model development, experimentation, tuning, and validation

A Data Scientist was asked
Q. What is L1 and L2 Regularization?
Ans. 

L1 and L2 regularization are techniques used in machine learning to prevent overfitting by adding penalty terms to the cost function.

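  • L1 regularization (lasso) adds the sum of the absolute values of the coefficients, scaled by a regularization strength, as a penalty term to the cost function.

  • L2 regularization (ridge) adds the sum of the squared coefficients, scaled by a regularization strength, as a penalty term to the cost function.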

  • L1 regularization can lead to sparse models by forcing some coefficients to be exactl...
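
The difference in behavior can be seen in a one-coefficient sketch. For the problem of minimizing 0.5 * (x - w)^2 plus the penalty, the L1 solution is soft-thresholding and the L2 solution is proportional shrinkage. The coefficient values and the strength `lam` are made up, and the function names are assumptions.

```python
import math


def l1_penalty(weights, lam):
    return lam * sum(abs(w) for w in weights)


def l2_penalty(weights, lam):
    return lam * sum(w * w for w in weights)


def l1_shrink(w, lam):
    """Minimizer of 0.5*(x - w)**2 + lam*|x| (soft-thresholding)."""
    return math.copysign(max(abs(w) - lam, 0.0), w)


def l2_shrink(w, lam):
    """Minimizer of 0.5*(x - w)**2 + lam*x**2 (proportional shrinkage)."""
    return w / (1.0 + 2.0 * lam)


weights = [0.05, -0.8, 2.0]
lam = 0.1
after_l1 = [l1_shrink(w, lam) for w in weights]
after_l2 = [l2_shrink(w, lam) for w in weights]
```

With these values, L1 sets the small coefficient 0.05 to exactly zero and moves the others toward zero by 0.1, while L2 scales every coefficient by 1/1.2 and leaves none at zero.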

A Data Scientist was asked
Q. What is data science?
Ans. 

Data science is the field of extracting insights and knowledge from data using various techniques and tools.

  • Data science involves collecting, cleaning, and analyzing data to extract insights.

  • It uses various techniques such as machine learning, statistical modeling, and data visualization.

  • Data science is used in various fields such as finance, healthcare, and marketing.

  • Examples of data science applications include ...

A Data Scientist was asked
Q. What is SMOTE? Do you have any experience working on Time Series? Code analysis of global variable?
Ans. 

SMOTE stands for Synthetic Minority Over-sampling Technique, used to balance imbalanced datasets by generating synthetic samples.

  • SMOTE is commonly used in machine learning to address class imbalance by creating synthetic samples of the minority class.

  • It works by generating new instances of the minority class by interpolating between existing instances.

  • SMOTE is particularly useful in scenarios where the minority cl...
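
The interpolation step can be sketched as below. This is a simplified SMOTE-like routine for illustration only, with made-up 2-D points and a hypothetical function name; in practice a maintained implementation such as the SMOTE class in the imbalanced-learn library would normally be used.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Create synthetic points on segments between minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of the chosen point (squared distance).
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


minority = [(1.0, 1.0), (1.5, 1.2), (2.0, 2.0), (1.2, 1.8)]
new_points = smote_like(minority, n_new=5)
```

Every synthetic point lies on a line segment between two existing minority samples, so the new data stays inside the region the minority class already occupies.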

A Data Scientist was asked
Q. What is entropy, information gain?
Ans. 

Entropy is a measure of randomness or uncertainty in a dataset, while information gain is the reduction in entropy after splitting a dataset based on a feature.

  • Entropy is used in decision tree algorithms to determine the best feature to split on.

  • Information gain measures the effectiveness of a feature in classifying the data.

  • Higher information gain indicates that a feature is more useful for splitting the data.

  • Ent...
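
Both quantities are short to compute. The sketch below uses the standard definitions, entropy H = -sum(p * log2(p)) and information gain = H(parent) minus the size-weighted entropy of the child groups; the "yes"/"no" labels are invented for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(labels, groups):
    """Entropy of the parent minus the weighted entropy of the child groups."""
    n = len(labels)
    return entropy(labels) - sum(len(g) / n * entropy(g) for g in groups)


labels = ["yes"] * 5 + ["no"] * 5
perfect_split = [["yes"] * 5, ["no"] * 5]
useless_split = [["yes", "yes", "no", "no"],
                 ["yes", "yes", "yes", "no", "no", "no"]]

parent_entropy = entropy(labels)
gain_perfect = information_gain(labels, perfect_split)
gain_useless = information_gain(labels, useless_split)
```

A balanced two-class set has an entropy of 1 bit. The split that separates the classes completely recovers all of it, while the split whose groups are still 50/50 gains nothing.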

A Data Scientist was asked
Q. What is an activation function? Explain Naive Bayes, the confusion matrix, hyperparameters in DL, and hypothesis testing.
Ans. 

An activation function is a mathematical function used in neural networks to introduce non-linearity.

  • An activation function is applied to the weighted sum of inputs (plus a bias) in a neural network node.

  • It determines the output of the node, that is, the activation of the neuron.

  • Common activation functions include sigmoid, tanh, ReLU, and softmax.

  • Activation functions introduce non-linearity, allowing neural networks to learn complex p...
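
The activation functions named above are simple to write out. The sketch below uses only the standard library; the `neuron` helper and its inputs are assumptions added to show where the activation is applied.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    return max(0.0, z)


def softmax(zs):
    """Turn a list of scores into probabilities that sum to 1."""
    shift = max(zs)  # subtracting the max avoids overflow in exp
    exps = [math.exp(z - shift) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


def neuron(inputs, weights, bias, activation):
    """Apply the activation to the weighted sum of inputs plus bias."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


probabilities = softmax([1.0, 2.0, 3.0])
hidden = neuron([1.0, 2.0], [0.5, -0.25], 0.0, relu)
squashed = neuron([1.0, 2.0], [0.5, -0.25], 0.0, math.tanh)
```

Sigmoid, tanh, and ReLU act on one value at a time, whereas softmax acts on a whole vector of scores, which is why it is typically used in the output layer of a multi-class classifier.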

Infosys Data Scientist Interview Experiences

20 interviews found

Interview experience: 1 (Bad)
Difficulty level: Moderate
Process duration: Less than 2 weeks
Result: No response

I appeared for an interview in Feb 2025.

Round 1 - Technical 

(2 Questions)

  • Q1. Deployment of RAG
  • Ans. 

    RAG (Retrieval-Augmented Generation) deployment enhances AI models by integrating external data sources for improved responses.

    • Integrate RAG with existing NLP models to enhance context understanding.

    • Utilize APIs to fetch real-time data, improving response accuracy.

    • Example: Using RAG in customer support to pull relevant FAQs from a database.

    • Implement caching mechanisms to optimize retrieval speed.

    • Monitor and evaluate mo...

  • Answered by AI
  • Q2. Building of RAG
  • Ans. 

    Building a RAG (Retrieval-Augmented Generation) system means connecting a document retriever to a language model so that answers are grounded in retrieved text.

    • Split the source documents into chunks and compute an embedding for each chunk.

    • Store the embeddings in an index (for example, a vector database) that supports similarity search.

    • At query time, embed the question, retrieve the most similar chunks, and add them to the prompt.

    • Have the language model generate the answer from the prompt and the retrieved context.

  • Answered by AI
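
For the Retrieval-Augmented Generation questions in this round, the retrieval and prompt-assembly steps can be sketched with the standard library alone. Everything here is an illustrative assumption: the FAQ texts are invented, word counts stand in for learned embeddings, a list stands in for a vector store, and the call to a language model is left out.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: bag-of-words counts (real systems use learned vectors)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query, docs, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def build_prompt(query, docs, k=2):
    """Assemble the retrieved context and the question into one prompt."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs, k))
    return ("Answer using only the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {query}")


faqs = [
    "Reset your password from the account settings page.",
    "Refunds are processed within five business days.",
    "Contact support by email for billing questions.",
]
question = "How do I reset my password?"
top = retrieve(question, faqs, k=1)
prompt = build_prompt(question, faqs, k=1)
```

The resulting `prompt` string is what would be sent to the language model; caching and monitoring, mentioned in the deployment answer, would wrap around `retrieve`.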

Interview experience: 4 (Good)
Difficulty level: Moderate
Process duration: 2-4 weeks
Result: Not Selected

I applied via Job Portal and was interviewed in Apr 2024. There was 1 interview round.

Round 1 - Technical 

(9 Questions)

  • Q1. Explain the XGBoost algorithm.
  • Ans. 

    XGBoost is a gradient-boosted decision tree algorithm known for its speed and predictive performance on large tabular datasets.

    • XGBoost stands for eXtreme Gradient Boosting; it is an optimized, regularized implementation of gradient boosting machines.

    • It is widely used in machine learning competitions.

    • XGBoost uses a technique called boosting, where multiple weak learners are combined to create a strong learner.

    • It...

  • Answered by AI
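  • Q2. With the XGBoost algorithm using 10-20 features, how are the splits decided, and on which feature will they be divided?
  • Ans. 

    XGBoost grows each tree greedily: at every node it evaluates candidate splits on every feature and keeps the one with the largest gain, that is, the largest reduction in the regularized loss.

    • For each feature, XGBoost scans candidate thresholds and computes the gain of each split from the sums of the first- and second-order gradients (gradient and Hessian) of the loss on each side.

    • The feature and threshold with the highest gain are chosen for the split; a split is kept only if its gain exceeds the minimum loss reduction (the gamma parameter).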

    • This process is repeated recursively for each node in the tree.

    • Features can be split based on numerical values or categories.

    • Example: If a feature like 'age' has the hi...

  • Answered by AI
  • Q3. Do you have any experience on cloud platform?
  • Q4. What is entropy, information gain?
  • Ans. 

    Entropy is a measure of randomness or uncertainty in a dataset, while information gain is the reduction in entropy after splitting a dataset based on a feature.

    • Entropy is used in decision tree algorithms to determine the best feature to split on.

    • Information gain measures the effectiveness of a feature in classifying the data.

    • Higher information gain indicates that a feature is more useful for splitting the data.

    • Entropy ...

  • Answered by AI
  • Q5. What is hypothesis testing?
  • Q6. Explain precision and recall, when are they used in which scenario?
  • Ans. 

    Precision and recall are metrics used in evaluating the performance of classification models.

    • Precision measures the accuracy of positive predictions, while recall measures the ability of the model to find all positive instances.

    • Precision = TP / (TP + FP)

    • Recall = TP / (TP + FN)

    • Precision is important when false positives are costly, while recall is important when false negatives are costly.

    • For example, in a spam email de...

  • Answered by AI
  • Q7. What is data imbalance?
  • Ans. 

    Data imbalance refers to unequal distribution of classes in a dataset, where one class has significantly more samples than others.

    • Data imbalance can lead to biased models that favor the majority class.

    • It can result in poor performance for minority classes, as the model may struggle to accurately predict them.

    • Techniques like oversampling, undersampling, and using different evaluation metrics can help address data imbala...

  • Answered by AI
  • Q8. What is SMOTE? Do you have any experience working on Time Series? Code analysis of global variable?
  • Ans. 

    SMOTE stands for Synthetic Minority Over-sampling Technique, used to balance imbalanced datasets by generating synthetic samples.

    • SMOTE is commonly used in machine learning to address class imbalance by creating synthetic samples of the minority class.

    • It works by generating new instances of the minority class by interpolating between existing instances.

    • SMOTE is particularly useful in scenarios where the minority class i...

  • Answered by AI
  • Q9. Find the 5th highest salary in every department. What are window functions? What is the difference between UNION and UNION ALL? What is the difference between DELETE and TRUNCATE?
  • Ans. 

    Find the 5th highest salary in each department using SQL queries and understand key SQL concepts.

    • Use the DENSE_RANK() window function to rank salaries within each department, so that tied salaries share a rank; ROW_NUMBER() would count duplicates of the same salary as separate positions.

    • Example SQL: SELECT department, salary FROM (SELECT department, salary, DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS salary_rank FROM employees) AS ranked WHERE salary_rank = 5;

    • Window functions allow calculations across a set of table rows...

  • Answered by AI
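
The 5th-highest-salary query can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: the `employees` rows are invented, the query uses DENSE_RANK() so that tied salaries share a rank, and window functions require SQLite 3.25 or newer (bundled with most current Python builds).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("e1", "eng", 100), ("e2", "eng", 100), ("e3", "eng", 90),
        ("e4", "eng", 80), ("e5", "eng", 70), ("e6", "eng", 60),
        ("e7", "eng", 50),
        ("o1", "ops", 90), ("o2", "ops", 80), ("o3", "ops", 70),
    ],
)

QUERY = """
SELECT department, name, salary
FROM (
    SELECT department, name, salary,
           DENSE_RANK() OVER (
               PARTITION BY department ORDER BY salary DESC
           ) AS salary_rank
    FROM employees
) AS ranked
WHERE salary_rank = 5
ORDER BY department, name
"""

fifth_highest = conn.execute(QUERY).fetchall()
conn.close()
```

In this data the two engineers tied at 100 share rank 1, so the 5th highest distinct salary in 'eng' is 60. The 'ops' department has only three distinct salaries and therefore returns no row.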

Interview Preparation Tips

Interview preparation tips for other job seekers - Prepare the basics well. Go through the top questions asked for SQL, Python, and Data Science.
Be well versed in your resume projects and the concepts used in them.

Interview experience: 5 (Excellent)
Difficulty level: Moderate
Process duration: Less than 2 weeks
Result: Not Selected

I applied via Referral and was interviewed in Jul 2024. There were 2 interview rounds.

Round 1 - Coding Test 

Basic operations on dataframe using Pandas and SQL basics.

Round 2 - Technical 

(2 Questions)

  • Q1. Data preprocessing questions, such as the steps taken, and questions about project experience.
  • Q2. Random forest and decision tree related questions

Interview experience: 5 (Excellent)
Difficulty level: -
Process duration: -
Result: -

Round 1 - Technical 

(2 Questions)

  • Q1. KNN and logistic regression
  • Q2. Correlation vs covariance
  • Ans. 

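    Covariance measures how two variables vary together, in units that depend on the variables' scales, while correlation is the covariance normalized by the two standard deviations, giving a unitless measure of the strength and direction of the linear relationship.

    • Covariance can be positive, negative, or zero, indicating the direction of the linear relationship between variables; its magnitude is hard to interpret because it changes with the units of measurement.

    • Correlation is always between -1 and 1, with 1 indicating a perfect positive linear relationship, -1 indicating a perfect negative linear relationship, and 0 indicating no linear relationship.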

    • Covaria...

  • Answered by AI
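
The scale dependence of covariance, and the scale independence of correlation, can be seen in a few lines. The height and weight values below are made up, and the sketch uses the sample (n - 1) definitions.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def covariance(xs, ys):
    """Sample covariance (divides by n - 1)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def correlation(xs, ys):
    """Pearson correlation: covariance divided by both standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


heights_m = [1.5, 1.6, 1.7, 1.8]
weights_kg = [50.0, 58.0, 66.0, 74.0]
heights_cm = [h * 100 for h in heights_m]

cov_m = covariance(heights_m, weights_kg)
cov_cm = covariance(heights_cm, weights_kg)
corr_m = correlation(heights_m, weights_kg)
corr_cm = correlation(heights_cm, weights_kg)
```

Switching heights from metres to centimetres multiplies the covariance by 100, but the correlation is unchanged.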

Interview experience: 4 (Good)
Difficulty level: -
Process duration: -
Result: -

Round 1 - Technical 

(2 Questions)

  • Q1. Basic Statistics
  • Q2. Basic ML, DL question

Interview Preparation Tips

Interview preparation tips for other job seekers - Prepare the basics well.

Interview experience: 5 (Excellent)
Difficulty level: Moderate
Process duration: 2-4 weeks
Result: -

I applied via Newspaper Ad and was interviewed in Dec 2023. There were 2 interview rounds.

Round 1 - Technical 

(2 Questions)

  • Q1. Questions on BERT and LSTM
  • Q2. Questions on BiLSTM and GPT

Round 2 - HR 

(1 Question)

  • Q1. Salary negotiations and bonus

Interview Preparation Tips

Topics to prepare for Infosys Data Scientist interview:
  • Machine Learning

Interview experience: 5 (Excellent)
Difficulty level: Moderate
Process duration: 6-8 weeks
Result: Not Selected

I applied via Recruitment Consultant and was interviewed in Feb 2024. There was 1 interview round.

Round 1 - Technical 

(1 Question)

  • Q1. What is L1 and L2 Regularization?
  • Ans. 

    L1 and L2 regularization are techniques used in machine learning to prevent overfitting by adding penalty terms to the cost function.

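    • L1 regularization (lasso) adds the sum of the absolute values of the coefficients, scaled by a regularization strength, as a penalty term to the cost function.

    • L2 regularization (ridge) adds the sum of the squared coefficients, scaled by a regularization strength, as a penalty term to the cost function.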

    • L1 regularization can lead to sparse models by forcing some coefficients to be exactly zer...

  • Answered by AI

Interview experience: 4 (Good)
Difficulty level: -
Process duration: -
Result: -

Round 1 - Technical 

(1 Question)

  • Q1. What projects have you worked on?

Interview experience: 3 (Average)
Difficulty level: Moderate
Process duration: Less than 2 weeks
Result: Not Selected

I applied via Company Website and was interviewed before Feb 2023. There was 1 interview round.

Round 1 - Technical 

(2 Questions)

  • Q1. Attended the interview in April 2023. Two panel members joined the interview. Most of the questions were on basic ML and DL concepts. The same day I received the document upload email and had a salary discussio...
  • Q2. What is an activation function? Explain Naive Bayes, the confusion matrix, hyperparameters in DL, and hypothesis testing.
  • Ans. 

    An activation function is a mathematical function used in neural networks to introduce non-linearity.

    • An activation function is applied to the weighted sum of inputs (plus a bias) in a neural network node.

    • It determines the output of the node, that is, the activation of the neuron.

    • Common activation functions include sigmoid, tanh, ReLU, and softmax.

    • Activation functions introduce non-linearity, allowing neural networks to learn complex patter...

  • Answered by AI

Interview experience: 3 (Average)
Difficulty level: -
Process duration: -
Result: -

Round 1 - Technical 

(3 Questions)

  • Q1. What is multicollinearity?
  • Q2. Machine learning algorithms: decision tree?
  • Q3. Solve a try-catch block exercise.

Interview Preparation Tips

Interview preparation tips for other job seekers - Prepare the basics of machine learning algorithms, and have a general overview of the latest technology.


Infosys Interview FAQs

How many rounds are there in Infosys Data Scientist interview?
Infosys interview process usually has 1-2 rounds. The most common rounds in the Infosys interview process are Technical, Resume Shortlist and HR.
How to prepare for Infosys Data Scientist interview?
Go through your CV in detail and study all the technologies mentioned in your CV. Prepare at least two technologies or languages in depth if you are appearing for a technical interview at Infosys. The most common topics and skills that interviewers at Infosys expect are Python, Data Science, SQL, Machine Learning and R.

Overall Interview Experience Rating

3.9/5, based on 17 interview experiences

Difficulty level

  • Moderate: 100%

Duration

  • Less than 2 weeks: 50%
  • 2-4 weeks: 20%
  • 6-8 weeks: 20%
  • More than 8 weeks: 10%

Infosys Data Scientist Salary
based on 575 salaries
₹5.8 L/yr - ₹17.6 L/yr
25% less than the average Data Scientist Salary in India

Infosys Data Scientist Reviews and Ratings

4.2/5, based on 19 reviews

Rating in categories

  • Skill development: 3.9
  • Work-life balance: 4.0
  • Salary: 3.2
  • Job security: 4.4
  • Company culture: 4.1
  • Promotions: 3.3
  • Work satisfaction: 3.8