Infosys
The core OOP concepts are encapsulation, inheritance, polymorphism, and abstraction, which are essential for object-oriented design.
Encapsulation: Bundling data and the methods that operate on it in a class, and controlling access to the internal state. Example: A class 'Car' with attributes like 'speed' and methods like 'accelerate()'.
Inheritance: Deriving new classes from existing ones. Example: 'ElectricCar' inherits from 'Car', adding features like 'batteryCapacity'.
Polymorphism: Ability to t...
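The 'Car' and 'ElectricCar' example can be sketched in a few lines of Python. This is only an illustration: the `describe()` method, the default `delta`, and the units are assumptions added here, not part of the original answer.

```python
class Car:
    """Encapsulation: speed is internal state, changed only through methods."""

    def __init__(self):
        self._speed = 0  # leading underscore: not meant to be set directly

    @property
    def speed(self):
        return self._speed

    def accelerate(self, delta=10):
        self._speed += delta

    def describe(self):
        return f"Car at {self._speed} km/h"


class ElectricCar(Car):
    """Inheritance: reuses Car and adds a battery capacity attribute."""

    def __init__(self, battery_capacity):
        super().__init__()
        self.battery_capacity = battery_capacity

    def describe(self):
        # Polymorphism: same method name, behavior depends on the object's class.
        return f"Electric car at {self.speed} km/h with a {self.battery_capacity} kWh battery"


cars = [Car(), ElectricCar(75)]
for car in cars:
    car.accelerate()  # inherited by ElectricCar unchanged
descriptions = [car.describe() for car in cars]
print(descriptions)
```

The loop treats both objects as a `Car`, yet each `describe()` call dispatches to the class of the actual object.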
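XGBoost uses a greedy approach to choose splits: at each node it evaluates candidate splits and keeps the one with the highest gain.
For each feature and candidate threshold, XGBoost computes the gain, i.e. the reduction in its regularized loss, from the sums of gradients and Hessians of the samples on each side of the split (this is not the entropy-based information gain of classic decision trees).
The feature and threshold with the highest gain are chosen for the split.
This process is repeated recursively for each node in the tree until a stopping condition such as the maximum depth is reached.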
Features can be split based on numerical values or categories.
Example: If a feature like 'age' has t...
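A greedy split search of this kind can be sketched in plain Python. This is a simplified illustration, not XGBoost's actual code: it uses the split gain from the XGBoost objective, 0.5 * [G_L²/(H_L+λ) + G_R²/(H_R+λ) − (G_L+G_R)²/(H_L+H_R+λ)] − γ, on a single numeric feature. The 'age' values, the targets, and the choice of the midpoint as threshold are assumptions made for the example.

```python
def split_gain(g_left, h_left, g_right, h_right, lam=1.0, gamma=0.0):
    """Gain of a split from gradient (g) and Hessian (h) sums on each side."""
    def score(g, h):
        return g * g / (h + lam)
    parent = score(g_left + g_right, h_left + h_right)
    return 0.5 * (score(g_left, h_left) + score(g_right, h_right) - parent) - gamma


def best_split(values, grads, hess, lam=1.0, gamma=0.0):
    """Scan sorted feature values and return (best_gain, threshold)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    g_total, h_total = sum(grads), sum(hess)
    g_left = h_left = 0.0
    best = (0.0, None)  # only splits with positive gain are accepted
    for pos in range(len(order) - 1):
        i, nxt = order[pos], order[pos + 1]
        g_left += grads[i]
        h_left += hess[i]
        if values[i] == values[nxt]:
            continue  # cannot split between identical values
        gain = split_gain(g_left, h_left, g_total - g_left, h_total - h_left, lam, gamma)
        if gain > best[0]:
            best = (gain, (values[i] + values[nxt]) / 2)  # midpoint: an assumption
    return best


# Hypothetical data: squared-error loss with an initial prediction of 0,
# so gradient = prediction - target = -target and Hessian = 1.
ages = [22, 25, 30, 47, 52, 60]
targets = [1, 1, 1, 5, 5, 5]
grads = [-t for t in targets]
hess = [1.0] * len(targets)

gain, threshold = best_split(ages, grads, hess)
print(gain, threshold)
```

In a full tree builder the same scan would run over every feature, and the winning split would then be applied recursively to the two child nodes.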
Precision and recall are metrics used in evaluating the performance of classification models.
Precision measures the accuracy of positive predictions, while recall measures the ability of the model to find all positive instances.
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
Here TP, FP, and FN are the counts of true positives, false positives, and false negatives.
Precision is important when false positives are costly, while recall is important when false negatives are costly.
For example, in a spam ema...
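The two formulas can be checked on a small set of predictions. The labels below are made up for illustration, and returning 0.0 when a denominator is zero is a convention chosen here.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Compute precision and recall for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical labels: 1 = positive class, 0 = negative class.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]

precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)
```

With these labels there are 3 true positives, 1 false positive, and 1 false negative, so both metrics come out at 0.75.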
Data imbalance refers to unequal distribution of classes in a dataset, where one class has significantly more samples than others.
Data imbalance can lead to biased models that favor the majority class.
It can result in poor performance for minority classes, as the model may struggle to accurately predict them.
Techniques like oversampling, undersampling, and using different evaluation metrics can help address data i...
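As one possible illustration of oversampling, the sketch below duplicates randomly chosen minority samples until every class matches the majority count. The function name, the toy data, and the fixed seed are assumptions made for the example. Resampling of this kind is normally applied to the training split only.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate random samples of smaller classes up to the majority count."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, l in zip(rows, labels) if l == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical dataset: 8 samples of class 0 and 2 samples of class 1.
rows = list(range(10))
labels = [0] * 8 + [1] * 2

new_rows, new_labels = random_oversample(rows, labels)
print(Counter(labels), Counter(new_labels))
```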
XGBoost is a machine learning algorithm known for its speed and predictive performance on large datasets.
XGBoost stands for eXtreme Gradient Boosting and is an implementation of gradient-boosted decision trees.
It is widely used in machine learning competitions.
XGBoost uses boosting, where multiple weak learners (typically shallow decision trees) are added one after another and combined to create a strong learner.
L1 and L2 regularization are techniques used in machine learning to prevent overfitting by adding penalty terms to the cost function.
L1 regularization adds the sum of the absolute values of the coefficients as a penalty term to the cost function.
L2 regularization adds the sum of the squared values of the coefficients as a penalty term to the cost function.
L1 regularization can lead to sparse models by forcing some coefficients to be exactly zero...
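The two penalties, and the way L1 produces exact zeros, can be sketched as follows. The weights and the strength `lam` are made-up values. `soft_threshold` is the standard shrinkage step associated with an L1 penalty and is included here only to show the sparsity effect.

```python
import math


def l1_penalty(weights, lam):
    """lam * sum of absolute values of the coefficients."""
    return lam * sum(abs(w) for w in weights)


def l2_penalty(weights, lam):
    """lam * sum of squared coefficients."""
    return lam * sum(w * w for w in weights)


def soft_threshold(w, t):
    """L1 shrinkage: move w toward 0 by t, and set it to 0 if |w| <= t."""
    return math.copysign(max(abs(w) - t, 0.0), w)


weights = [0.5, -0.2, 3.0]  # hypothetical coefficients
lam = 0.3                   # hypothetical regularization strength

print(l1_penalty(weights, lam))  # 0.3 * (0.5 + 0.2 + 3.0)
print(l2_penalty(weights, lam))  # 0.3 * (0.25 + 0.04 + 9.0)
shrunk = [soft_threshold(w, lam) for w in weights]
print(shrunk)  # the small coefficient -0.2 is driven to zero
```

An L2 penalty, by contrast, scales coefficients toward zero without making them exactly zero.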
Data science is the field of extracting insights and knowledge from data using various techniques and tools.
Data science involves collecting, cleaning, and analyzing data to extract insights.
It uses various techniques such as machine learning, statistical modeling, and data visualization.
Data science is used in various fields such as finance, healthcare, and marketing.
Examples of data science applications include ...
SMOTE stands for Synthetic Minority Over-sampling Technique, used to balance imbalanced datasets by generating synthetic samples.
SMOTE is commonly used in machine learning to address class imbalance by creating synthetic samples of the minority class.
It works by generating new instances of the minority class by interpolating between an existing minority instance and one of its k nearest minority-class neighbors.
SMOTE is particularly useful in scenarios where the minority cl...
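The interpolation step can be sketched in plain Python. This is a simplified illustration of the idea, not the reference implementation: the points, `k`, and the seed are assumptions, and libraries such as imbalanced-learn handle neighbor search and sampling ratios more carefully.

```python
import math
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points between minority points and their neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of the chosen point (excluding itself)
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


# Hypothetical minority-class samples with two features each.
minority = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (3.0, 3.0)]
new_points = smote(minority, n_new=5)
print(new_points)
```

Every synthetic point lies on a line segment between two real minority samples, so it stays inside the region the minority class already occupies.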
Entropy is a measure of randomness or uncertainty in a dataset, while information gain is the reduction in entropy after splitting a dataset based on a feature.
Entropy is used in decision tree algorithms to determine the best feature to split on.
Information gain measures the effectiveness of a feature in classifying the data.
Higher information gain indicates that a feature is more useful for splitting the data.
Ent...
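Entropy and information gain can be computed directly from class counts. The sketch below uses the usual definition H = −Σ p·log2(p) and a made-up split of ten labels.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical split: 5 "yes" and 5 "no" labels divided by some feature.
parent = ["yes"] * 5 + ["no"] * 5
left = ["yes"] * 4 + ["no"]
right = ["yes"] + ["no"] * 4

print(entropy(parent))
print(information_gain(parent, [left, right]))
```

A perfectly mixed parent has entropy 1 bit. The split leaves each child mostly pure, so the gain is positive.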
An activation function is a mathematical function used in neural networks to introduce non-linearity.
It is applied to the weighted sum of inputs (plus a bias) in a neural network node.
It helps in determining the output of a node or the activation of a neuron.
Common activation functions include sigmoid, tanh, ReLU, and softmax.
Activation functions introduce non-linearity, allowing neural networks to learn complex p...
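The four activation functions named above can be written in a few lines of standard-library Python. The `neuron` helper and its inputs are illustrative assumptions showing where the activation is applied.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x):
    return math.tanh(x)


def relu(x):
    return max(0.0, x)


def softmax(xs):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def neuron(inputs, weights, bias, activation):
    """Weighted sum of inputs plus bias, passed through an activation."""
    z = sum(i * w for i, w in zip(inputs, weights)) + bias
    return activation(z)


print(sigmoid(0.0), tanh(0.0), relu(-2.0))
print(softmax([1.0, 2.0, 3.0]))
print(neuron([1.0, 2.0], [0.5, -1.0], 0.5, relu))  # z = -1.0, so ReLU outputs 0.0
```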
I appeared for an interview in Feb 2025.
RAG (Retrieval-Augmented Generation) retrieves relevant content from external data sources and passes it to the language model as context, so that responses are grounded in that data.
Integrate RAG with existing NLP models to enhance context understanding.
Utilize APIs to fetch real-time data, improving response accuracy.
Example: Using RAG in customer support to pull relevant FAQs from a database.
Implement caching mechanisms to optimize retrieval speed.
Monitor and evaluate mo...
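The retrieval and caching steps can be sketched with a toy FAQ store. Everything here is hypothetical: the FAQ entries, the word-overlap scoring (a stand-in for embedding similarity), and the prompt layout. The call to a language model is deliberately left out.

```python
from functools import lru_cache

# Hypothetical FAQ database: (question, answer) pairs.
FAQS = (
    ("How do I reset my password?", "Use the 'Forgot password' link on the login page."),
    ("How do I track my order?", "Open 'My orders' and select the order to see its status."),
    ("What is the refund policy?", "Refunds are accepted within 30 days of delivery."),
)


def tokenize(text):
    return set(text.lower().replace("?", "").split())


@lru_cache(maxsize=128)  # caching: repeated queries skip the scoring step
def retrieve(query, top_k=1):
    """Return the top_k FAQs that share the most words with the query."""
    q = tokenize(query)
    ranked = sorted(FAQS, key=lambda faq: len(q & tokenize(faq[0])), reverse=True)
    return tuple(ranked[:top_k])


def build_prompt(query):
    """Combine retrieved context with the user question for a generator model."""
    context = "\n".join(f"Q: {q}\nA: {a}" for q, a in retrieve(query))
    return f"Context:\n{context}\n\nUser question: {query}"


prompt = build_prompt("How can I reset my password?")
print(prompt)
```

In a real deployment the retriever would typically query a vector index, and the prompt would be sent to the model whose responses are then monitored.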
RAG (Red, Amber, Green) is a visual tool for assessing project status and risk levels.
RAG status indicates project health: Red = critical issues, Amber = potential risks, Green = on track.
Example: A project with budget overruns may be marked Red.
RAG can be used in dashboards for quick visual assessments.
Regular updates to RAG status help in proactive risk management.
I applied via Job Portal and was interviewed in Apr 2024. There was 1 interview round.
Find the 5th highest salary in each department using SQL queries and understand key SQL concepts.
Use the DENSE_RANK() window function to rank salaries within each department; unlike ROW_NUMBER(), it gives tied salaries the same rank, so rank 5 is the 5th highest distinct salary.
Example SQL: SELECT DISTINCT department, salary FROM (SELECT department, salary, DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS salary_rank FROM employees) AS ranked WHERE salary_rank = 5;
Window functions allow calculations across a set of table rows...
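A query of this kind can be tried end to end with Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer). The table contents are made up for illustration. This sketch ranks with DENSE_RANK() so that tied salaries share a rank; with ROW_NUMBER() each tied row would be counted separately and the result could differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)")

# Hypothetical data: 'eng' has a tie at the top, 'hr' has fewer than 5 salaries.
salaries = {
    "eng": [100, 100, 90, 80, 70, 60, 50],
    "sales": [90, 80, 70, 60, 50, 50],
    "hr": [40, 30, 20],
}
rows = [
    (f"{dept}_{i}", dept, salary)
    for dept, values in salaries.items()
    for i, salary in enumerate(values)
]
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", rows)

query = """
SELECT DISTINCT department, salary
FROM (
    SELECT department, salary,
           DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS salary_rank
    FROM employees
) AS ranked
WHERE salary_rank = ?
ORDER BY department
"""
result = conn.execute(query, (5,)).fetchall()
print(result)
```

Departments with fewer than five distinct salaries, such as 'hr' here, simply do not appear in the result.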
I applied via Referral and was interviewed in Jul 2024. There were 2 interview rounds.
Basic operations on DataFrames using pandas, and SQL basics.
Covariance measures how two variables vary together, while correlation is the covariance divided by the product of the two standard deviations, which gives both the strength and the direction of the linear relationship.
Covariance can be positive, negative, or zero, indicating the direction of the relationship between variables; its magnitude depends on the units of the variables.
Correlation is always between -1 and 1, with 1 indicating a perfect positive linear relationship, -1 indicating a perfect negative linear relationship, and 0 indicating no linear relationship.
Covaria...
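The difference in scale can be seen by computing both quantities on the same data in two different units. The numbers below are made up, and the sample (n − 1) form of covariance is used.

```python
import math


def covariance(xs, ys):
    """Sample covariance of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


# Hypothetical measurements: the same lengths in metres and in centimetres.
xs = [1, 2, 3, 4, 5]
ys_m = [2, 4, 6, 8, 10]
ys_cm = [y * 100 for y in ys_m]

print(covariance(xs, ys_m), covariance(xs, ys_cm))    # changes with the unit
print(correlation(xs, ys_m), correlation(xs, ys_cm))  # stays the same
```

Changing the unit multiplies the covariance by 100 but leaves the correlation at 1, which is why correlation is easier to compare across variables.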
I applied via Newspaper Ad and was interviewed in Dec 2023. There were 2 interview rounds.
I applied via Recruitment Consultant and was interviewed in Feb 2024. There was 1 interview round.
I applied via Company Website and was interviewed before Feb 2023. There was 1 interview round.