TCS
Recall is the ratio of correctly predicted positive observations to all observations in the actual positive class, while precision is the ratio of correctly predicted positive observations to all predicted positive observations.
Recall is about the actual positive instances that were correctly identified by the model.
Precision is about the predicted positive instances and how many of them were actually positive.
Recall...
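As a minimal sketch of the two definitions above, the following computes precision and recall from a pair of label lists. The function name `precision_recall` and the sample labels are hypothetical and chosen only for illustration.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the given positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels: 4 actual positives, 3 predicted positives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]
precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)
```

Here two of the three predicted positives are correct (precision 2/3), and two of the four actual positives are found (recall 1/2).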
I would train a decision tree model, as it handles categorical data well and can work with a small dataset.
Decision tree models are suitable for categorical prediction with minimal data
They can handle both numerical and categorical data
Decision trees are easy to interpret and visualize
Examples: predicting customer churn, classifying spam emails
The bias-variance trade-off is the balance between underfitting and overfitting in machine learning models.
Bias refers to error from erroneous assumptions in the learning algorithm, leading to underfitting.
Variance refers to error from sensitivity to small fluctuations in the training set, leading to overfitting.
The trade-off involves finding the level of model complexity that minimizes the total error from bias and variance combined.
Regul...
K-means is a clustering algorithm, while KNN is a classification algorithm (it can also be used for regression).
K-means is unsupervised learning, KNN is supervised learning
K-means partitions data into K clusters based on distance, KNN classifies data points based on similarity to K neighbors
K-means requires specifying the number of clusters (K), KNN requires specifying the number of neighbors (K)
Example: K-means can be used to group customers based on...
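The contrast can be sketched in plain Python: KNN needs labels and votes among neighbors, while K-means only needs the points and alternates between assignment and centroid updates. The data, the function names, and the naive "first k points" initialization are assumptions for illustration, not a production implementation.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Supervised: majority label among the k nearest labeled points."""
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(train_points[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def kmeans(points, k=2, iterations=10):
    """Unsupervised: assign points to nearest centroid, then move centroids."""
    centroids = [points[i] for i in range(k)]  # naive init: first k points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            tuple(sum(dim) / len(cl) for dim in zip(*cl)) if cl else centroids[c]
            for c, cl in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical 2-D data: two well-separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels = ["low", "low", "low", "high", "high", "high"]

predicted = knn_predict(points, labels, (2.0, 2.0), k=3)  # uses the labels
centroids, clusters = kmeans(points, k=2)                 # ignores the labels
print(predicted, centroids)
```

Note that the K in each algorithm means something different: `k` neighbors in `knn_predict`, `k` clusters in `kmeans`.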
RNNs handle exploding and vanishing gradients with techniques such as gradient clipping, careful weight initialization, and LSTM/GRU cells.
Gradient clipping limits the magnitude of gradients during backpropagation, which addresses exploding gradients.
Weight initialization techniques like Xavier initialization help in preventing vanishing gradients.
LSTM/GRU cells have gating mechanisms that allow the network to selectively remember or forget information.
Batch norm...
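Gradient clipping can be illustrated without a deep learning framework. The sketch below rescales a flat list of gradient values when their L2 norm exceeds a threshold; the function name `clip_by_global_norm` and the sample values are assumptions for illustration.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Scale gradients down so their L2 norm is at most max_norm."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        return list(grads)  # already small enough, leave unchanged
    scale = max_norm / total_norm
    return [g * scale for g in grads]


grads = [3.0, 4.0]  # hypothetical gradients with L2 norm 5.0
clipped = clip_by_global_norm(grads, max_norm=1.0)
print(clipped)
```

The direction of the gradient is preserved; only its length is capped, which is why clipping helps with exploding gradients but not vanishing ones.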
Faster R-CNN and YOLOv3 are both object detection algorithms, but differ in their approach and performance.
Faster R-CNN uses a two-stage approach, first generating region proposals and then classifying them.
YOLOv3 uses a single-stage approach, directly predicting bounding boxes and class probabilities.
Faster R-CNN is generally more accurate but slower, while YOLOv3 is faster but less accurate.
Faster R-CNN is better...
TSA stands for Transportation Security Administration. It is a government agency responsible for security at airports and other transportation hubs.
TSA was created in response to the September 11, 2001 terrorist attacks in the United States.
Its main goal is to ensure the security of passengers and transportation infrastructure.
TSA agents screen passengers and luggage for prohibited items before they board flights.
...
Stop words are common words like 'the', 'is', and 'and' that are removed from text data to improve analysis.
Removing them reduces noise and can improve the accuracy of some natural language processing tasks.
They are typically removed after tokenization, using libraries like NLTK or spaCy.
Examples of stop words include 'the', 'is', 'and', 'in', and 'on'.
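A minimal sketch of stop-word removal, using only the example words listed above as the stop list. Real projects would more likely use the lists shipped with NLTK or spaCy; the whitespace tokenizer and the `remove_stop_words` name here are simplifying assumptions.

```python
# Stop list taken from the examples in the text, not a complete list.
STOP_WORDS = {"the", "is", "and", "in", "on"}


def remove_stop_words(text):
    """Lowercase, split on whitespace, and drop tokens in STOP_WORDS."""
    tokens = text.lower().split()
    return [token for token in tokens if token not in STOP_WORDS]


filtered = remove_stop_words("The cat is on the mat and the dog is in the yard")
print(filtered)
```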
Logistic regression is a statistical model used to predict the probability of a binary outcome based on one or more predictor variables.
Logistic regression is used when the dependent variable is binary (e.g., 0/1, yes/no, true/false).
It estimates the probability that a given observation belongs to a particular category.
It uses the logistic function to model the relationship between the dependent variable and indep...
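To make the role of the logistic function concrete, here is a small sketch that maps a linear combination of predictors to a probability. The weights, bias, and feature values are hypothetical, not fitted to any data.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(features, weights, bias):
    """Probability of the positive class under a logistic regression model."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


# Hypothetical model parameters and one observation.
probability = predict_proba(features=[1.0, 2.0], weights=[0.8, -0.4], bias=-0.2)
predicted_class = 1 if probability >= 0.5 else 0
print(probability, predicted_class)
```

The 0.5 cutoff used for `predicted_class` is a common default, but the threshold is a choice that depends on the cost of false positives versus false negatives.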
A decision tree is a tree-like model used for classification and regression. Cross entropy is a measure of the difference between two probability distributions.
The decision tree algorithm recursively splits the data into subsets based on the most informative attribute until a stopping criterion is met.
It is a popular algorithm for both classification and regression tasks.
Cross entropy is used as a loss functio...
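As a sketch of cross entropy between two discrete distributions, the function below compares a one-hot true label with a predicted distribution. The `eps` floor is an assumption added to avoid taking the log of zero, and the sample distributions are made up.

```python
import math


def cross_entropy(true_dist, predicted_dist, eps=1e-12):
    """H(p, q) = -sum(p_i * log(q_i)), with q_i floored at eps."""
    return -sum(p * math.log(max(q, eps))
                for p, q in zip(true_dist, predicted_dist))


true_label = [0.0, 1.0, 0.0]        # one-hot: the second class is correct
good_prediction = [0.1, 0.8, 0.1]   # confident and correct
poor_prediction = [0.6, 0.2, 0.2]   # favors the wrong class

good_loss = cross_entropy(true_label, good_prediction)
poor_loss = cross_entropy(true_label, poor_prediction)
print(good_loss, poor_loss)
```

With a one-hot target, the loss reduces to the negative log of the probability assigned to the correct class, so the poorer prediction receives the larger loss.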
I appeared for an interview in Oct 2024.
Transfer learning involves using pre-trained models on a different task, while fine-tuning involves further training a pre-trained model on a specific task.
Transfer learning uses knowledge gained from one task to improve learning on a different task.
Fine-tuning involves adjusting the parameters of a pre-trained model to better fit a specific task.
Transfer learning is faster and requires less data compared to training a...
I appeared for an interview in Jan 2025.
Supervised learning algorithms are used in machine learning to predict outcomes based on labeled training data.
Supervised learning algorithms require labeled training data to learn the relationship between input and output variables.
Common supervised learning algorithms include linear regression, logistic regression, decision trees, support vector machines, and neural networks.
These algorithms are used for tasks such a...
Unsupervised learning algorithms are used to find patterns in data without labeled outcomes.
Unsupervised learning algorithms do not require labeled data for training.
They are used for clustering, dimensionality reduction, and anomaly detection.
Examples include K-means clustering, hierarchical clustering, and principal component analysis.
Cosine similarity measures the similarity between two non-zero vectors in an inner product space.
Cosine similarity ranges from -1 to 1, with 1 indicating vectors that point in the same direction, 0 indicating orthogonal vectors, and -1 indicating vectors that point in opposite directions.
It is commonly used in information retrieval, text mining, and recommendation systems.
Formula: cos(theta) = (A . B) / (||A|| * ||B||)
Example: Calculating similarity between two documents based on their word frequenc...
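The formula above translates directly into a few lines of Python. The word-count vectors are hypothetical and stand in for two documents over a shared three-word vocabulary.

```python
import math


def cosine_similarity(a, b):
    """cos(theta) = (A . B) / (||A|| * ||B||) for two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Hypothetical word counts: doc_b repeats doc_a's words twice as often.
doc_a = [2, 1, 0]
doc_b = [4, 2, 0]
doc_c = [0, 0, 3]

same_direction = cosine_similarity(doc_a, doc_b)
orthogonal = cosine_similarity(doc_a, doc_c)
print(same_direction, orthogonal)
```

Because only direction matters, `doc_a` and `doc_b` score as (approximately) 1 even though their lengths differ.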
A confusion matrix is a table used to evaluate the performance of a classification model.
For binary classification, it is a 2x2 matrix that shows the counts of true positive, true negative, false positive, and false negative predictions; with N classes it is N x N.
It is used to calculate metrics like accuracy, precision, recall, and F1 score.
Example: TP=100, TN=50, FP=10, FN=5.
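Using the counts from the example (TP=100, TN=50, FP=10, FN=5), the derived metrics can be worked out as follows. The function name `classification_metrics` is hypothetical.

```python
def classification_metrics(tp, tn, fp, fn):
    """Derive standard metrics from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


metrics = classification_metrics(tp=100, tn=50, fp=10, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
# accuracy: 0.909, precision: 0.909, recall: 0.952, f1: 0.930
```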
A similarity matrix quantifies the pairwise similarity between data points in a dataset.
It calculates the similarity between each pair of data points and represents the results in matrix form.
Common measures include cosine similarity and Jaccard similarity; distance measures such as Euclidean distance can also be used after converting them into similarities.
For normalized similarity measures, the diagonal of the matrix contains 1s, as each data point is perfectly similar to itself.
The values in the ...
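A small sketch of building such a matrix, assuming cosine similarity as the measure and three made-up 2-D vectors. The helper names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))


def similarity_matrix(vectors):
    """Square matrix whose (i, j) entry is the similarity of items i and j."""
    n = len(vectors)
    return [[cosine(vectors[i], vectors[j]) for j in range(n)]
            for i in range(n)]


vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical data points
matrix = similarity_matrix(vectors)
for row in matrix:
    print([round(value, 3) for value in row])
```

The result is symmetric with (approximately) 1s on the diagonal, so in practice only the upper triangle needs to be computed for large datasets.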
I have a strong background in data analysis, machine learning, and problem-solving skills.
Extensive experience in data analysis and machine learning algorithms
Proven track record of solving complex problems using data-driven approaches
Strong communication skills to effectively convey insights and recommendations
Ability to work collaboratively in a team environment
Passion for continuous learning and staying updated with...
I expect a collaborative environment, opportunities for growth, and a focus on impactful projects that leverage data for decision-making.
A collaborative culture where team members share knowledge and support each other, like regular brainstorming sessions.
Opportunities for professional development, such as workshops or courses in advanced data science techniques.
Engagement in meaningful projects that have a real-world ...
Developed a predictive model for customer churn in a telecom company.
Used machine learning algorithms like logistic regression and random forest.
Performed feature engineering to extract relevant customer behavior patterns.
Evaluated model performance using metrics like accuracy, precision, and recall.
Steps involved in solving a machine learning problem
Define the problem statement and goals
Collect and preprocess data
Select a machine learning model
Train the model on the data
Evaluate the model's performance
Fine-tune the model if necessary
Deploy the model for predictions
I applied via Naukri.com and was interviewed in Jan 2024. There was 1 interview round.
Retraining a generative AI model involves updating the model with new data to improve its accuracy and performance.
Retraining is necessary to keep the model up-to-date with new information.
New data is used to fine-tune the model's parameters and improve its predictions.
Retraining may involve adjusting hyperparameters, adding more layers, or changing the architecture.
Examples: retraining a language model with new text data, retr...
MLflow allows for easy deployment of machine learning models.
MLflow provides a simple way to deploy models using the mlflow models serve command.
Models can be deployed locally or to a cloud-based server for production use.
MLflow also supports model versioning and tracking for easy management of deployed models.
Coding test in Python to assess skills
Case study on statistics
The duration of the TCS Data Scientist interview process can vary, but it typically takes less than 2 weeks to complete.
based on 36 interview experiences