Senior Data Analyst
200+ Senior Data Analyst Interview Questions and Answers

Asked in Chubb

Q. Table A has values 1, 1, 0, 0, null, and Table B has values 1, 0, null, null. What are the resultant rows for all types of joins?
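Assuming the join condition is equality on the value column, duplicate values multiply the matches and a null never matches another null.
Inner join: 4 rows (1-1, 1-1, 0-0, 0-0)
Left join: 5 rows (the 4 matches plus the null from Table A with no match in Table B)
Right join: 6 rows (the 4 matches plus the two nulls from Table B with no match in Table A)
Full outer join: 7 rows (the 4 matches, the unmatched null from Table A, and the two unmatched nulls from Table B)
Cross join: 20 rows (5 x 4, since no join condition is applied)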
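Join row counts are easy to get wrong when duplicates and nulls are involved, so it can help to check them against an in-memory database. The sketch below uses Python's sqlite3 module; the table names a and b and the column name val are hypothetical, and the right and full outer joins are emulated with LEFT JOIN so the code does not depend on a recent SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (val INTEGER)")
conn.execute("CREATE TABLE b (val INTEGER)")
conn.executemany("INSERT INTO a VALUES (?)", [(1,), (1,), (0,), (0,), (None,)])
conn.executemany("INSERT INTO b VALUES (?)", [(1,), (0,), (None,), (None,)])


def row_count(sql):
    """Run a query that returns a single COUNT(*) value."""
    return conn.execute(sql).fetchone()[0]


inner_rows = row_count("SELECT COUNT(*) FROM a JOIN b ON a.val = b.val")
left_rows = row_count("SELECT COUNT(*) FROM a LEFT JOIN b ON a.val = b.val")
# A right join of a to b is the same as a left join with the tables swapped.
right_rows = row_count("SELECT COUNT(*) FROM b LEFT JOIN a ON a.val = b.val")
# Matched rows always carry a non-null a.val, so a.val IS NULL keeps only
# the rows of b that found no partner in a.
unmatched_b = row_count(
    "SELECT COUNT(*) FROM b LEFT JOIN a ON a.val = b.val WHERE a.val IS NULL"
)
# Full outer join = all left join rows + the unmatched rows of b.
full_rows = left_rows + unmatched_b
cross_rows = row_count("SELECT COUNT(*) FROM a CROSS JOIN b")

print(inner_rows, left_rows, right_rows, full_rows, cross_rows)  # 4 5 6 7 20
```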

Asked in INDIUM

Q. If you wanted to open a restaurant, which data points would you consider?
Key data points to consider when opening a restaurant
Location demographics and foot traffic
Competitor analysis and market saturation
Consumer preferences and trends
Economic factors and disposable income levels

Asked in Chubb

Q. Write a query to separate first name, middle name, and last name from a full name in SQL.
In MySQL, the SUBSTRING_INDEX function can separate the first, middle, and last name from a full name (full_name below is a placeholder column name).
Extract the first name by specifying a space as the delimiter and 1 as the count, e.g., SUBSTRING_INDEX(full_name, ' ', 1)
Extract the last name by specifying a space as the delimiter and -1 as the count, e.g., SUBSTRING_INDEX(full_name, ' ', -1)
Extract the middle name, if present, by combining SUBSTRING_INDEX with REPLACE to strip the first and last names from the full string

Asked in Tessolve

Q. Create a dataframe and perform data manipulation operations on it.
Create a dataframe and manipulate it in Python using the pandas library.
Import the pandas library
Create a dictionary with the data
Convert the dictionary to a dataframe using pd.DataFrame()
Perform operations such as filtering, sorting, and grouping
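The steps above can be sketched as follows. This is a minimal illustration, assuming pandas is installed; the column names and values are made up for the example.

```python
import pandas as pd

# Hypothetical sample data, held in a plain dictionary of columns.
data = {
    "name": ["Asha", "Ben", "Chen", "Dia"],
    "dept": ["Sales", "Sales", "Ops", "Ops"],
    "salary": [50000, 65000, 70000, 55000],
}
df = pd.DataFrame(data)

# Filtering: keep rows where salary exceeds a threshold.
high_paid = df[df["salary"] > 52000]

# Sorting: order rows from highest to lowest salary.
by_salary = df.sort_values("salary", ascending=False)

# Grouping: average salary per department.
avg_by_dept = df.groupby("dept")["salary"].mean()

print(high_paid)
print(by_salary)
print(avg_by_dept)
```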

Asked in Accenture

Q. What is clustering? And how do you check if an ML model is performing well?
Clustering is an unsupervised machine learning technique used to group similar data points together. The quality of a clustering model can be checked using evaluation metrics like silhouette score or inertia.
Clustering is the process of dividing a set of data points into groups based on their similarities.
Common clustering algorithms include K-means, hierarchical clustering, and DBSCAN.
To check if a clustering model is performing well, evaluation metrics like silhouette score, inertia, or Davies-Bouldin index can...

Asked in Wipro

Q. Describe a query you've written that uses aggregate functions, subqueries, and SQL functions.
Aggregate functions, subqueries, and built-in SQL functions are core tools for data analysis, and a single query often combines them.
Aggregate functions like SUM, AVG, and COUNT help summarize data. Example: SELECT AVG(salary) FROM employees;
Subqueries can be used to filter results. Example: SELECT * FROM employees WHERE salary > (SELECT AVG(salary) FROM employees);
SQL functions like COALESCE can handle NULL values. Example: SELECT COALESCE(bonus, 0) FROM employees;
Combining aggregate functions with GROUP BY allows for grou...
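The example queries above can be tried end to end with Python's sqlite3 module. This is a sketch on made-up data; the employees table layout (name, dept, salary, bonus) is an assumption for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary REAL, bonus REAL)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [
        ("Ann", "Sales", 100.0, 10.0),
        ("Bob", "Sales", 200.0, None),  # bonus is NULL
        ("Cy", "Ops", 300.0, 30.0),
    ],
)

# Aggregate function: one summary value for the whole table.
avg_salary = conn.execute("SELECT AVG(salary) FROM employees").fetchone()[0]

# Subquery: compare each row against the overall average.
above_avg = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM employees "
        "WHERE salary > (SELECT AVG(salary) FROM employees)"
    )
]

# COALESCE: treat a missing bonus as 0 before summing.
total_bonus = conn.execute(
    "SELECT SUM(COALESCE(bonus, 0)) FROM employees"
).fetchone()[0]

# Aggregate + GROUP BY: one summary row per department.
dept_avg = dict(conn.execute("SELECT dept, AVG(salary) FROM employees GROUP BY dept"))

print(avg_salary, above_avg, total_bonus, dept_avg)
```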

Asked in Inchcape Shipping Services

Q. How do you approach analyzing raw data?
I approach analyzing raw data by first understanding the data sources, cleaning and preprocessing the data, performing exploratory data analysis, applying statistical methods and machine learning techniques, and finally interpreting and communicating the results.
Understand the data sources and collection methods
Clean and preprocess the data to handle missing values, outliers, and inconsistencies
Perform exploratory data analysis to gain insights and identify patterns
Apply stat...

Asked in ESG Book

Q. What is ESG, and what do you know about data analysis?
ESG stands for Environmental, Social, and Governance. Data analysis involves collecting, organizing, and analyzing data to extract meaningful insights.
ESG refers to a set of criteria used to evaluate a company's performance in terms of sustainability and ethical practices
Data analysis involves cleaning, transforming, and modeling data to identify trends, patterns, and relationships
Examples of data analysis techniques include regression analysis, clustering, and machine learni...

Asked in Elsevier

Q. When can you submit your Substate and Subcity Project?
The Substate and Subcity Project can be submitted once all data analysis and validation are complete, ensuring accuracy and reliability.
Data Collection: Ensure all relevant data from Substate and Subcity is gathered and organized before submission.
Analysis Completion: Submit only after thorough analysis, including trend identification and insights extraction.
Validation: Confirm that all findings are validated against original data sources to maintain integrity.
Stakeholder Rev...

Asked in Dataside

Q. How could you rewrite this DAX measure in different ways while achieving the same result?
Explore various DAX measure rewrites to achieve the same analytical outcome in Power BI.
Use CALCULATE with different filter contexts: e.g., CALCULATE(SUM(Sales[Amount]), Sales[Region] = "North")
Utilize variables for clarity: e.g., VAR TotalSales = SUM(Sales[Amount]) RETURN TotalSales
Implement FILTER function: e.g., SUMX(FILTER(Sales, Sales[Region] = "North"), Sales[Amount])
Combine measures: e.g., Create a base measure and reference it in another measure for different contexts...

Asked in MathCo

Q. How do you write code to convert Roman numerals into Indo-Arabic numbers?
Roman numerals can be converted to Indo-Arabic numbers using a mapping of values and a loop that accumulates the total.
Mapping Values: Create a dictionary to map Roman numerals to their corresponding values, e.g., {'I': 1, 'V': 5, 'X': 10, 'L': 50, 'C': 100, 'D': 500, 'M': 1000}.
Iterate Through Characters: Loop through each character in the Roman numeral string from left to right.
Check for Subtraction: If the current numeral is less than the next numeral, subtract its value; otherwise, add it.
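A minimal Python sketch of this approach follows. The function name roman_to_int is a hypothetical choice, and the code assumes the input is a well-formed uppercase Roman numeral; it does no validation.

```python
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(numeral):
    """Convert a well-formed uppercase Roman numeral to an integer."""
    total = 0
    for i, char in enumerate(numeral):
        value = ROMAN_VALUES[char]
        # A smaller numeral placed before a larger one is subtracted (e.g. IV = 4).
        if i + 1 < len(numeral) and value < ROMAN_VALUES[numeral[i + 1]]:
            total -= value
        else:
            total += value
    return total


print(roman_to_int("IX"))       # 9
print(roman_to_int("LVIII"))    # 58
print(roman_to_int("MCMXCIV"))  # 1994
```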

Asked in MathCo

Q. What are the challenges faced in a supply chain industry with respect to RGM initiatives?
RGM initiatives in supply chain face challenges like data integration, demand forecasting, and stakeholder alignment.
Data Integration: Difficulty in consolidating data from various sources, leading to inconsistent insights. For example, merging sales data from different regions.
Demand Forecasting: Inaccurate predictions can lead to overstock or stockouts. For instance, seasonal demand spikes not being anticipated.
Stakeholder Alignment: Different departments may have conflicti...

Asked in Ganit Inc

Q. What is a lambda function? What data types does it support?
A lambda function is a small anonymous function defined with the lambda keyword in Python.
Lambda functions can take any number of arguments of any data type, but can contain only one expression.
They are often used for short, throwaway functions where defining a full function is unnecessary.
Example: square = lambda x: x ** 2; print(square(5)) outputs 25.
Lambda functions can be used with functions like map(), filter(), and reduce() (in Python 3, reduce() is imported from the functools module).
Example: list(map(lambda x: x + 1, [1, 2, 3])) outputs [2, 3, 4].
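The snippet below gathers these uses in one place as a short sketch; the sample lists are made up for illustration.

```python
from functools import reduce  # reduce() is not a builtin in Python 3

nums = [1, 2, 3, 4]

squares = list(map(lambda x: x ** 2, nums))        # apply to every element
evens = list(filter(lambda x: x % 2 == 0, nums))   # keep matching elements
total = reduce(lambda acc, x: acc + x, nums)       # fold into a single value

# Lambdas accept any object, here strings, and are common as sort keys.
by_length = sorted(["pear", "fig", "banana"], key=lambda s: len(s))

print(squares)    # [1, 4, 9, 16]
print(evens)      # [2, 4]
print(total)      # 10
print(by_length)  # ['fig', 'pear', 'banana']
```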

Asked in Samsung

Q. Write a SQL query to retrieve the department names along with the total number of employees in each department.
The SQL query retrieves department names and the total number of employees in each department.
Use the SELECT statement to retrieve the department names and count of employees.
Join the employee table with the department table using the department ID.
Group the results by department name.
Use the COUNT function to calculate the total number of employees in each department.
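One possible form of the query, run through Python's sqlite3 module on made-up data. The table and column names (departments, employees, dept_id) are assumptions, and the sketch uses a LEFT JOIN from departments so that departments with no employees still appear with a count of 0; an inner join would omit them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, dept_id INTEGER)")
conn.executemany(
    "INSERT INTO departments VALUES (?, ?)", [(1, "Sales"), (2, "Ops"), (3, "HR")]
)
conn.executemany("INSERT INTO employees VALUES (?, ?)", [(1, 1), (2, 1), (3, 2)])

query = """
SELECT d.name, COUNT(e.id) AS employee_count
FROM departments AS d
LEFT JOIN employees AS e ON e.dept_id = d.id
GROUP BY d.name
ORDER BY d.name
"""
# COUNT(e.id) counts only matched employee rows, so HR reports 0.
counts = conn.execute(query).fetchall()
print(counts)  # [('HR', 0), ('Ops', 1), ('Sales', 2)]
```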
Asked in Data Dynamics Inc

Q. Explain the difference between ROW_NUMBER(), RANK(), and DENSE_RANK() and provide a use case for each.
ROW_NUMBER(), RANK(), and DENSE_RANK() are SQL window functions for ranking rows in a dataset based on specified criteria.
ROW_NUMBER(): Assigns a unique sequential integer to rows within a partition, starting at 1. Example: Ranking employees by hire date.
RANK(): Assigns a rank to each row within a partition, with gaps in ranking for ties. Example: Ranking students by exam scores.
DENSE_RANK(): Similar to RANK(), but without gaps in ranking for ties. Example: Ranking products b...
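The difference is easiest to see on data with a tie. The sketch below runs all three functions side by side using Python's sqlite3 module; it assumes a SQLite build with window function support (3.25 or later), and the scores table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)", [("A", 90), ("B", 90), ("C", 80)]
)

query = """
SELECT name,
       ROW_NUMBER() OVER (ORDER BY score DESC, name) AS row_num,
       RANK()       OVER (ORDER BY score DESC)       AS rnk,
       DENSE_RANK() OVER (ORDER BY score DESC)       AS dense_rnk
FROM scores
ORDER BY name
"""
# name is added as a tie-breaker for ROW_NUMBER so its output is deterministic.
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
# ('A', 1, 1, 1)
# ('B', 2, 1, 1)
# ('C', 3, 3, 2)
```

A and B tie on score: ROW_NUMBER still gives them distinct numbers, RANK gives both 1 and then skips to 3, and DENSE_RANK gives both 1 and continues with 2.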

Asked in Anblicks

Q. Can you explain your experience with data cleaning and preprocessing?
I have extensive experience in data cleaning and preprocessing, ensuring data quality for analysis.
Identified and handled missing values using techniques like imputation or removal, e.g., replacing missing age values with the median.
Standardized data formats, such as converting date formats to a consistent 'YYYY-MM-DD' for easier analysis.
Removed duplicates to ensure data integrity, e.g., eliminating repeated patient records in a healthcare dataset.
Normalized data to bring di...

Asked in Ernst & Young

Q. What is the difference between a direct connection and a live connection?
As described here, a direct connection works on a static copy of the data source, while a live connection is dynamic and reflects changes in the source in real time.
Direct connection requires data to be imported into the analysis tool, while live connection accesses data directly from the source.
Live connection allows for real-time updates and analysis, while direct connection requires manual refreshing.
Direct connection is useful for small datasets or when data is not frequently updated, while live connectio...

Asked in Sub-K IMPACT Solutions

Q. How do you find null values in data using SQL?
To find null values in data using SQL, use the IS NULL operator.
Use the IS NULL operator to check for null values in a specific column.
Combine the IS NULL operator with the WHERE clause to return only the rows where the column is null.
Use COUNT(*) to count those rows. COUNT(column_name) ignores nulls, so it would return 0 here.
Example: SELECT * FROM table_name WHERE column_name IS NULL;
Example: SELECT COUNT(*) FROM table_name WHERE column_name IS NULL;
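The difference between COUNT(*) and COUNT(column_name) on null rows is a common pitfall. The sketch below shows it using Python's sqlite3 module; the customers table and email column are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, email TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "a@x.com"), (2, None), (3, "c@x.com"), (4, None)],
)

# Rows where the column is null.
null_ids = [
    row[0]
    for row in conn.execute("SELECT id FROM customers WHERE email IS NULL ORDER BY id")
]

# COUNT(*) counts rows, so it reports the number of null rows.
null_count = conn.execute(
    "SELECT COUNT(*) FROM customers WHERE email IS NULL"
).fetchone()[0]

# COUNT(email) skips nulls, so on the same filter it is always 0.
skipped_count = conn.execute(
    "SELECT COUNT(email) FROM customers WHERE email IS NULL"
).fetchone()[0]

# Alternative without a WHERE clause: total rows minus non-null values.
diff_count = conn.execute(
    "SELECT COUNT(*) - COUNT(email) FROM customers"
).fetchone()[0]

print(null_ids, null_count, skipped_count, diff_count)  # [2, 4] 2 0 2
```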

Asked in BeeHyv

Q. Write DAX to calculate some values.
Calculating values using DAX in Power BI
Use the SUM function to total a column, or SUMX to evaluate an expression row by row over a table and sum the results
Use CALCULATE function to apply filters and conditions to calculations
Use DAX measures to create reusable calculations
Example: Calculate total sales for a specific product category

Asked in Ernst & Young

Q. What is incremental refresh in Power BI?
Incremental refresh is a feature in Power BI that allows you to refresh only a portion of your data instead of the entire dataset.
Incremental refresh is useful for large datasets that take a long time to refresh.
It allows you to refresh only the new or updated data since the last refresh.
You can set up incremental refresh by defining RangeStart and RangeEnd date/time parameters and filtering a date column on them, which defines the refresh window.
For example, you can set up incremental refresh to only refresh data from the ...

Asked in Infosys

Q. Power BI Pro allows up to 8 scheduled refreshes per dataset per day. Power BI Premium permits up to 48 scheduled refreshes per dataset per day.
Power BI Pro allows 8 scheduled refreshes daily, while Power BI Premium supports up to 48, enhancing data update capabilities.
Scheduled Refresh: Power BI Pro users can schedule up to 8 refreshes per dataset daily, ensuring data is updated regularly.
Increased Capacity: Power BI Premium users benefit from 48 refreshes, suited to organizations that need data updated as often as every 30 minutes.
Use Case Example: A retail company using Power BI Premium can refresh sales data every 30 minutes, provid...

Asked in PhonePe

Q. Given three tables and a specific condition, how would you find the 2nd, 3rd, and 4th maximum values?
Identify the 2nd, 3rd, and 4th highest values across the three given tables based on the specified condition.
Understand the structure of the tables and the relationships between them.
A query like 'SELECT MAX(column_name) FROM table_name' returns only the single highest value, so it is not enough on its own.
Use the 'ORDER BY' clause to sort values in descending order, then 'LIMIT' with 'OFFSET' (or a DENSE_RANK() window function) to skip the top value and keep the next three.
Example: To find the 2nd to 4th highest distinct salaries in an employee table, use 'SELECT DISTINCT salary FROM employees ORDER BY salary DESC LIMIT 3 OFFSET 1' (MySQL, PostgreSQL, and SQLite syntax).
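Both ways of picking the 2nd to 4th highest values can be sketched with Python's sqlite3 module. The example assumes a single hypothetical employees table (the three-table setup from the question is not specified) and a SQLite build with window function support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?)",
    [(100,), (90,), (90,), (80,), (70,), (60,)],
)

# Option 1: sort distinct values, skip the highest, keep the next three.
offset_query = """
SELECT DISTINCT salary FROM employees
ORDER BY salary DESC
LIMIT 3 OFFSET 1
"""
via_offset = [row[0] for row in conn.execute(offset_query)]

# Option 2: rank without gaps, then keep ranks 2 through 4.
rank_query = """
SELECT DISTINCT salary
FROM (
    SELECT salary, DENSE_RANK() OVER (ORDER BY salary DESC) AS rnk
    FROM employees
)
WHERE rnk BETWEEN 2 AND 4
ORDER BY salary DESC
"""
via_rank = [row[0] for row in conn.execute(rank_query)]

print(via_offset)  # [90, 80, 70]
print(via_rank)    # [90, 80, 70]
```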

Asked in INDIUM

Q. Write SQL code to get the second order for each customer for each day.
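SQL code to retrieve the 2nd order for each customer for each day
Use a subquery or CTE to number each customer's orders within each day by order time, e.g., ROW_NUMBER() OVER (PARTITION BY customer_id, order_date ORDER BY order_time), where the column names are placeholders
Filter the results to keep only the rows numbered 2, which are the 2nd order for each customer on each day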
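A possible version of this query, run through Python's sqlite3 module on made-up data. The orders table and its columns (order_id, customer_id, order_ts) are assumptions, and the sketch needs a SQLite build with window function support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer_id TEXT, order_ts TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "c1", "2024-01-01 09:00:00"),
        (2, "c1", "2024-01-01 12:00:00"),
        (3, "c1", "2024-01-01 15:00:00"),
        (4, "c2", "2024-01-01 10:00:00"),  # c2 has only one order that day
        (5, "c1", "2024-01-02 08:00:00"),
        (6, "c1", "2024-01-02 18:00:00"),
    ],
)

query = """
WITH numbered AS (
    SELECT order_id,
           customer_id,
           DATE(order_ts) AS order_date,
           ROW_NUMBER() OVER (
               PARTITION BY customer_id, DATE(order_ts)
               ORDER BY order_ts
           ) AS rn
    FROM orders
)
SELECT customer_id, order_date, order_id
FROM numbered
WHERE rn = 2
ORDER BY customer_id, order_date
"""
second_orders = conn.execute(query).fetchall()
print(second_orders)
# [('c1', '2024-01-01', 2), ('c1', '2024-01-02', 6)]
```

Customers with fewer than two orders on a given day simply do not appear in the result.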

Asked in Dailyhunt

Q. Write an SQL query using RANK(), CTE, SUM(), and a combination of aggregate functions.
Using RANK(), CTEs, and SUM() together to analyze data effectively.
RANK() assigns a rank to each row within a partition based on specified criteria; tied rows share the same rank, and the following rank is skipped.
A CTE (Common Table Expression) is a named subquery defined with WITH, not a function; it is commonly used to simplify complex queries.
SUM() can be combined with GROUP BY to aggregate data, providing insights into totals.
Example: Using RANK() to find top salespersons in each region based on total sales.
CTE can be used to first calculate total sales, then apply RANK() to ...
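The top-salesperson example can be sketched as a two-step CTE, run here through Python's sqlite3 module. The sales table and its columns are hypothetical, and a SQLite build with window function support is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, rep TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("North", "Ann", 100),
        ("North", "Ann", 50),
        ("North", "Bob", 120),
        ("South", "Cy", 80),
        ("South", "Dee", 80),
    ],
)

query = """
WITH totals AS (
    -- Step 1: total sales per rep within each region.
    SELECT region, rep, SUM(amount) AS total_sales
    FROM sales
    GROUP BY region, rep
),
ranked AS (
    -- Step 2: rank reps inside each region by their total.
    SELECT region, rep, total_sales,
           RANK() OVER (PARTITION BY region ORDER BY total_sales DESC) AS rnk
    FROM totals
)
SELECT region, rep, total_sales
FROM ranked
WHERE rnk = 1
ORDER BY region, rep
"""
top_reps = conn.execute(query).fetchall()
print(top_reps)
# [('North', 'Ann', 150), ('South', 'Cy', 80), ('South', 'Dee', 80)]
```

Because RANK() gives tied rows the same rank, both South reps are returned; ROW_NUMBER() would pick just one of them.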

Asked in Razorpay

Q. How do you calculate retained, revived, and new customers on a monthly basis? What are joins in SQL? What is DENSE_RANK?
Retained, revived, and new customers per month can be calculated with SQL joins; DENSE_RANK is a ranking window function.
To calculate retained, revived, and new customers for each month, you would typically write SQL queries that join tables containing customer data.
Joins in SQL are used to combine rows from two or more tables based on a related column between them.
Dense rank is a function in SQL that assigns a rank to each row within a partition of a result set, with no gaps in the ra...

Asked in iCIMS

Q. What are the ways to optimize a Tableau dashboard?
Optimizing Tableau dashboard involves improving performance, usability, and visual appeal.
Limit the number of worksheets on a dashboard to improve performance
Use filters and parameters effectively to allow for interactivity
Optimize data source connections for faster loading times
Utilize dashboard actions to create dynamic interactions
Design with user experience in mind to ensure clarity and ease of use

Asked in Galaxy Co-operative Credit Society

Q. What are the key skills usually required for a data analyst?
Key skills for a data analyst include analytical thinking, technical proficiency, and effective communication.
Analytical Skills: Ability to interpret complex data sets and identify trends, e.g., using statistical methods to analyze sales data.
Technical Proficiency: Familiarity with tools like SQL, Python, or R for data manipulation and analysis.
Data Visualization: Skills in tools like Tableau or Power BI to create clear and insightful visual representations of data.
Attention ...

Asked in TCS

Q. How many days do you need to learn the TCS process?
Learning TCS processes typically takes 2-4 weeks, depending on complexity and prior experience.
Familiarization with TCS tools and software can take about a week.
Understanding specific processes may require 1-2 weeks of hands-on training.
Collaboration with team members can accelerate learning through shared knowledge.
Regular feedback sessions can help clarify doubts and improve understanding.

Asked in Decathlon

Q. What is the difference between a tree map and a heat map in Tableau?
Tree map displays hierarchical data using nested rectangles, while heat map shows data values using color intensity.
Tree map visualizes hierarchical data with nested rectangles, where the size of each rectangle represents a measure.
Heat map displays data values using color intensity, with darker colors indicating higher values.
Tree map is useful for showing proportions within a whole, while heat map is effective for identifying patterns and trends in data.
Example: A tree map ...

Asked in Ganit Inc

Q. How do you resolve issues occurring between teams?
Effective communication and collaboration are key to resolving inter-team issues.
Identify the root cause: Conduct meetings to understand the specific issues causing friction between teams.
Encourage open communication: Foster an environment where team members feel comfortable sharing their concerns and suggestions.
Establish clear roles and responsibilities: Ensure that each team understands their responsibilities to avoid overlap and confusion.
Facilitate team-building activiti...