Junior Data Analyst

100+ Junior Data Analyst Interview Questions and Answers

Updated 18 Jan 2025

Q1. What is the main difference between data mining and data analysis?

Ans.

Data mining involves discovering patterns and relationships in large datasets, while data analysis focuses on interpreting and drawing insights from data.

  • Data mining is the process of extracting useful information from large datasets.

  • Data analysis involves examining and interpreting data to draw conclusions and make informed decisions.

  • Data mining uses techniques like clustering, classification, and association to discover patterns and relationships.

  • Data analysis involves tech...read more

Q2. How do you use `PARTITION BY` and `ORDER BY` in window functions?

Ans.

PARTITION BY is used to divide the result set into partitions, while ORDER BY is used to sort the rows within each partition in window functions.

  • PARTITION BY is used to group rows with the same values in specified columns

  • ORDER BY is used to sort the rows within each partition

  • Example: SELECT column1, column2, SUM(column3) OVER (PARTITION BY column1 ORDER BY column2) AS total FROM table_name
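
The example above can be tried end to end with Python's built-in sqlite3 module. This is a minimal sketch, assuming a SQLite build with window-function support (3.25 or newer); the `sales` table and its columns are hypothetical and exist only for illustration.

```python
import sqlite3

# Hypothetical sales table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, month INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("East", 1, 100), ("East", 2, 150), ("West", 1, 200), ("West", 2, 50)],
)

# PARTITION BY restarts the sum for each region;
# ORDER BY turns it into a running total in month order within the region.
running_totals = conn.execute(
    """
    SELECT region, month, amount,
           SUM(amount) OVER (PARTITION BY region ORDER BY month) AS running_total
    FROM sales
    ORDER BY region, month
    """
).fetchall()

for row in running_totals:
    print(row)
```

Dropping the ORDER BY inside OVER would give every row the full total for its region instead of a running total.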

Q3. What is SQL, and why is it important in data analytics?

Ans.

SQL is a programming language used for managing and analyzing data in relational databases.

  • SQL stands for Structured Query Language

  • It is used to retrieve, manipulate, and analyze data stored in relational databases

  • SQL is important in data analytics as it allows analysts to query databases to extract relevant information for analysis

  • It helps in filtering, sorting, and aggregating data to generate insights

  • Examples of SQL commands include SELECT, INSERT, UPDATE, and DELETE

Q4. Difference between Adverse Event and Adverse reaction with example.

Ans.

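An adverse event is any untoward medical occurrence in a patient who has received a medicinal product, whether or not the product caused it, while an adverse reaction is an adverse event for which a causal link to the product is at least a reasonable possibility.

  • An adverse event needs only a temporal association with the treatment, while an adverse reaction additionally requires a suspected causal relationship with the medication.

  • Both adverse events and adverse reactions can be expected (already listed in the product information) or unexpected.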

  • Example of adverse event: a patient develops a fever after surgery. Example of adverse reaction: a patient develops a rash afte...read more

Q5. What is the difference between `WHERE` and `HAVING` clauses?

Ans.

WHERE clause is used to filter rows before grouping, while HAVING clause is used to filter groups after grouping.

  • WHERE clause is used with SELECT, UPDATE, DELETE statements to filter rows based on a condition

  • HAVING clause is used with SELECT statement to filter groups based on a condition

  • WHERE clause is applied before the data is grouped, while HAVING clause is applied after the data is grouped

  • Example: SELECT * FROM table_name WHERE column_name = 'value';

  • Example: SELECT colum...read more
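
To see both filters in one query, here is a minimal sketch using Python's built-in sqlite3 module. The `orders` table, its columns, and the threshold of 100 are assumptions made up for this illustration.

```python
import sqlite3

# Hypothetical orders table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, status TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("alice", "paid", 50),
        ("alice", "paid", 70),
        ("alice", "cancelled", 500),
        ("bob", "paid", 30),
    ],
)

big_spenders = conn.execute(
    """
    SELECT customer, SUM(amount) AS total
    FROM orders
    WHERE status = 'paid'        -- row filter, applied before grouping
    GROUP BY customer
    HAVING SUM(amount) > 100     -- group filter, applied after aggregation
    """
).fetchall()

print(big_spenders)
```

The cancelled order is removed by WHERE before any sum is computed, and bob's group is removed by HAVING because its total does not exceed 100.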

Q6. Explain the main steps involved in data analysis.

Ans.

Data analysis involves several steps including data collection, data cleaning, data exploration, data modeling, and data visualization.

  • Data collection: Gathering relevant data from various sources.

  • Data cleaning: Removing any errors, inconsistencies, or missing values from the data.

  • Data exploration: Analyzing the data to understand its characteristics and identify patterns or trends.

  • Data modeling: Applying statistical or machine learning techniques to build models and make pre...read more

Q7. Explain the difference between `INNER JOIN`, `LEFT JOIN`, `RIGHT JOIN`, and `FULL OUTER JOIN`.

Ans.

All four join types combine rows from two tables based on a related column; they differ in how they treat rows that have no match in the other table.

  • INNER JOIN: Returns only the rows that have a match in both tables.

  • LEFT JOIN: Returns all rows from the left table and the matched rows from the right table, with NULLs where there is no match.

  • RIGHT JOIN: Returns all rows from the right table and the matched rows from the left table, with NULLs where there is no match.

  • FULL OUTER JOIN: Returns all rows from both tables, matched where possible and filled with NULLs where there is no match on the other side.
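
The four results can be compared side by side with Python's built-in sqlite3 module. This is a sketch under stated assumptions: the `customers` and `orders` tables are hypothetical, and because older SQLite versions do not support RIGHT JOIN or FULL OUTER JOIN, those two are emulated here (a swapped LEFT JOIN, and a UNION of both directions) rather than written directly.

```python
import sqlite3

# Hypothetical tables: Ben has no order, and order 11 points at a
# customer id (3) that does not exist in customers.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER, item TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ann"), (2, "Ben")])
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)", [(10, 1, "pen"), (11, 3, "ink")]
)

# INNER JOIN: only the matching pair survives.
inner_rows = conn.execute(
    "SELECT c.name, o.item FROM customers c "
    "INNER JOIN orders o ON o.customer_id = c.id"
).fetchall()

# LEFT JOIN: every customer, with None (NULL) where there is no order.
left_rows = conn.execute(
    "SELECT c.name, o.item FROM customers c "
    "LEFT JOIN orders o ON o.customer_id = c.id ORDER BY c.id"
).fetchall()

# RIGHT JOIN, emulated as a LEFT JOIN with the tables swapped.
right_rows = conn.execute(
    "SELECT c.name, o.item FROM orders o "
    "LEFT JOIN customers c ON o.customer_id = c.id ORDER BY o.id"
).fetchall()

# FULL OUTER JOIN, emulated as the UNION of both directions.
full_rows = conn.execute(
    "SELECT c.name, o.item FROM customers c "
    "LEFT JOIN orders o ON o.customer_id = c.id "
    "UNION "
    "SELECT c.name, o.item FROM orders o "
    "LEFT JOIN customers c ON o.customer_id = c.id"
).fetchall()

for label, rows in [("inner", inner_rows), ("left", left_rows),
                    ("right", right_rows), ("full", full_rows)]:
    print(label, rows)
```

One caveat of the UNION emulation is that it also collapses genuinely duplicated output rows, so it is an approximation of FULL OUTER JOIN rather than an exact replacement.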

Q8. What is indexing, what are its types, and what is it used for?

Ans.

Indexing is a technique used to optimize data retrieval in databases by creating indexes on columns.

  • Types of indexing include clustered and non-clustered indexes

  • Clustered indexes physically reorder the data in the table based on the index key

  • Non-clustered indexes create a separate structure to store the index key and a pointer to the actual data

  • Indexes are used to speed up data retrieval operations such as SELECT queries
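
The effect of an index on a lookup can be observed with Python's built-in sqlite3 module. This is only a sketch: the `orders` table and the index name `idx_orders_customer` are hypothetical, and SQLite does not use the clustered/non-clustered terminology, so the index created here is simply a separate structure in the spirit of a non-clustered index.

```python
import sqlite3

# Hypothetical orders table with 1000 rows, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, i) for i in range(1000)],
)

query = "SELECT * FROM orders WHERE customer_id = 42"


def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable description.
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


plan_before = plan(query)   # without an index the table is scanned
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(query)    # with the index the planner can search it

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan differs between SQLite versions, which is why the sketch prints it instead of assuming a fixed string.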

Q9. What kinds of cases have you handled? Explain in brief.

Ans.

Handled cases include data cleaning, analysis, visualization and reporting for various industries.

  • Data cleaning and analysis for a retail company to identify sales trends

  • Visualization of customer behavior for a telecommunications company

  • Reporting on website traffic for an e-commerce business

  • Data analysis for a healthcare provider to improve patient outcomes

  • Cleaning and analyzing survey data for a non-profit organization

Q10. Explain the difference between `TRUNCATE`, `DELETE`, and `DROP` commands.

Ans.

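TRUNCATE removes all rows from a table, DELETE removes the rows that match an optional WHERE condition, and DROP deletes the entire table structure.

  • TRUNCATE is usually faster than DELETE because it does not log individual row deletions.

  • DELETE is slower on large tables because it logs each row deletion, but it can target specific rows.

  • DROP removes the entire table structure along with all data.

  • DELETE can be rolled back within a transaction. Whether TRUNCATE and DROP can be rolled back depends on the database: PostgreSQL and SQL Server allow it inside a transaction, while Oracle and MySQL commit them implicitly.

  • Example: TRUNCATE TABLE table_name;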

  • Example: DELETE FROM table_name WHERE condition;

  • Example: DROP TABLE tab...read more

Q11. Explain window functions like `ROW_NUMBER()`, `RANK()`, and `DENSE_RANK()`.

Ans.

Window functions like ROW_NUMBER(), RANK(), and DENSE_RANK() number the rows within each partition according to a specified ordering; they differ in how they handle ties.

  • ROW_NUMBER() assigns a unique sequential integer starting from 1 to each row within a partition, even when rows tie

  • RANK() gives tied rows the same rank and leaves gaps in the ranking after ties (1, 1, 3)

  • DENSE_RANK() gives tied rows the same rank and leaves no gaps in the ranking (1, 1, 2)
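
The three functions are easiest to tell apart on data that contains a tie. The following is a minimal sketch with Python's built-in sqlite3 module, assuming SQLite 3.25 or newer for window functions; the `scores` table is hypothetical.

```python
import sqlite3

# Hypothetical scores table: 'a' and 'b' tie on 90.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("a", 90), ("b", 90), ("c", 80), ("d", 70)],
)

ranked = conn.execute(
    """
    SELECT name, score,
           ROW_NUMBER() OVER (ORDER BY score DESC, name) AS row_num,
           RANK()       OVER (ORDER BY score DESC)       AS rnk,
           DENSE_RANK() OVER (ORDER BY score DESC)       AS dense_rnk
    FROM scores
    ORDER BY score DESC, name
    """
).fetchall()

for row in ranked:
    print(row)
```

In the output, ROW_NUMBER() keeps counting 1, 2, 3, 4, RANK() skips from 1 to 3 after the tie, and DENSE_RANK() continues with 2.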

Q12. What is a foreign key in the context of relational databases?

Ans.

A foreign key in relational databases is a field that links two tables together, establishing a relationship between them.

  • A foreign key in one table points to the primary key in another table

  • It ensures referential integrity by enforcing relationships between tables

  • Foreign keys help maintain data consistency and prevent orphaned records

  • Example: In a database with tables for 'orders' and 'customers', the 'customer_id' in the 'orders' table would be a foreign key linking to the ...read more
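
How a foreign key prevents orphaned records can be sketched with Python's built-in sqlite3 module. The table and column names follow the 'orders' and 'customers' example above, but the exact schema is an assumption; note also that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

# Hypothetical schema: each order must reference an existing customer.
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    """
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id)
    )
    """
)

conn.execute("INSERT INTO customers VALUES (1, 'Ann')")
conn.execute("INSERT INTO orders VALUES (100, 1)")  # valid: customer 1 exists

try:
    # Customer 999 does not exist, so this row would be an orphan.
    conn.execute("INSERT INTO orders VALUES (101, 999)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print("orphan rejected:", orphan_rejected)
```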

Q13. What is the difference between Data Definition Language (DDL) and Data Manipulation Language (DML)?

Ans.

DDL is used to define the structure of database objects, while DML is used to manipulate data within those objects.

  • DDL is used to create, modify, and delete database objects such as tables, indexes, and views.

  • DML is used to insert, update, retrieve, and delete data within those database objects.

  • DDL statements include CREATE, ALTER, DROP, TRUNCATE, etc.

  • DML statements include SELECT, INSERT, UPDATE, DELETE, etc.

  • DDL changes the structure of the database, while DML changes the co...read more

Q14. What are malaria, drug allergy, hypertension, diabetes, obesity, GERD, gout, hyperlipidemia, and agar agar? Name some pigments.

Ans.

Malaria is a mosquito-borne infectious disease caused by parasites. Drug allergy is an adverse reaction to medication. Hypertension is high blood pressure. Diabetes is a metabolic disorder affecting blood sugar levels. Obesity is excessive body weight. GERD is gastroesophageal reflux disease. Gout is a form of arthritis. Hyperlipidemia is high levels of lipids in the blood. Agar agar is a gelatinous substance derived from seaweed. Pigment names refer to various coloring agent...read more

Q15. What is the difference between C and C++? What is the use of website testing?

Ans.

C is a procedural programming language, while C++ is a multi-paradigm language that adds object-oriented programming to C.

  • C++ is an extension of C with added features like classes, inheritance, and polymorphism.

  • C++ is used for developing software applications, games, and operating systems.

  • Website testing is the process of checking the functionality, usability, and performance of a website.

  • It involves testing the website's links, forms, navigation, and compatibility with different devices and browsers.

  • Webs...read more

Q16. Merge two sorted linked lists from scratch: create a linked list class, then a method that generates a linked list.

Ans.

Merge two sorted linked lists by creating a linked list class and method to generate linked lists from scratch.

  • Create a Node class with data and next pointer

  • Create a LinkedList class with methods to insert nodes and merge two lists

  • Iterate through both lists and compare nodes to merge them in sorted order
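
The steps above can be sketched in Python as follows. The names `Node`, `LinkedList`, `from_iterable`, `to_list`, and `merge_sorted` are choices made for this illustration, not a prescribed interface, and this version relinks the existing nodes instead of copying them.

```python
class Node:
    """A single list node holding a value and a pointer to the next node."""

    def __init__(self, data, next=None):
        self.data = data
        self.next = next


class LinkedList:
    def __init__(self):
        self.head = None

    @classmethod
    def from_iterable(cls, values):
        """Generate a linked list from any iterable, preserving order."""
        lst = cls()
        tail = None
        for value in values:
            node = Node(value)
            if tail is None:
                lst.head = node
            else:
                tail.next = node
            tail = node
        return lst

    def to_list(self):
        """Collect the node values into a Python list (handy for checking)."""
        out = []
        node = self.head
        while node is not None:
            out.append(node.data)
            node = node.next
        return out


def merge_sorted(a, b):
    """Merge two sorted linked lists into one sorted list.

    Reuses the nodes of a and b, so the inputs should not be used afterwards.
    """
    dummy = Node(None)          # placeholder head that simplifies appending
    tail = dummy
    x, y = a.head, b.head
    while x is not None and y is not None:
        if x.data <= y.data:
            tail.next, x = x, x.next
        else:
            tail.next, y = y, y.next
        tail = tail.next
    tail.next = x if x is not None else y   # attach whatever is left over
    merged = LinkedList()
    merged.head = dummy.next
    return merged


merged = merge_sorted(
    LinkedList.from_iterable([1, 3, 5]), LinkedList.from_iterable([2, 4, 6])
)
print(merged.to_list())
```

The merge visits each node once, so it runs in time proportional to the combined length of the two lists.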

Q17. What is the difference between Power BI and Tableau? What is a calculated field in Tableau? What is the difference between data blending and data joining?

Ans.

PowerBI and Tableau are both popular data visualization tools, but they have some key differences in terms of features and functionality.

  • PowerBI is a Microsoft product, while Tableau is developed by Tableau Software.

  • PowerBI is more user-friendly and integrates well with other Microsoft products, while Tableau offers more advanced visualization capabilities.

  • Tableau has a feature called Calculated Field which allows users to create new fields based on existing data, while Power...read more

Q18. How do you find the null values in a given Excel sheet?

Ans.

Null values in an Excel sheet can be found by using filters or functions like ISBLANK or COUNTBLANK.

  • Use filters to easily identify blank cells in the Excel sheet

  • Use functions like ISBLANK or COUNTBLANK to check for null values in specific cells

  • Look for cells with no data or missing values, which indicate null values

Q19. A practical application of VLOOKUP on a given data

Ans.

VLOOKUP can be used to find specific information in a table by matching a key value.

  • Use VLOOKUP to find a student's grade based on their student ID in a table of student data

  • VLOOKUP can be used to retrieve a customer's contact information based on their customer ID

  • It can also be used to look up product prices based on product codes in a pricing table

Q20. What SQL commands do you know?

Ans.

I am familiar with basic SQL commands such as SELECT, INSERT, UPDATE, DELETE, JOIN, and GROUP BY.

  • SELECT: Retrieve data from a database table

  • INSERT: Add new records to a table

  • UPDATE: Modify existing records in a table

  • DELETE: Remove records from a table

  • JOIN: Combine rows from two or more tables based on a related column

  • GROUP BY: Group rows that have the same values into summary rows

Q21. Results of Left Join, Right Join and Cross Join

Ans.

Left Join and Right Join keep every record from one side and add the matching records from the other, while Cross Join pairs every record of one table with every record of the other.

  • Left Join: Includes all records from the left table and matching records from the right table, with NULLs where there is no match.

  • Right Join: Includes all records from the right table and matching records from the left table, with NULLs where there is no match.

  • Cross Join: Returns the Cartesian product, so a table with m rows crossed with a table with n rows gives m × n rows.
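
The size of a cross join result can be checked quickly with Python's built-in sqlite3 module. This is a small sketch; the `sizes` and `colors` tables are hypothetical.

```python
import sqlite3

# Hypothetical lookup tables: 2 sizes and 3 colors.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sizes (size TEXT)")
conn.execute("CREATE TABLE colors (color TEXT)")
conn.executemany("INSERT INTO sizes VALUES (?)", [("S",), ("M",)])
conn.executemany("INSERT INTO colors VALUES (?)", [("red",), ("blue",), ("green",)])

# CROSS JOIN has no ON condition: every size is paired with every color.
combos = conn.execute("SELECT size, color FROM sizes CROSS JOIN colors").fetchall()

print(len(combos), "combinations")
```

Two rows crossed with three rows give six combinations, which is why cross joins on large tables grow very quickly.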

Q22. Row-level security and the 4 roles in Power BI

Ans.

Row-level security in Power BI allows restricting access to specific rows of data based on user roles.

  • It filters the rows a user can see in a dataset according to the role that user is assigned.

  • Roles in Power BI define the level of access users have to data and reports.

  • The four workspace roles in Power BI are Admin, Member, Contributor, and Viewer.

  • By setting up row-level security, users can only see the data that is relevant to their role.

  • Row-level security can be imp...read more

Q23. What is the difference between a credit note and a debit note?

Ans.

Credit note is issued to reduce the amount payable by a customer, while debit note is issued to increase the amount payable by a customer.

  • Credit note is issued when a customer has been overcharged or returned goods, resulting in a reduction of the amount owed.

  • Debit note is issued when a customer has been undercharged or additional goods/services have been provided, resulting in an increase of the amount owed.

  • Credit note decreases the accounts receivable balance, while debit n...read more

Q24. What is data validation?

Ans.

Data validation is the process of ensuring that data is accurate, complete, and consistent.

  • Data validation involves checking data for errors, inconsistencies, and anomalies.

  • It helps to ensure data quality and reliability.

  • Validation can be done through various techniques such as range checks, format checks, and cross-field validation.

  • Examples of data validation include verifying that a phone number has the correct number of digits or that a date is in the correct format.

  • Data v...read more
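
The range and format checks mentioned above can be sketched in plain Python. The field names (`phone`, `signup_date`, `age`) and the specific rules (10-digit phone numbers, dates as YYYY-MM-DD, ages from 0 to 120) are assumptions chosen for this illustration only.

```python
import re
from datetime import datetime


def validate_record(record):
    """Return a list of validation error messages (empty if the record is valid)."""
    errors = []

    # Format check: assume phone numbers must be exactly 10 digits.
    phone = str(record.get("phone") or "")
    if not re.fullmatch(r"\d{10}", phone):
        errors.append("phone must be exactly 10 digits")

    # Format check: the date must parse as year-month-day.
    try:
        datetime.strptime(str(record.get("signup_date") or ""), "%Y-%m-%d")
    except ValueError:
        errors.append("signup_date must be a valid YYYY-MM-DD date")

    # Range check: assume ages outside 0..120 are data-entry errors.
    age = record.get("age")
    if not isinstance(age, int) or not 0 <= age <= 120:
        errors.append("age must be an integer between 0 and 120")

    return errors


good = {"phone": "9876543210", "signup_date": "2024-01-15", "age": 30}
bad = {"phone": "98765", "signup_date": "15/01/2024", "age": 200}

print(validate_record(good))
print(validate_record(bad))
```

Returning a list of messages instead of raising on the first problem lets a single pass report every issue in a record.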

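Q25. What are the seriousness criteria of cases? Explain congenital anomaly.

Ans.

Congenital anomaly refers to a physical or structural abnormality present at birth.

  • A case is considered serious if it results in death, is life-threatening, requires or prolongs hospitalization, results in persistent or significant disability or incapacity, or is a congenital anomaly or birth defect.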

  • Some congenital anomalies may be minor and have little impact on health, while others can be life-threatening.

  • Examples of congenital anomalies include heart defects, cleft lip and palate, and neural tube defects.

  • Congenital anomalies can be caused by genetic factors, environmental factors, or a combination of both.

  • Early...read more

Q26. What are bookmarks, and what are they used for?

Ans.

Bookmarks are digital markers used to quickly navigate to specific sections or pages within a document or website.

  • Bookmarks allow users to easily access important or frequently visited sections of a document or website.

  • They are commonly used in web browsers to save specific web pages for quick access.

  • Bookmarks can also be used in PDF documents to mark important pages or sections for easy reference.

Q27. Difference between list and tuple in Python

Ans.

A list is a mutable, ordered collection of items, while a tuple is an immutable, ordered collection of items in Python.

  • List is defined using square brackets [] while tuple is defined using parentheses ().

  • Elements in a list can be changed or modified while elements in a tuple cannot be changed.

  • Lists are typically used for collections of similar items while tuples are used for fixed collections of items.

  • Example: list_example = [1, 2, 3] and tuple_example = (4, 5, 6)
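
The difference in mutability can be confirmed with a short Python sketch. The variable names follow the example above; the `origins` dictionary is a hypothetical addition to show one practical consequence of immutability.

```python
list_example = [1, 2, 3]
tuple_example = (4, 5, 6)

# Lists can be changed in place.
list_example[0] = 10
list_example.append(4)

# Tuples reject item assignment with a TypeError.
try:
    tuple_example[0] = 40
    tuple_modified = True
except TypeError:
    tuple_modified = False

# Because tuples are immutable (and hashable when their items are),
# they can be used as dictionary keys; lists cannot.
origins = {(0, 0): "origin"}

print(list_example, tuple_example, tuple_modified)
```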

Q28. Tuple is immutable, while list is mutable.

Ans.

Tuple is immutable, list is mutable in Python.

  • Tuple elements cannot be changed once assigned, while list elements can be modified.

  • Tuple uses parentheses () and list uses square brackets [] for declaration.

  • Example: tuple_example = (1, 2, 3) vs list_example = [1, 2, 3]

Q29. Define Solicited report and Spontaneous report.

Ans.

A solicited report comes from an organized data collection system, while a spontaneous report is an unsolicited, voluntary communication from a healthcare professional or consumer.

  • Solicited reports are derived from organized data collection such as clinical trials, registries, patient support programs, and surveys.

  • Spontaneous reports are voluntary and not requested.

  • Solicited reports are usually collected for a specific purpose or study.

  • Spontaneous reports usually describe unexpected events or suspected adverse reactions noticed in routine practice.

  • Examples of solicited reports include reports from clinical trials and patient registries.

  • Examples of spontaneous reports ...read more

Q30. Difference between UNION and UNION ALL

Ans.

UNION combines result sets and removes duplicate rows, while UNION ALL combines them and keeps duplicates.

  • UNION combines result sets and removes duplicates

  • UNION ALL combines result sets without removing duplicates

  • UNION is usually slower than UNION ALL because it has to detect and remove duplicates

  • Both require the combined queries to return the same number of columns with compatible data types
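
A quick way to see the difference is to run both operators on two small tables that share a value. This is a minimal sketch with Python's built-in sqlite3 module; the tables `a` and `b` are hypothetical.

```python
import sqlite3

# Hypothetical tables that both contain the value 2.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (x INTEGER)")
conn.execute("CREATE TABLE b (x INTEGER)")
conn.executemany("INSERT INTO a VALUES (?)", [(1,), (2,)])
conn.executemany("INSERT INTO b VALUES (?)", [(2,), (3,)])

# UNION removes the duplicate 2.
union_rows = conn.execute(
    "SELECT x FROM a UNION SELECT x FROM b ORDER BY x"
).fetchall()

# UNION ALL keeps both copies of 2.
union_all_rows = conn.execute(
    "SELECT x FROM a UNION ALL SELECT x FROM b ORDER BY x"
).fetchall()

print(union_rows)
print(union_all_rows)
```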

Q31. Explain what data cleansing is

Ans.

Data cleansing is the process of identifying and correcting or removing errors, inconsistencies, and inaccuracies in datasets.

  • Data cleansing involves identifying and handling missing values in datasets.

  • It also includes removing duplicate records or entries.

  • Data cleansing may involve correcting spelling mistakes or formatting issues in data.

  • It helps improve data quality and reliability for analysis and decision-making.

  • Example: Removing rows with missing values, standardizing d...read more
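
The cleansing steps listed above (handling missing values, standardizing formatting, removing duplicates) can be sketched in plain Python. The sample rows and the choice to drop incomplete rows and title-case text are assumptions for this illustration; real projects may prefer to impute missing values instead.

```python
# Hypothetical raw data with inconsistent formatting, a duplicate, and a gap.
raw_rows = [
    {"name": " Alice ", "city": "PUNE"},
    {"name": "alice", "city": "Pune"},    # duplicate once standardized
    {"name": "Bob", "city": None},        # missing value
    {"name": "Carol", "city": "delhi"},
]


def clean(rows):
    """Drop incomplete rows, standardize text, and remove duplicates."""
    cleaned = []
    seen = set()
    for row in rows:
        # 1. Missing values: skip rows with an empty or None field.
        if any(v is None or str(v).strip() == "" for v in row.values()):
            continue
        # 2. Formatting: trim whitespace and use consistent capitalization.
        normalized = {k: str(v).strip().title() for k, v in row.items()}
        # 3. Duplicates: keep only the first occurrence of each record.
        key = tuple(sorted(normalized.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(normalized)
    return cleaned


cleaned_rows = clean(raw_rows)
print(cleaned_rows)
```

Standardizing before de-duplicating matters here: the two Alice rows only become identical after trimming and capitalization are made consistent.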

Q32. What are facts and dimensions?

Ans.

Facts are measurable data points, while dimensions provide context to the facts by categorizing and organizing them.

  • Facts are quantitative data that can be measured or counted.

  • Dimensions are descriptive attributes used to filter, group, and label the facts.

  • In a sales database, the fact could be the total revenue generated, while dimensions could include product category, region, and time period.

Q33. What made you choose the data analyst role?

Ans.

Passion for uncovering insights from data and making data-driven decisions.

  • Fascination with numbers and patterns

  • Desire to solve complex problems

  • Interest in using data to drive business decisions

  • Ability to communicate findings effectively

Q34. What are the phases of clinical research?

Ans.

There are four phases of clinical research: Phase 1, Phase 2, Phase 3, and Phase 4.

  • Phase 1: Focuses on safety and dosage in a small group of healthy volunteers.

  • Phase 2: Expands to a larger group to see if the treatment is effective.

  • Phase 3: Compares the new treatment to standard treatments in a larger group.

  • Phase 4: Post-marketing studies to monitor the treatment's long-term effects.

Q35. What are the coding languages you know

Ans.

I know Python, SQL, and R.

  • Proficient in Python for data analysis and visualization

  • Experience with SQL for data querying and manipulation

  • Familiarity with R for statistical analysis and modeling

Q36. What is a pivot table? Describe it.

Ans.

A pivot table is a data summarization tool used to condense and aggregate large datasets.

  • Pivot tables allow users to quickly analyze and manipulate large amounts of data.

  • They can be used to group data by categories and display summarized information.

  • Users can easily change the layout of the table to view data from different perspectives.

  • Pivot tables are commonly used in spreadsheet programs like Microsoft Excel and Google Sheets.

  • For example, a sales team could use a pivot tab...read more

Q37. What is SUSAR and Name of Regulatory Authorities

Ans.

SUSAR stands for Suspected Unexpected Serious Adverse Reaction. Regulatory authorities include FDA, EMA, MHRA, etc.

  • SUSAR refers to adverse reactions that are unexpected, serious, and suspected to be caused by a drug or medical product

  • Regulatory authorities such as FDA (Food and Drug Administration), EMA (European Medicines Agency), MHRA (Medicines and Healthcare products Regulatory Agency) oversee reporting and monitoring of SUSARs

  • Reporting SUSARs is crucial for ensuring the ...read more

Q38. What is Pharmacovigilance and Adverse Event

Ans.

Pharmacovigilance is the science and activities related to the detection, assessment, understanding, and prevention of adverse effects or any other drug-related problems.

  • Pharmacovigilance involves monitoring and evaluating the safety of pharmaceutical products.

  • Adverse events are any undesirable experience associated with the use of a medical product.

  • Examples of adverse events include side effects, allergic reactions, and medication errors.

  • Pharmacovigilance aims to improve pat...read more

Q39. Difference between Append and Merge

Ans.

Append adds rows to a dataset, while Merge combines datasets based on a common key.

  • Append adds rows to the bottom of a dataset, increasing the number of observations.

  • Merge combines datasets based on a common key, such as a unique identifier or variable.

  • Appending is useful for adding new data, while merging is useful for combining related datasets.

  • Example: Appending a new month of sales data to an existing dataset. Merging customer information with sales data based on customer...read more
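
The two operations can be contrasted in a few lines of plain Python. The sales and customer records below are hypothetical, and the merge shown is an inner-style merge that keeps only sales with a matching customer.

```python
# Hypothetical monthly sales and a customer lookup table.
jan_sales = [{"customer_id": 1, "amount": 100}, {"customer_id": 2, "amount": 250}]
feb_sales = [{"customer_id": 1, "amount": 80}]
customers = [{"customer_id": 1, "name": "Ann"}, {"customer_id": 2, "name": "Ben"}]

# Append: stack rows that share the same columns; the row count grows.
all_sales = jan_sales + feb_sales

# Merge: match rows on a common key and bring in columns from the other table.
customers_by_id = {c["customer_id"]: c for c in customers}
sales_with_names = [
    {**sale, "name": customers_by_id[sale["customer_id"]]["name"]}
    for sale in all_sales
    if sale["customer_id"] in customers_by_id
]

print(len(all_sales), "rows after append")
print(sales_with_names)
```

Append changes the number of rows, while merge changes the number of columns available on each row.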

Q40. Difference between Duplicate & Reference

Ans.

Duplicate refers to an exact copy, while reference is a pointer to the original object.

  • Duplicate is a separate copy of the original data, while reference points to the original data.

  • Changing a duplicate does not affect the original, but changing a reference does.

  • Duplicates consume more memory than references.

  • Example: Duplicate - making a photocopy of a document. Reference - sharing a link to a document.

  • Example: Duplicate - cloning a hard drive. Reference - creating a shortcut...read more

Q41. Difference between cross join and cross apply

Ans.

Cross join combines every row from the first table with every row from the second table, while cross apply applies a table-valued function to each row of the first table.

  • Cross join results in a Cartesian product of the two tables.

  • Cross apply is used to invoke a table-valued function for each row of the first table.

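  • Cross join pairs two independent tables, while the right side of cross apply can reference columns from the current row of the left table; left rows for which it returns no result are dropped. Cross apply is available in SQL Server and Oracle and is not part of standard SQL.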

Q42. Difference between a Power BI report and a dashboard

Ans.

A Power BI report is a collection of visualizations spread over one or more pages, while a dashboard is a single-page display of key metrics and KPIs.

  • A report can contain multiple pages of visualizations, all based on a single dataset.

  • Dashboards are a single-page display of key metrics and KPIs for quick insights.

  • Reports are more detailed and allow for in-depth analysis, while Dashboards provide a high-level overview.

  • Reports are typically used for detailed analysis and sharing...read more

Q43. Difference between candidate key and compound key

Ans.

A candidate key is a minimal set of one or more columns that uniquely identifies each record in a table, while a compound key is a key made up of multiple columns that together uniquely identify each record.

  • A candidate key can be a single column or a combination of columns, while a compound key always combines two or more columns.

  • One candidate key is chosen as the primary key. A compound key can also serve as the primary key, even though its individual columns need not be unique on their own.

  • Example: In a table of students, student ID can be a candidate key, while a compound key of stu...read more

Q44. What is ETL

Ans.

ETL stands for Extract, Transform, Load. It is a process used to extract data from various sources, transform it into a consistent format, and load it into a data warehouse for analysis.

  • Extract: Data is extracted from multiple sources such as databases, files, APIs, etc.

  • Transform: Data is cleaned, standardized, and transformed into a consistent format suitable for analysis.

  • Load: The transformed data is loaded into a data warehouse or database for further processing and analys...read more
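
The three stages can be sketched end to end with Python's standard library. Everything specific here is an assumption for illustration: the inline CSV text stands in for a source file, the `payments` table stands in for a warehouse table, and dropping rows with an unparseable amount is just one possible transformation rule.

```python
import csv
import io
import sqlite3

# Hypothetical source data; in practice this would be a file, API, or database.
raw_csv = "name,amount\n alice ,10.5\nBOB,n/a\ncarol,7\n"

# Extract: read the raw rows from the source.
rows = list(csv.DictReader(io.StringIO(raw_csv)))


# Transform: standardize names and keep only rows with a numeric amount.
def transform(row):
    try:
        amount = float(row["amount"])
    except ValueError:
        return None  # unparseable amount, drop the row
    return (row["name"].strip().title(), amount)


clean_rows = [t for t in map(transform, rows) if t is not None]

# Load: write the transformed rows into the target table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (name TEXT, amount REAL)")
conn.executemany("INSERT INTO payments VALUES (?, ?)", clean_rows)

loaded = conn.execute("SELECT name, amount FROM payments ORDER BY name").fetchall()
print(loaded)
```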

Q45. What is join

Ans.

Join is a SQL operation used to combine rows from two or more tables based on a related column between them.

  • Join is used to retrieve data from multiple tables based on a related column.

  • Common types of joins include INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL JOIN.

  • Example: SELECT * FROM table1 INNER JOIN table2 ON table1.column = table2.column;

Q46. What is Power BI?

Ans.

Power BI is a business analytics tool by Microsoft that provides interactive visualizations and business intelligence capabilities.

  • Developed by Microsoft

  • Allows users to create interactive visualizations and reports

  • Integrates with various data sources such as Excel, SQL databases, and cloud services

  • Enables data exploration and sharing insights with stakeholders

  • Offers features like dashboards, data connections, and data preparation

Q47. What is SQL

Ans.

SQL is a programming language used for managing and manipulating relational databases.

  • SQL stands for Structured Query Language

  • It is used to communicate with databases to perform tasks such as querying data, updating data, and creating tables

  • Common SQL commands include SELECT, INSERT, UPDATE, DELETE

  • Example: SELECT * FROM table_name WHERE condition;

Q48. What are the available data types in sql

Ans.

The available data types in SQL include numeric, character, date/time, and boolean types.

  • Numeric data types include integer, decimal, and floating-point types.

  • Character data types include char, varchar, and text types.

  • Date/time data types include date, time, datetime, and timestamp types.

  • Boolean data type represents true or false values.

Q49. What are the joins available in SQL

Ans.

Joins are used to combine rows from two or more tables based on related columns.

  • INNER JOIN: Returns records that have matching values in both tables.

  • LEFT JOIN: Returns all records from the left table and the matched records from the right table.

  • RIGHT JOIN: Returns all records from the right table and the matched records from the left table.

  • FULL JOIN: Returns all records from both tables, with NULLs where there is no match on the other side.

  • CROSS JOIN: Returns the Cartesian product of the two ta...read more

Q50. Convert decimal number to binary representation

Ans.

Convert decimal number to binary representation using division and remainder method.

  • Start by dividing the decimal number by 2 and noting down the remainder.

  • Continue dividing the quotient by 2 until the quotient is 0.

  • The remainders obtained in reverse order will give the binary representation.
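
The division-and-remainder method described above can be written as a short Python function. The name `to_binary` is a choice made for this sketch, and it handles non-negative integers only.

```python
def to_binary(n: int) -> str:
    """Convert a non-negative integer to its binary string by repeated division by 2."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, remainder = divmod(n, 2)   # quotient feeds the next step
        digits.append(str(remainder))
    # The remainders come out least significant first, so reverse them.
    return "".join(reversed(digits))


print(to_binary(10))
```

For 10 the remainders are 0, 1, 0, 1 in that order, which read in reverse give 1010. Python's built-in `bin()` gives the same digits with a `0b` prefix and is a convenient cross-check.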
