Tech Mahindra Azure Data Engineer Interview Questions and Answers

Updated 6 Jun 2025

12 Interview questions

An Azure Data Engineer was asked 5mo ago
Q. What are the optimization techniques used in Spark?
Ans. 

Optimization techniques in Spark improve performance and efficiency of data processing.

  • Partitioning data to distribute workload evenly

  • Caching frequently accessed data in memory

  • Using broadcast variables for small lookup tables

  • Avoiding shuffling operations whenever possible

  • Tuning configuration settings like memory allocation and parallelism
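
The broadcast-variable idea above can be sketched in plain Python (no Spark required): the small lookup table is copied to every worker so each row of the large dataset resolves its join locally, with no shuffle. All names here are illustrative.

```python
# Sketch of a broadcast (map-side) join: the small lookup table is
# replicated to every "worker", so the large side is never shuffled.
orders = [  # large dataset, assumed partitioned across workers
    {"order_id": 1, "country_code": "IN"},
    {"order_id": 2, "country_code": "US"},
    {"order_id": 3, "country_code": "IN"},
]
countries = {"IN": "India", "US": "United States"}  # small broadcast table

def map_side_join(partition, lookup):
    # Each worker joins with an O(1) local dict lookup per row.
    return [{**row, "country": lookup.get(row["country_code"])} for row in partition]

joined = map_side_join(orders, countries)
print([r["country"] for r in joined])  # ['India', 'United States', 'India']
```

In Spark the same effect comes from `broadcast()` hints or automatic broadcast joins for tables under the broadcast threshold.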

An Azure Data Engineer was asked 5mo ago
Q. What methods do you use to transfer data from on-premises storage to Azure Data Lake Storage Gen2?
Ans. 

Methods to transfer data from on-premises storage to Azure Data Lake Storage Gen2

  • Use Azure Data Factory to create pipelines for data transfer

  • Utilize Azure Data Box for offline data transfer

  • Leverage Azure Storage Explorer for manual data transfer

  • Implement Azure Database Migration Service when the on-premises source is a database being migrated at scale

An Azure Data Engineer was asked 10mo ago
Q. Suppose you have table 1 with values 1, 2, 3, 5, null, null, 0 and table 2 has null, 2, 4, 7, 3, 5. What would be the output after an inner join?
Ans. 

The output after inner join of table 1 and table 2 will be 2,3,5.

  • Inner join only includes rows that have matching values in both tables.

  • Values 2, 3, and 5 are present in both tables, so they will be included in the output.

  • Null values are not considered as matching values in inner join.
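
The NULL behaviour described above is easy to verify with SQLite (standard SQL semantics: NULL never compares equal to NULL, so those rows drop out of an inner join):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t1 (val INTEGER)")
cur.execute("CREATE TABLE t2 (val INTEGER)")
cur.executemany("INSERT INTO t1 VALUES (?)",
                [(1,), (2,), (3,), (5,), (None,), (None,), (0,)])
cur.executemany("INSERT INTO t2 VALUES (?)",
                [(None,), (2,), (4,), (7,), (3,), (5,)])
rows = cur.execute(
    "SELECT t1.val FROM t1 INNER JOIN t2 ON t1.val = t2.val ORDER BY t1.val"
).fetchall()
print([r[0] for r in rows])  # [2, 3, 5] -- NULLs never match; 0, 1, 4, 7 have no partner
```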

An Azure Data Engineer was asked 10mo ago
Q. If both ADF and Databricks can achieve similar functionalities like data transformation, fetching data, and loading the dimension layer, why use Databricks? What is the rationale behind having both?
Ans. 

Databricks enhances data processing with advanced analytics, collaboration, and scalability beyond ADF's capabilities.

  • Databricks provides a collaborative environment for data scientists and engineers to work together using notebooks.

  • It supports advanced analytics and machine learning workflows, which ADF lacks natively.

  • Databricks can handle large-scale data processing with Apache Spark, making it more efficient fo...

An Azure Data Engineer was asked 10mo ago
Q. Let's say you have a customers table with customerID and customer name, and an orders table with OrderId and CustomerID. Write a query to find the customer name(s) of the customer(s) who placed the maximum ...
Ans. 

Query to find customer names with the maximum orders from Customers and Orders tables.

  • Use JOIN to combine Customers and Orders tables on CustomerID.

  • Group by CustomerID and count orders to find the maximum.

  • Use a subquery to filter customers with the maximum order count.

  • Example SQL: SELECT c.customerName FROM Customers c JOIN Orders o ON c.customerID = o.CustomerID GROUP BY c.customerID HAVING COUNT(o.OrderId) = (SE...

An Azure Data Engineer was asked 10mo ago
Q. How would you reconstruct a table while preserving historical data, referring to Slowly Changing Dimensions (SCD)?
Ans. 

Use Slowly Changing Dimensions (SCD) to preserve historical data while reconstructing a table.

  • Implement SCD Type 1 for overwriting old data without keeping history.

  • Use SCD Type 2 to create new records for changes, preserving history.

  • Example of SCD Type 2: If a customer's address changes, add a new record with the new address and mark the old record as inactive.

  • SCD Type 3 allows for limited history by adding new co...
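
The Type 2 flow can be sketched in plain Python (field names are illustrative; in practice this is usually a MERGE statement in SQL or Delta Lake):

```python
from datetime import date

# Current dimension rows: one active record per customer.
dim = [
    {"customer_id": 1, "address": "Pune", "start_date": date(2023, 1, 1),
     "end_date": None, "is_active": True},
]

def apply_scd2(dim, incoming, today):
    """Expire changed records and append new versions (SCD Type 2)."""
    for new in incoming:
        current = next((r for r in dim
                        if r["customer_id"] == new["customer_id"] and r["is_active"]),
                       None)
        if current and current["address"] != new["address"]:
            # Close out the old version instead of overwriting it.
            current["end_date"], current["is_active"] = today, False
            dim.append({"customer_id": new["customer_id"], "address": new["address"],
                        "start_date": today, "end_date": None, "is_active": True})
    return dim

apply_scd2(dim, [{"customer_id": 1, "address": "Mumbai"}], date(2024, 6, 1))
print(len(dim), dim[0]["is_active"], dim[1]["address"])  # 2 False Mumbai
```

Both versions of the row survive, so historical queries can still see the old address.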

An Azure Data Engineer was asked
Q. Write an SQL query to find the highest sales from each city.
Ans. 

Use window functions like ROW_NUMBER() to find highest sales from each city in SQL.

  • Use PARTITION BY clause in ROW_NUMBER() to partition data by city

  • Order the data by sales in descending order

  • Filter the results to only include rows with row number 1
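
A runnable version of that approach, using SQLite (3.25+ for window functions); the table and column names are assumed for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE sales (city TEXT, amount INTEGER)")
cur.executemany("INSERT INTO sales VALUES (?, ?)",
                [("Pune", 100), ("Pune", 250), ("Delhi", 300), ("Delhi", 150)])
rows = cur.execute("""
    SELECT city, amount FROM (
        SELECT city, amount,
               ROW_NUMBER() OVER (PARTITION BY city ORDER BY amount DESC) AS rn
        FROM sales
    ) WHERE rn = 1
    ORDER BY city
""").fetchall()
print(rows)  # [('Delhi', 300), ('Pune', 250)]
```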

An Azure Data Engineer was asked
Q. In Databricks, how do you mount a storage location?
Ans. 

Storage is typically mounted from a Databricks notebook using dbutils.fs.mount().

  • Call dbutils.fs.mount() with a source URL (for ADLS Gen2, an abfss:// path), a mount_point such as /mnt/data, and extra_configs carrying the access credentials.

  • Newer workspaces favour Unity Catalog external locations over mounts.

An Azure Data Engineer was asked
Q. Describe a scenario where you optimized Spark performance.
Ans. 

Optimizing Spark performance involves tuning configurations, data partitioning, and efficient resource management.

  • Use DataFrame API instead of RDDs for better optimization and performance.

  • Optimize data partitioning by using 'repartition' or 'coalesce' to balance workloads.

  • Leverage broadcast variables to reduce data shuffling in joins.

  • Cache intermediate results using 'persist()' to avoid recomputation.

  • Adjust Spark ...

An Azure Data Engineer was asked 5mo ago
Q. Types of joins and Spark queries
Ans. 

Types of joins include inner, outer, left, right, and full joins in Spark queries.

  • Inner join: Returns rows that have matching values in both tables

  • Outer join: Returns all rows when there is a match in one of the tables

  • Left join: Returns all rows from the left table and the matched rows from the right table

  • Right join: Returns all rows from the right table and the matched rows from the left table

  • Full join: Returns r...
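
The inner vs. left distinction above can be shown with a small SQLite session (FULL OUTER JOIN needs SQLite 3.39+, so only inner and left are demonstrated; the tables are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE emp (id INTEGER, dept_id INTEGER)")
cur.execute("CREATE TABLE dept (id INTEGER, name TEXT)")
cur.executemany("INSERT INTO emp VALUES (?, ?)", [(1, 10), (2, 20), (3, None)])
cur.executemany("INSERT INTO dept VALUES (?, ?)", [(10, "Eng"), (30, "HR")])
inner = cur.execute(
    "SELECT e.id, d.name FROM emp e JOIN dept d ON e.dept_id = d.id").fetchall()
left = cur.execute(
    "SELECT e.id, d.name FROM emp e LEFT JOIN dept d ON e.dept_id = d.id "
    "ORDER BY e.id").fetchall()
print(inner)  # [(1, 'Eng')] -- only the matching row
print(left)   # [(1, 'Eng'), (2, None), (3, None)] -- all left rows kept
```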

Tech Mahindra Azure Data Engineer Interview Experiences

7 interviews found

Interview experience
5
Excellent
Difficulty level
Easy
Process Duration
2-4 weeks
Result
Selected

I appeared for an interview in Jan 2025.

Round 1 - Technical 

(2 Questions)

  • Q1. Medium-difficulty SQL questions, to be solved in PySpark in parallel.
  • Q2. Scenario-based questions on Data Factory and other Azure resources
Round 2 - Technical 

(1 Question)

  • Q1. Project discussion with the team lead and manager
Round 3 - Client Interview 

(1 Question)

  • Q1. Technical and general discussion
Interview experience
5
Excellent
Difficulty level
Moderate
Process Duration
Less than 2 weeks
Result
Selected

I applied via a recruitment consultant and was interviewed in Aug 2024. There were 3 interview rounds.

Round 1 - Technical 

(4 Questions)

  • Q1. Let's say table 1 has values 1, 2, 3, 5, NULL, NULL, 0 and table 2 has NULL, 2, 4, 7, 3, 5. What would be the output after an inner join?
  • Ans. 

    The output after inner join of table 1 and table 2 will be 2,3,5.

    • Inner join only includes rows that have matching values in both tables.

    • Values 2, 3, and 5 are present in both tables, so they will be included in the output.

    • Null values are not considered as matching values in inner join.

  • Answered by AI
  • Q2. Let's say you have a Customers table with customerID and customer name, and an Orders table with OrderId and CustomerID. Write a query to find the customer name who placed the maximum orders. If more than one person...
  • Ans. 

    Query to find customer names with the maximum orders from Customers and Orders tables.

    • Use JOIN to combine Customers and Orders tables on CustomerID.

    • Group by CustomerID and count orders to find the maximum.

    • Use a subquery to filter customers with the maximum order count.

    • Example SQL: SELECT c.customerName FROM Customers c JOIN Orders o ON c.customerID = o.CustomerID GROUP BY c.customerID HAVING COUNT(o.OrderId) = (SELECT ...

  • Answered by AI
  • Q3. Spark Architecture, Optimisation techniques
  • Q4. Some personal questions.
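
The Q2 max-orders query, completed and run with SQLite: the subquery computes the maximum per-customer order count, and HAVING keeps every customer who reaches it (so ties are handled). The schema follows the question; sample data is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE Customers (customerID INTEGER, customerName TEXT)")
cur.execute("CREATE TABLE Orders (OrderId INTEGER, CustomerID INTEGER)")
cur.executemany("INSERT INTO Customers VALUES (?, ?)",
                [(1, "Asha"), (2, "Ravi"), (3, "Meena")])
cur.executemany("INSERT INTO Orders VALUES (?, ?)",
                [(101, 1), (102, 1), (103, 2), (104, 2), (105, 3)])
rows = cur.execute("""
    SELECT c.customerName
    FROM Customers c JOIN Orders o ON c.customerID = o.CustomerID
    GROUP BY c.customerID
    HAVING COUNT(o.OrderId) = (
        SELECT MAX(cnt) FROM (
            SELECT COUNT(*) AS cnt FROM Orders GROUP BY CustomerID
        )
    )
    ORDER BY c.customerName
""").fetchall()
print([r[0] for r in rows])  # ['Asha', 'Ravi'] -- both tied at 2 orders
```
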
Round 2 - Technical 

(5 Questions)

  • Q1. Explain the entire architecture of a recent project you are working on in your organisation.
  • Ans. 

    The project involves building a data pipeline to ingest, process, and analyze large volumes of data from various sources in Azure.

    • Utilizing Azure Data Factory for data ingestion and orchestration

    • Implementing Azure Databricks for data processing and transformation

    • Storing processed data in Azure Data Lake Storage

    • Using Azure Synapse Analytics for data warehousing and analytics

    • Leveraging Azure DevOps for CI/CD pipeline aut...

  • Answered by AI
  • Q2. How do you design an effective ADF pipeline, and what metrics and considerations should you keep in mind while designing?
  • Ans. 

    Designing an effective ADF pipeline involves considering various metrics and factors.

    • Understand the data sources and destinations

    • Identify the dependencies between activities

    • Optimize data movement and processing for performance

    • Monitor and track pipeline execution for troubleshooting

    • Consider security and compliance requirements

    • Use parameterization and dynamic content for flexibility

    • Implement error handling and retries fo...

  • Answered by AI
  • Q3. Let's say you have a very large data volume. In terms of performance, how would you slice and dice the data to boost performance?
  • Ans. 

    Optimize data processing by partitioning, indexing, and using efficient storage formats.

    • Partitioning: Divide large datasets into smaller, manageable chunks. For example, partitioning a sales dataset by year.

    • Indexing: Create indexes on frequently queried columns to speed up data retrieval. For instance, indexing customer IDs in a transaction table.

    • Data Compression: Use compressed formats like Parquet or ORC to reduce st...

  • Answered by AI
  • Q4. Let's say you have to reconstruct a table while preserving the historical data. (I couldn't answer this; please refer to SCD.)
  • Ans. 

    Use Slowly Changing Dimensions (SCD) to preserve historical data while reconstructing a table.

    • Implement SCD Type 1 for overwriting old data without keeping history.

    • Use SCD Type 2 to create new records for changes, preserving history.

    • Example of SCD Type 2: If a customer's address changes, add a new record with the new address and mark the old record as inactive.

    • SCD Type 3 allows for limited history by adding new columns...

  • Answered by AI
  • Q5. We have both ADF and Databricks. I can achieve transformation, fetching the data, and loading the dimension layer using ADF alone, so why do we use Databricks if both have similar functionality for a few ...
  • Ans. 

    Databricks enhances data processing with advanced analytics, collaboration, and scalability beyond ADF's capabilities.

    • Databricks provides a collaborative environment for data scientists and engineers to work together using notebooks.

    • It supports advanced analytics and machine learning workflows, which ADF lacks natively.

    • Databricks can handle large-scale data processing with Apache Spark, making it more efficient for big...

  • Answered by AI
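
For the Q2 discussion on ADF pipeline design, a concrete skeleton helps. A minimal Copy-activity pipeline definition looks roughly like this; all names (pl_copy_sales, the dataset references) are illustrative, and real definitions also carry linked services, triggers, and monitoring settings:

```json
{
  "name": "pl_copy_sales",
  "properties": {
    "parameters": { "runDate": { "type": "String" } },
    "activities": [
      {
        "name": "CopySalesToLake",
        "type": "Copy",
        "policy": { "retry": 2, "timeout": "0.01:00:00" },
        "inputs":  [ { "referenceName": "ds_onprem_sales", "type": "DatasetReference" } ],
        "outputs": [ { "referenceName": "ds_adls_sales",   "type": "DatasetReference" } ]
      }
    ]
  }
}
```

The parameterization and retry policy shown here correspond directly to the flexibility and error-handling bullets in the answer above.
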
Round 3 - HR 

(1 Question)

  • Q1. Basic HR questions

Interview Preparation Tips

Topics to prepare for Tech Mahindra Azure Data Engineer interview:
  • SQL
  • Databricks
  • Azure Data Factory
  • Pyspark
  • Spark
Interview preparation tips for other job seekers - The interviewers were really nice.

Skills evaluated in this interview

Azure Data Engineer Interview Questions & Answers

user image Niranjan Reddy

posted on 17 Jan 2025

Interview experience
4
Good
Difficulty level
Moderate
Process Duration
2-4 weeks
Result
Selected

I appeared for an interview in Dec 2024.

Round 1 - Technical 

(2 Questions)

  • Q1. What are the optimization techniques used in Spark?
  • Ans. 

    Optimization techniques in Spark improve performance and efficiency of data processing.

    • Partitioning data to distribute workload evenly

    • Caching frequently accessed data in memory

    • Using broadcast variables for small lookup tables

    • Avoiding shuffling operations whenever possible

    • Tuning configuration settings like memory allocation and parallelism

  • Answered by AI
  • Q2. What methods do you use to transfer data from on-premises storage to Azure Data Lake Storage Gen2?
  • Ans. 

    Methods to transfer data from on-premises storage to Azure Data Lake Storage Gen2

    • Use Azure Data Factory to create pipelines for data transfer

    • Utilize Azure Data Box for offline data transfer

    • Leverage Azure Storage Explorer for manual data transfer

    • Implement Azure Database Migration Service when the on-premises source is a database being migrated at scale

  • Answered by AI
Round 2 - Technical 

(1 Question)

  • Q1. Types of joins and Spark queries
  • Ans. 

    Types of joins include inner, outer, left, right, and full joins in Spark queries.

    • Inner join: Returns rows that have matching values in both tables

    • Outer join: Returns all rows when there is a match in one of the tables

    • Left join: Returns all rows from the left table and the matched rows from the right table

    • Right join: Returns all rows from the right table and the matched rows from the left table

    • Full join: Returns rows w...

  • Answered by AI
Round 3 - HR 

(1 Question)

  • Q1. What is your willingness to work in a firm office environment?
  • Ans. 

    I am willing to work in a firm office environment.

    • I am comfortable working in a structured office setting

    • I value collaboration and communication with colleagues

    • I am adaptable to different office environments and cultures

  • Answered by AI
Interview experience
4
Good
Difficulty level
Moderate
Process Duration
4-6 weeks
Result
Selected

I appeared for an interview in May 2025, where I was asked the following questions.

  • Q1. Find the 2nd highest salary, department-wise, from a salary table.
  • Ans. 

    To find the 2nd highest salary by department, use SQL queries with ranking functions or subqueries.

    • Use the SQL 'ROW_NUMBER()' or 'RANK()' function to assign ranks to salaries within each department.

    • Example SQL query: SELECT dept, salary FROM (SELECT dept, salary, RANK() OVER (PARTITION BY dept ORDER BY salary DESC) as rank FROM salary_table) as ranked WHERE rank = 2;

    • Alternatively, use a subquery to first find the highe...

  • Answered by AI
  • Q2. Write code for SCD Type 2 in PySpark.
  • Ans. 

    Implementing Slowly Changing Dimension (SCD) Type 2 in PySpark for data versioning.

    • SCD Type 2 tracks historical data by creating new records for changes.

    • Use a DataFrame to represent the current state of the dimension.

    • Identify changes by comparing the incoming data with existing records.

    • Set the 'end_date' of the old record and mark it as inactive.

    • Insert the new record with the updated values and an active status.

  • Answered by AI
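
The Q1 ranking query can be run as-is against SQLite, shown here with DENSE_RANK() rather than RANK() so tied top salaries still yield a rank-2 row; the salary_table name follows the answer, the data is illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE salary_table (dept TEXT, salary INTEGER)")
cur.executemany("INSERT INTO salary_table VALUES (?, ?)",
                [("Eng", 90), ("Eng", 70), ("Eng", 50), ("HR", 60), ("HR", 40)])
rows = cur.execute("""
    SELECT dept, salary FROM (
        SELECT dept, salary,
               DENSE_RANK() OVER (PARTITION BY dept ORDER BY salary DESC) AS rnk
        FROM salary_table
    ) WHERE rnk = 2
    ORDER BY dept
""").fetchall()
print(rows)  # [('Eng', 70), ('HR', 40)]
```
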
Interview experience
4
Good
Difficulty level
Easy
Process Duration
Less than 2 weeks
Result
-

I applied via Naukri.com and was interviewed in May 2024. There were 2 interview rounds.

Round 1 - One-on-one 

(5 Questions)

  • Q1. Project Architecture, spark transformations used?
  • Ans. 

    The project architecture includes Spark transformations for processing large volumes of data.

    • Spark transformations are used to manipulate data in distributed computing environments.

    • Examples of Spark transformations include map, filter, reduceByKey, join, etc.

  • Answered by AI
  • Q2. Advanced SQL questions - highest sales from each city
  • Ans. 

    Use window functions like ROW_NUMBER() to find highest sales from each city in SQL.

    • Use PARTITION BY clause in ROW_NUMBER() to partition data by city

    • Order the data by sales in descending order

    • Filter the results to only include rows with row number 1

  • Answered by AI
  • Q3. Data modelling - Star schema, Snowflake schema, Dimension and Fact tables
  • Q4. Databricks - how to mount?
  • Ans. 

    Storage is typically mounted from a Databricks notebook using dbutils.fs.mount().

    • Call dbutils.fs.mount() with a source URL (for ADLS Gen2, an abfss:// path), a mount_point such as /mnt/data, and extra_configs carrying the access credentials.

    • Newer workspaces favour Unity Catalog external locations over mounts.

  • Answered by AI
  • Q5. Questions on ADF - pipeline used in the project
Round 2 - One-on-one 

(1 Question)

  • Q1. Questions on Databricks - optimizations, history, autoloader, liquid clustering, autoscaling

Skills evaluated in this interview

Interview experience
2
Poor
Difficulty level
-
Process Duration
-
Result
-
Round 1 - Coding Test 

It was a SQL coding problem to solve.

Interview Preparation Tips

Interview preparation tips for other job seekers - Prepare thoroughly, so the interviewers cannot spot gaps that could lead to rejection.
Interview experience
5
Excellent
Difficulty level
Moderate
Process Duration
-
Result
Not Selected
Round 1 - Technical 

(1 Question)

  • Q1. What is incremental load? What are partitioning and bucketing? Explain Spark architecture.
  • Ans. 

    Incremental load is the process of loading only new or updated data into a data warehouse, rather than reloading all data each time.

    • Incremental load helps in reducing the time and resources required for data processing.

    • It involves identifying new or updated data since the last load and merging it with the existing data.

    • Common techniques for incremental load include using timestamps or change data capture (CDC) mechanis...

  • Answered by AI
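
The watermark technique behind incremental load can be sketched in plain Python (a stand-in for an ADF watermark table or a CDC feed; all names are illustrative):

```python
# Incremental load: only rows newer than the stored high-water mark move.
source = [
    {"id": 1, "updated_at": "2024-01-01"},
    {"id": 2, "updated_at": "2024-02-10"},
    {"id": 3, "updated_at": "2024-03-05"},
]
watermark = "2024-01-15"  # persisted after the previous run

def incremental_load(rows, watermark):
    # ISO-8601 dates compare correctly as strings.
    new_rows = [r for r in rows if r["updated_at"] > watermark]
    new_watermark = max((r["updated_at"] for r in new_rows), default=watermark)
    return new_rows, new_watermark

loaded, watermark = incremental_load(source, watermark)
print([r["id"] for r in loaded], watermark)  # [2, 3] 2024-03-05
```

Row 1 is skipped because it predates the watermark; the watermark then advances so the next run skips rows 2 and 3 as well.
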
Round 2 - Technical 

(1 Question)

  • Q1. Scenario based question on spark performance optimization.
  • Ans. 

    Optimizing Spark performance involves tuning configurations, data partitioning, and efficient resource management.

    • Use DataFrame API instead of RDDs for better optimization and performance.

    • Optimize data partitioning by using 'repartition' or 'coalesce' to balance workloads.

    • Leverage broadcast variables to reduce data shuffling in joins.

    • Cache intermediate results using 'persist()' to avoid recomputation.

    • Adjust Spark confi...

  • Answered by AI

Skills evaluated in this interview

Interview questions from similar companies

I applied via Naukri.com and was interviewed before Aug 2020. There were 4 interview rounds.

Interview Questionnaire 

1 Question

  • Q1. Technical questions: (1) OOP concepts, (2) PL/SQL cursors, triggers, procedures, (3) quicksort algorithm

Interview Preparation Tips

Interview preparation tips for other job seekers - Be prepared with your resume. All of the questions were asked from the resume.

Interview Questionnaire 

2 Questions

  • Q1. Apigee
  • Q2. Internal architecture

Interview Questionnaire 

1 Question

  • Q1. Where do you see yourself in 5 years?

Tech Mahindra Interview FAQs

How many rounds are there in Tech Mahindra Azure Data Engineer interview?
Tech Mahindra interview process usually has 2-3 rounds. The most common rounds in the Tech Mahindra interview process are Technical, One-on-one Round and HR.
How to prepare for Tech Mahindra Azure Data Engineer interview?
Go through your CV in detail and study all the technologies mentioned in your CV. Prepare at least two technologies or languages in depth if you are appearing for a technical interview at Tech Mahindra. The most common topics and skills that interviewers at Tech Mahindra expect are SQL, Python, Azure, Pyspark and Spark.
What are the top questions asked in Tech Mahindra Azure Data Engineer interview?

Some of the top questions asked at the Tech Mahindra Azure Data Engineer interview -

  1. We have ADF and Databricks both; I can achieve transformation, fetching the da...
  2. How do you design an effective ADF pipeline and what metrics and considerat...
  3. Let's say you have a customers table with customerID and customer name, an Orders tab...


Overall Interview Experience Rating

4.1/5, based on 8 interview experiences

Difficulty level: Easy 33%, Moderate 67%

Duration: Less than 2 weeks 40%, 2-4 weeks 40%, 4-6 weeks 20%
Tech Mahindra Azure Data Engineer Salary

₹4.8 L/yr - ₹10.5 L/yr, based on 288 salaries (17% less than the average Azure Data Engineer salary in India)

Tech Mahindra Azure Data Engineer Reviews and Ratings

3.5/5 overall, based on 14 reviews

Rating in categories:
  • Skill development: 3.5
  • Work-life balance: 3.6
  • Salary: 3.0
  • Job security: 3.3
  • Company culture: 3.8
  • Promotions: 2.5
  • Work satisfaction: 3.3
Salaries for other roles at Tech Mahindra:
  • Software Engineer (26.6k salaries): ₹3.7 L/yr - ₹9.2 L/yr
  • Senior Software Engineer (22.2k salaries): ₹9.1 L/yr - ₹18.5 L/yr
  • Technical Lead (12.5k salaries): ₹16.9 L/yr - ₹30 L/yr
  • Associate Software Engineer (6.1k salaries): ₹1.9 L/yr - ₹5.7 L/yr
  • Team Lead (5.4k salaries): ₹5.7 L/yr - ₹17.7 L/yr
Compare Tech Mahindra with: Infosys (3.6), Cognizant (3.7), Accenture (3.7), Wipro (3.7)