Accenture

3.7

based on 64.5k Reviews

Accenture Data Engineer Interview Questions and Answers

Updated 11 Jul 2025

58 Interview questions

A Data Engineer was asked 2mo ago

Q. What are materialized views?

Ans.

Materialized views are database objects that store the result of a query for faster access and improved performance.

Materialized views store data physically, unlike regular views which are virtual.
They can improve query performance by pre-computing expensive joins and aggregations.
Materialized views can be refreshed automatically or manually to keep data up-to-date.
Example: A materialized view can aggregate sales ...
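The stored-result and refresh behaviour described above can be sketched with Python's built-in sqlite3 module. SQLite has no native materialized views, so this is only an emulation: a summary table is rebuilt on demand. The `sales` and `sales_summary` tables and the `refresh_sales_summary` helper are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 100.0), ("north", 50.0), ("south", 70.0)],
)

def refresh_sales_summary(conn):
    """Recompute the stored aggregate (a manual 'refresh')."""
    conn.execute("DROP TABLE IF EXISTS sales_summary")
    conn.execute(
        "CREATE TABLE sales_summary AS "
        "SELECT region, SUM(amount) AS total FROM sales GROUP BY region"
    )

def read_summary(conn):
    """Read the precomputed totals without touching the base table."""
    return conn.execute(
        "SELECT region, total FROM sales_summary ORDER BY region"
    ).fetchall()

refresh_sales_summary(conn)
before = read_summary(conn)

conn.execute("INSERT INTO sales VALUES ('south', 30.0)")
stale = read_summary(conn)   # still the old totals: results are stored, not recomputed
refresh_sales_summary(conn)
fresh = read_summary(conn)   # reflects the new row only after the refresh
```

Databases with real materialized views replace the drop-and-recreate step with a built-in refresh, but the trade-off is the same: fast reads in exchange for possibly stale data between refreshes.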

A Data Engineer was asked 6mo ago

Q. What optimization techniques can be used to improve the performance of Databricks?

Ans.

Optimisation techniques for improving Databricks performance

Utilize cluster sizing and autoscaling to match workload demands
Optimize data storage formats like Parquet for efficient querying
Use partitioning and data skipping (for example, Z-ordering on Delta tables) to speed up data retrieval
Leverage caching for frequently accessed data
Monitor and tune query performance using Databricks SQL Analytics
Consider using Delta Lake for ACID transactions and improved p...

A Data Engineer was asked 6mo ago

Q. How do you implement an alerting mechanism in ADF for failed pipelines?

Ans.

Alerting mechanism in ADF for failed pipelines involves setting up alerts in Azure Monitor and configuring email notifications.

Set up alerts in Azure Monitor for monitoring pipeline runs
Configure alert rules to trigger notifications when a pipeline run fails
Use Azure Logic Apps to send email notifications for failed pipeline runs

A Data Engineer was asked 6mo ago

Q. What is Unity Catalog?

Ans.

Unity Catalog is Databricks' unified governance layer for data and AI assets across workspaces.

It provides centralized access control, auditing, lineage, and data discovery for tables, views, volumes, models, and functions.
Objects are addressed through a three-level namespace: catalog.schema.table.
Permissions are managed with standard SQL GRANT and REVOKE statements.
A single metastore can be attached to multiple workspaces, so the same policies apply everywhere.

A Data Engineer was asked 6mo ago

Q. What is Autoloader?

Ans.

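Auto Loader is a Databricks feature that incrementally ingests new files as they arrive in cloud storage, without manual intervention.

It is exposed as the cloudFiles Structured Streaming source and tracks which files have already been processed, so each file is ingested once.
It works with cloud object stores such as ADLS, S3, and GCS, and with formats such as JSON, CSV, and Parquet.
It can infer the schema and evolve it as new columns appear.
New files are discovered either by directory listing or through file notifications.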

A Data Engineer was asked 7mo ago

Q. What is the difference between persist and cache?

Ans.

In Spark, cache() is shorthand for persist() with the default storage level, while persist() lets you choose the storage level explicitly.

For RDDs, cache() uses MEMORY_ONLY; for DataFrames and Datasets the default is MEMORY_AND_DISK.
persist() accepts a storage level such as MEMORY_ONLY, MEMORY_AND_DISK, or DISK_ONLY.
Both are lazy: the data is materialized only when an action runs, and unpersist() releases it.
Example: df.persist(StorageLevel.DISK_ONLY) keeps the data on disk only, while df.cache() uses the default level.

A Data Engineer was asked 7mo ago

Q. What is an Accumulator?

Ans.

An accumulator is a variable used in distributed computing to aggregate values across multiple tasks or nodes.

Accumulators are used in Spark to perform calculations in a distributed manner.
Tasks running on executors can only add to them, using an associative and commutative operation; only the driver can read the accumulated value.
Accumulators are used for tasks like counting elements or summing values in parallel processing.
Example: counting the number of er...

🔥 Asked by recruiter 3 times

A Data Engineer was asked 7mo ago

Q. Explain your project.

Ans.

Developed a data pipeline to process and analyze large datasets for real-time insights in a retail environment.

Designed ETL processes using Apache Airflow to automate data extraction from various sources.
Utilized AWS services like S3 for storage and Redshift for data warehousing.
Implemented data quality checks to ensure accuracy and reliability of the data.
Created dashboards using Tableau for visualizing sales tre...

A Data Engineer was asked 8mo ago

Q. What are triggers and their types in ADF?

Ans.

Triggers in Azure Data Factory (ADF) are events that cause a pipeline to execute.

Trigger types in ADF are schedule, tumbling window, and event-based (storage event and custom event); a pipeline can also be run manually on demand.
Schedule triggers run pipelines on a wall-clock schedule, such as daily or hourly.
Tumbling window triggers fire over fixed-size, non-overlapping, contiguous time intervals.
Event-based triggers execute pipelines in response to events such as a blob being created or deleted in storage, or a custom event published to Event Grid.
Manual triggers re...

A Data Engineer was asked 8mo ago

Q. How does integration runtime work in ADF?

Ans.

The integration runtime (IR) is the compute infrastructure that Azure Data Factory (ADF) uses to run data movement, data flows, activity dispatch, and SSIS package execution.

The Azure IR is fully managed and handles copy activities and data flows between cloud data stores.
The self-hosted IR is installed on a machine you manage and reaches on-premises or private-network data sources.
The Azure-SSIS IR runs existing SSIS packages in a managed environment.
A linked service references an IR, which determines where the activity actually executes.

Accenture Data Engineer Interview Experiences

81 interviews found

Data Engineer Interview Questions & Answers

Anonymous

posted on 3 Jan 2025

Interview experience

Excellent

Difficulty level

Moderate

Process Duration

Less than 2 weeks

Result

Not Selected

I applied via Naukri.com and was interviewed in Dec 2024. There was 1 interview round.

Round 1 - Technical

(5 Questions)

Q1. Scenario based questions on Azure data factory and pipelines

Q2. Optimisation technic to improve the performance of databricks

Q3. What is Autoloader

Q4. What is unity catalog

Q5. How you do the alerting mechanism in adf for failed pipelines

Data Engineer Interview Questions & Answers

Anonymous

posted on 17 Jul 2024

Interview experience

Excellent

Difficulty level

Moderate

Process Duration

Less than 2 weeks

Result

Selected

I applied via Recruitment Consultant and was interviewed in Jun 2024. There was 1 interview round.

Round 1 - One-on-one

(20 Questions)

Q1. Tell me about yourself

Q2. Project Architecture

Q3. Rate yourself out of 5 in Pyspark , Python and SQL

Q4. How to handle duplicates in python ?

Ans.

Use Python's built-in data structures like sets or dictionaries to handle duplicates.

Use a set to remove duplicates from a list: unique_list = list(set(original_list))
Use a dictionary to remove duplicates from a list while preserving order: unique_list = list(dict.fromkeys(original_list))
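The two approaches above can be compared side by side. This is a minimal sketch; the `dedupe_by_key` helper and the sample data are hypothetical additions, showing one way to handle records that are duplicates on a single field only.

```python
original_list = [3, 1, 3, 2, 1]

# set() removes duplicates but does not guarantee the original order
unique_unordered = list(set(original_list))

# dict.fromkeys() keeps the first occurrence of each item, in order
unique_ordered = list(dict.fromkeys(original_list))

def dedupe_by_key(records, key):
    """Keep the first record seen for each value of `key` (hypothetical helper)."""
    seen = set()
    result = []
    for record in records:
        if record[key] not in seen:
            seen.add(record[key])
            result.append(record)
    return result

rows = [{"id": 1, "name": "a"}, {"id": 1, "name": "b"}, {"id": 2, "name": "c"}]
unique_rows = dedupe_by_key(rows, "id")
```

The set-based version is the shortest, but the dictionary version is usually the safer default when the order of the input matters.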

Answered by AI

Q5. Methods of migrating the Hive metastore to Unity Catalog in Databricks?

Ans.

Hive metastore tables can be upgraded to Unity Catalog with Databricks-provided tooling; the right method depends on whether the tables are external or managed.

External tables can be upgraded in place with the SYNC command or the upgrade wizard in Catalog Explorer, which register the existing data in Unity Catalog without copying it.
Managed tables are typically copied into Unity Catalog with CREATE TABLE AS SELECT or DEEP CLONE.
For large estates, the Databricks Labs UCX tool helps assess and automate the migration.
Validate the migration afterwards, for example by comparing row counts and checking permissions on the new tables.

Answered by AI

Q6. Read a CSV file from ADLS path ?

Ans.

To read a CSV file from an ADLS path, you can use libraries like pandas or pyspark.

Use pandas library in Python to read a CSV file from ADLS path
Use pyspark library in Python to read a CSV file from ADLS path
Ensure you have the necessary permissions to access the ADLS path

Answered by AI

Q7. There was a table provided on coding screen and asked to write different programs and SQL queries from the table and tell the approach you are taking ? Like age greater than 30 then sum the age how would y...

Q8. How many stages will be created from the above code that I have written

Q9. Narrow vs Wide Transformation ?

Ans.

A narrow transformation computes each output partition from a single input partition, while a wide transformation needs data from multiple input partitions and therefore shuffles.

Narrow transformations need no data movement between partitions, so Spark can pipeline them within one stage.
Wide transformations redistribute data across partitions (a shuffle), which is expensive and marks a stage boundary.
Examples of narrow transformations include map and filter, while examples of wide transformations include groupByKey, reduceByKey, and join.
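The distinction is about partition dependencies: a narrow transformation builds each output partition from one input partition, while a wide transformation has to move records between partitions. The sketch below simulates this in plain Python with lists standing in for partitions; it does not use Spark, and `narrow_map` and `wide_sum_by_key` are hypothetical helpers.

```python
from collections import defaultdict

# Two "partitions" of (key, value) records, standing in for a distributed dataset
partitions = [[("a", 1), ("b", 2)], [("a", 3), ("c", 4)]]

def narrow_map(parts, fn):
    """Narrow: each output partition is built from exactly one input partition."""
    return [[fn(rec) for rec in part] for part in parts]

def wide_sum_by_key(parts, num_out):
    """Wide: records are redistributed by key (a shuffle) before aggregating."""
    shuffled = [defaultdict(int) for _ in range(num_out)]
    for part in parts:
        for key, value in part:
            # ord() keeps the routing deterministic for single-character keys
            shuffled[ord(key) % num_out][key] += value
    return [dict(bucket) for bucket in shuffled]

doubled = narrow_map(partitions, lambda kv: (kv[0], kv[1] * 2))
totals = wide_sum_by_key(partitions, num_out=2)
```

In the wide case the two records for key "a" start in different partitions and must end up in the same one before they can be summed, which is the data movement Spark performs during a shuffle.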

Answered by AI

Q10. What are action and transformation ?

Ans.

In Spark, transformations lazily define a new dataset from an existing one, while actions trigger execution and return a result or write output.

Actions are operations that trigger execution of the accumulated transformations as a job on the cluster.
Transformations are functions that take an input dataset and produce an output dataset, often by filtering, aggregating, or joining data; they are evaluated lazily.
Examples of actions include 'saveAsTextFile'...

Answered by AI

Q11. What happens when we enforce the schema and when we manually define the schema in the code ?

Ans.

Enforcing the schema ensures data consistency and validation, while manually defining the schema in code allows for more flexibility and customization.

Enforcing the schema ensures that all data conforms to a predefined structure and format, preventing errors and inconsistencies.
Manually defining the schema in code allows for more flexibility in handling different data types and structures.
Enforcing the schema can be do...

Answered by AI

Q12. What optimisations are possible to reduce the overhead of reading large datasets in Spark?

Ans.

Optimizations like partitioning, caching, and using efficient file formats can reduce overhead in reading large datasets in Spark.

Partitioning data based on key can reduce the amount of data shuffled during joins and aggregations
Caching frequently accessed datasets in memory can avoid recomputation
Using efficient file formats like Parquet or ORC can reduce disk I/O and improve read performance

Answered by AI

Q13. Write a sql query to find the name of person who logged in last within each country from Person Table ?

Ans.

SQL query to find the name of person who logged in last within each country from Person Table

Use a subquery to find the max login time for each country
Join the Person table with the subquery on country and login time to get the name of the person
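The subquery-and-join approach above can be sketched with Python's built-in sqlite3 module. The column names (`name`, `country`, `login_time`) and the sample rows are assumptions, since the original question does not give the schema of the Person table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (name TEXT, country TEXT, login_time TEXT)")
conn.executemany(
    "INSERT INTO Person VALUES (?, ?, ?)",
    [
        ("Asha", "India", "2024-06-01 09:00:00"),
        ("Ravi", "India", "2024-06-02 08:30:00"),
        ("Emma", "UK", "2024-06-01 22:15:00"),
        ("Liam", "UK", "2024-05-30 10:00:00"),
    ],
)

# The subquery finds the latest login per country; the join recovers the name
query = """
SELECT p.country, p.name
FROM Person AS p
JOIN (
    SELECT country, MAX(login_time) AS last_login
    FROM Person
    GROUP BY country
) AS latest
  ON p.country = latest.country
 AND p.login_time = latest.last_login
ORDER BY p.country
"""
last_logins = conn.execute(query).fetchall()
```

If two people in the same country share the latest login time, this query returns both. A window function such as ROW_NUMBER() partitioned by country is a common alternative when exactly one row per country is required.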

Answered by AI

Q14. Difference between List and Tuple ?

Q15. Difference between Rank , Dense Rank and Row Number and when we are using each of them ?

Ans.

Rank and Dense Rank give tied rows the same rank, with and without gaps respectively, while Row Number gives every row a distinct sequential number.

Rank assigns the same rank to rows with the same value, leaving gaps in the ranking if there are ties.
Dense Rank assigns the same rank to tied rows but leaves no gaps in the ranking.
Row Number assigns a unique number to each row, without any regard for the values in the...
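The three functions are easiest to tell apart on data that contains a tie. This sketch uses Python's built-in sqlite3 module and assumes the bundled SQLite is version 3.25 or later, which is when window functions were added; the `scores` table is a hypothetical example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("A", 90), ("B", 90), ("C", 80), ("D", 70)],
)

# A and B tie on score, so RANK and DENSE_RANK diverge from the third row on
query = """
SELECT name,
       RANK()       OVER (ORDER BY score DESC)       AS rnk,
       DENSE_RANK() OVER (ORDER BY score DESC)       AS dense_rnk,
       ROW_NUMBER() OVER (ORDER BY score DESC, name) AS row_num
FROM scores
ORDER BY score DESC, name
"""
rows = conn.execute(query).fetchall()
```

Here C gets rank 3 (a gap after the tie), dense rank 2 (no gap), and row number 3. Row Number is the usual choice for deduplication or picking one row per group, while Rank and Dense Rank suit leaderboards and "top N values" questions.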

Answered by AI

Q16. What is List Comprehension ?

Q17. Tell me about the performance optimization done in your project ?

Ans.

Optimized data processing and storage to enhance performance and reduce latency in ETL workflows.

Implemented partitioning in our data warehouse to improve query performance, reducing data scan times by 40%.
Utilized indexing on frequently queried columns, leading to a 30% decrease in query execution time.
Migrated from a traditional RDBMS to a distributed NoSQL database, which improved scalability and read/write speeds.
O...

Answered by AI

Q18. Difference between the interactive cluster and job cluster ?

Ans.

Interactive (all-purpose) clusters are for exploration and development in notebooks, while job clusters are created to run an automated job and terminate when it finishes.

Interactive clusters are used for ad hoc data exploration and analysis, and can be shared by several users.
Job clusters are used for running scheduled or automated batch jobs.
Interactive clusters stay up until they are terminated manually or by auto-termination, so they tend to be longer-lived.
Job clusters exist only for the duration of a job run and are generally billed at a lower rate than all-purpose compute.

Answered by AI

Q19. How to add a column in dataframe ? How to rename the column in dataframe ?

Ans.

To add a column in a dataframe, use the 'withColumn' method. To rename a column, use the 'withColumnRenamed' method.

To add a column, use the 'withColumn' method with the new column name and the expression to compute the values for that column.
Example: df.withColumn('new_column', df['existing_column'] * 2)
To rename a column, use the 'withColumnRenamed' method with the current column name and the new column name.
Example:...

Answered by AI

Q20. Difference between Coalesce and Repartition and In which case we are using it ?

Ans.

Coalesce is used to combine multiple small partitions into a larger one, while Repartition is used to increase or decrease the number of partitions in a DataFrame.

Coalesce reduces the number of partitions in a DataFrame by combining small partitions into larger ones.
Repartition increases or decreases the number of partitions in a DataFrame by shuffling the data across partitions.
Coalesce is more efficient than Repartit...

Answered by AI

Interview Preparation Tips

Topics to prepare for Accenture Data Engineer interview:

Spark
Databricks
SQL
Python
ETL

Interview preparation tips for other job seekers - Focus on basics and definitions, and understand Spark internals. Write SQL code efficiently.

Data Engineer Interview Questions & Answers

Anonymous

posted on 15 Oct 2024

Interview experience

Excellent

Difficulty level

Moderate

Process Duration

2-4 weeks

Result

Selected

I applied via Company Website and was interviewed in Sep 2024. There was 1 interview round.

Round 1 - One-on-one

(5 Questions)

Q1. Union vs union all

Ans.

Union combines and removes duplicates, while union all combines all rows including duplicates.

Union removes duplicates from the result set
Union all includes all rows, even duplicates
Use union when you want to remove duplicates, use union all when duplicates are needed
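The difference shows up as soon as the two inputs share a row. A minimal sketch with Python's built-in sqlite3 module, using hypothetical tables `t1` and `t2`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (x INTEGER)")
conn.execute("CREATE TABLE t2 (x INTEGER)")
conn.executemany("INSERT INTO t1 VALUES (?)", [(1,), (2,)])
conn.executemany("INSERT INTO t2 VALUES (?)", [(2,), (3,)])

# UNION deduplicates the combined result
union_rows = conn.execute(
    "SELECT x FROM t1 UNION SELECT x FROM t2 ORDER BY x"
).fetchall()

# UNION ALL keeps every row, including the value 2 that appears in both tables
union_all_rows = conn.execute(
    "SELECT x FROM t1 UNION ALL SELECT x FROM t2 ORDER BY x"
).fetchall()
```

Because UNION has to deduplicate, UNION ALL is usually the cheaper choice when duplicates are impossible or acceptable.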

Answered by AI

Q2. Rank vs dense rank

Q3. Facts vs dimensions table

Ans.

Fact tables contain numerical measures, while dimension tables contain descriptive attributes.

Fact tables store quantitative data such as sales revenue or quantity sold, along with foreign keys to the dimension tables
Dimension tables store descriptive attributes such as product name or customer details
Fact tables are what analysis and reporting aggregate over, while dimension tables provide the context used to filter and group the facts
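A small star-schema sketch makes the split concrete: measures live in the fact table and are aggregated, while the dimension table supplies the labels to group by. The example uses Python's built-in sqlite3 module, and the `fact_sales` and `dim_product` tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_product (
        product_id   INTEGER PRIMARY KEY,
        product_name TEXT
    );
    CREATE TABLE fact_sales (
        sale_id    INTEGER PRIMARY KEY,
        product_id INTEGER REFERENCES dim_product(product_id),
        quantity   INTEGER,
        revenue    REAL
    );
    INSERT INTO dim_product VALUES (1, 'Laptop'), (2, 'Mouse');
    INSERT INTO fact_sales VALUES
        (10, 1, 2, 2000.0),
        (11, 2, 5, 100.0),
        (12, 1, 1, 1000.0);
    """
)

# Aggregate the measures from the fact table, grouped by a dimension attribute
revenue_by_product = conn.execute(
    """
    SELECT d.product_name, SUM(f.quantity), SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS d ON d.product_id = f.product_id
    GROUP BY d.product_name
    ORDER BY d.product_name
    """
).fetchall()
```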

Answered by AI

Q4. Basics of Databricks

Q5. Lambda in python

Ans.

Lambda functions in Python are anonymous functions that can have any number of arguments but only one expression.

Lambda functions are defined using the lambda keyword.
They are commonly used for small, one-time tasks.
Lambda functions can be used as arguments to higher-order functions like map, filter, and reduce.
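The points above can be illustrated with a short sketch that passes lambdas to map, filter, reduce, and sorted. The sample data is made up for the example.

```python
from functools import reduce

numbers = [1, 2, 3, 4, 5]

# map applies the lambda to every element
squares = list(map(lambda n: n * n, numbers))

# filter keeps the elements for which the lambda returns True
evens = list(filter(lambda n: n % 2 == 0, numbers))

# reduce folds the list into one value; the lambda takes two arguments
total = reduce(lambda acc, n: acc + n, numbers, 0)

# a lambda is also a common way to supply a sort key
by_length = sorted(["spark", "sql", "python"], key=lambda word: len(word))
```

For anything longer than a single expression, a named function defined with def is generally easier to read and test.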

Answered by AI

Interview Preparation Tips

Topics to prepare for Accenture Data Engineer interview:

Azure Databricks
Advanced sql
Python
Pyspark

Data Engineer Interview Questions & Answers

Anonymous

posted on 13 Dec 2024

Interview experience

Bad

Round 1 - Technical

(1 Question)

Q1. GCP Data Engineer Concepts. Please don't waste your time giving interviews at accenture.

Interview Preparation Tips

Interview preparation tips for other job seekers - I recently had an interview with Accenture, and although I am proficient in interviews, I feel this company is wasting our time. They seem to conduct interviews merely for appearances. I had two offers in hand that I did not disclose, and I am aware of my technical abilities. However, this company wasted my time by conducting interviews and then rejecting candidates. I want to highlight this to fellow job seekers: please do not waste your time.

Data Engineer Interview Questions & Answers

Anonymous

posted on 22 Nov 2024

Interview experience

Good

Difficulty level

Moderate

Process Duration

Less than 2 weeks

Result

Selected

I applied via Approached by Company and was interviewed in Oct 2024. There were 2 interview rounds.

Round 1 - Coding Test

It was a 60-minute test with 11 MCQs, 3 SQL questions, and 1 Python question.

Round 2 - Technical

(2 Questions)

Q1. End to end databricks code to read the multiple files from adls and writing it into a single file

Ans.

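Read the files from ADLS into one DataFrame with Spark, then write it out as a single file.

Access the ADLS path directly with an abfss:// URI, or through a mount point or Unity Catalog external location
Read multiple files in one call by passing a directory, a wildcard pattern, or a list of paths to Spark's read method
If the files are read separately, combine the DataFrames with union or unionByName
Call coalesce(1) before writing so that Spark produces a single output part file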

Answered by AI

Q2. Pyspark architecture

Data Engineer Interview Questions & Answers

Dipak Rout

posted on 21 Nov 2024

Interview experience

Excellent

Round 1 - Technical

(2 Questions)

Q1. Describe your project

Q2. Describe the Spark architecture

Ans.

Spark is a distributed computing framework with a driver and executor architecture and high-level APIs in several languages.

Spark architecture consists of a cluster manager, worker nodes, and a driver program.
It uses Resilient Distributed Datasets (RDDs) for fault-tolerant distributed data processing.
Spark applications run as independent sets of processes on a cluster, coordinated by the SparkContext object.
It supports various data so...

Answered by AI

Data Engineer Interview Questions & Answers

Anonymous

posted on 16 May 2025

Interview experience

Poor

Difficulty level

Moderate

Process Duration

2-4 weeks

Result

No response

I appeared for an interview in Apr 2025, where I was asked the following questions.

Q1. What are materialized views?

Q2. Time travel, Failsafe , Snowpipe, zerocopy cloning

Q3. Data sharing and dynamic data masking policies

Q4. Snowflake Architecture, editions, role based access control, streams , copy into command

Data Engineer Interview Questions & Answers

Anonymous

posted on 8 Oct 2024

Interview experience

Average

Difficulty level

Easy

Process Duration

Less than 2 weeks

Result

No response

I applied via Naukri.com and was interviewed in Sep 2024. There was 1 interview round.

Round 1 - Technical

(2 Questions)

Q1. Python- remove duplicate from set

Ans.

Use set() function to remove duplicates from a list in Python.

Convert the list to a set using set() function
Convert the set back to a list to remove duplicates
Example: list_with_duplicates = ['a', 'b', 'a', 'c']; list_without_duplicates = list(set(list_with_duplicates))

Answered by AI

Q2. Pyspark- add column with default value

Ans.

In PySpark, you can add a column with a default value using the withColumn method and lit function.

Use the withColumn method to add a new column to a DataFrame.
Utilize the lit function from pyspark.sql.functions to set a default value.
Example: df = df.withColumn('new_column', lit('default_value')).
This will add 'new_column' with 'default_value' for all rows in the DataFrame.

Answered by AI

Data Engineer Interview Questions & Answers

Anonymous

posted on 3 Jan 2025

Interview experience

Good

Result

No response

Round 1 - Technical

(1 Question)

Q1. SQL Based Questions

Round 2 - Technical

(1 Question)

Q1. Spark Question like repartitioning

Data Engineer Interview Questions & Answers

Anonymous

posted on 10 Oct 2024

Interview experience

Good

Difficulty level

Moderate

Process Duration

Less than 2 weeks

Result

Not Selected

I applied via Job Portal and was interviewed in Sep 2024. There was 1 interview round.

Round 1 - Technical

(2 Questions)

Q1. Scala, ADB, ADF, Synapse

Q2. Concepts should be analysed in a detailed way.

Accenture Interview FAQs

How many rounds are there in Accenture Data Engineer interview?

Accenture interview process usually has 1-2 rounds. The most common rounds in the Accenture interview process are Technical, HR and One-on-one Round.

How to prepare for Accenture Data Engineer interview?

Go through your CV in detail and study all the technologies mentioned in your CV. Prepare at least two technologies or languages in depth if you are appearing for a technical interview at Accenture. The most common topics and skills that interviewers at Accenture expect are SQL, Data Warehousing, Data Quality, Python and Data Modeling.

What are the top questions asked in Accenture Data Engineer interview?

Some of the top questions asked at the Accenture Data Engineer interview -

What all the optimisation are possible to reduce the overhead of reducing the r...
Write a sql query to find the name of person who logged in last within each cou...
Difference between Coalesce and Repartition and In which case we are using i...

How long is the Accenture Data Engineer interview process?

The duration of the Accenture Data Engineer interview process can vary, but it typically takes less than 2 weeks to complete.

4.1/5

based on 88 interview experiences

Difficulty level

Easy 24%

Moderate 76%

Duration

Less than 2 weeks 69%

2-4 weeks 28%

6-8 weeks 3%

Accenture Data Engineer Salary

based on 3k salaries

₹4.7 L/yr - ₹17.5 L/yr

16% less than the average Data Engineer Salary in India

Accenture Salaries in India

Application Development Analyst 39.4k salaries	₹4.8 L/yr - ₹11 L/yr
Application Development - Senior Analyst 27.7k salaries	₹8.2 L/yr - ₹16.1 L/yr
Team Lead 26.9k salaries	₹12.7 L/yr - ₹22.5 L/yr
Senior Analyst 19.9k salaries	₹9.1 L/yr - ₹15.7 L/yr
Senior Software Engineer 18.6k salaries	₹10.4 L/yr - ₹18 L/yr