Accenture

3.7

based on 65.2k Reviews

Accenture Data Engineering Analyst Interview Questions and Answers

Updated 7 Aug 2025

19 Interview questions

A Data Engineering Analyst was asked 4d ago

Q. What are parquet files?

Ans.

Parquet files are columnar storage files optimized for big data processing and analytics.

Columnar storage format, allowing efficient data compression and encoding.
Designed for use with big data processing frameworks like Apache Hadoop and Apache Spark.
Supports complex nested data structures, making it suitable for various data types.
Parquet files can significantly reduce storage costs and improve query performance...

A Data Engineering Analyst was asked 4d ago

Q. What are Delta Live Tables?

Ans.

Delta Live Tables are a framework for building reliable data pipelines in Databricks, enabling real-time data processing.

Delta Live Tables simplify ETL processes by automating data pipeline management.
They support incremental data processing, allowing for real-time updates.
Users can define data transformations using SQL or Python, making it accessible.
Example: A retail company can use Delta Live Tables to continuo...

A Data Engineering Analyst was asked 10mo ago

Q. Can you explain your academic projects?

Ans.

Developed a data analysis tool to predict customer churn using machine learning algorithms.

Used Python for data preprocessing and model building
Implemented logistic regression and random forest algorithms
Evaluated model performance using metrics like accuracy, precision, and recall

A Data Engineering Analyst was asked

Q. Suppose there is a file with 100 columns, and you only want to load 10 specific columns. How would you approach this?

Ans.

To load specific columns from a file, use data processing tools to filter the required columns efficiently.

Use libraries like Pandas in Python: `df = pd.read_csv('file.csv', usecols=['col1', 'col2', ...])`.
In SQL, you can specify columns in your SELECT statement: `SELECT col1, col2 FROM table_name;`.
For CSV files, tools like awk can be used: `awk -F, '{print $1,$2,...}' file.csv`.
In ETL processes, configure the ex...
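
As a minimal sketch of the column-selection idea above, the snippet below keeps only the requested columns while reading a CSV with Python's standard library. The sample data, the column names, and the helper `load_columns` are hypothetical and only for illustration. Note that a row-based CSV reader still parses every field of each line; with pandas `usecols` or a columnar format such as Parquet the unwanted columns can be skipped earlier.

```python
import csv
import io

# Hypothetical sample standing in for a wide file (a real one would have 100 columns).
SAMPLE = "id,name,salary,dept,city\n1,Asha,50000,ENG,Pune\n2,Ravi,42000,OPS,Delhi\n"


def load_columns(handle, wanted):
    """Return the rows as dicts that contain only the wanted columns."""
    reader = csv.DictReader(handle)
    # Fail early if a requested column is not in the header.
    missing = [col for col in wanted if col not in (reader.fieldnames or [])]
    if missing:
        raise KeyError(f"columns not in file: {missing}")
    return [{col: row[col] for col in wanted} for row in reader]


rows = load_columns(io.StringIO(SAMPLE), ["name", "salary"])
print(rows)
# [{'name': 'Asha', 'salary': '50000'}, {'name': 'Ravi', 'salary': '42000'}]
```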

What people are saying about Accenture

a cyber security analyst

Disappointed with the Candidate Experience at Accenture

I recently interviewed at Accenture for a Security Architect role. I cleared two rounds of technical interviews and was later told by HR (verbally) that I was selected and that my offer letter would be released soon. Based on that confirmation, I submitted all required documents and waited patiently for nearly a month. Despite following up multiple times, there was no proper communication or update. My application status remained “Active” on the portal the entire time. Eventually, I received a rejection email with no explanation, feedback, or context, despite being verbally told that I was selected. It’s disappointing when a candidate is given verbal assurance and kept waiting without clarity. I expected more transparency from a company of Accenture’s reputation. If anyone from Accenture here could help me understand what might have happened or possibly refer me for any similar openings in Security Architect / Data Encryption / Key Management, I’d be genuinely grateful.

A Data Engineering Analyst was asked

Q. Given a list of strings, how would you determine the frequency of each unique string value? For example, given the input ['a', 'a', 'a', 'b', 'b', 'c'], the expected output is a:3, b:2, c:1.

Ans.

Calculate the frequency of each unique string in an array and display the results.

Use a dictionary to count occurrences: {'a': 3, 'b': 2, 'c': 1}.
Iterate through the list and update the count for each string.
Example: For input ['a', 'a', 'b'], output should be 'a,2' and 'b,1'.
Utilize collections.Counter for a more concise solution.
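
Both approaches listed above can be sketched in a few lines of Python. The helper name `frequencies` is hypothetical; `collections.Counter` is the standard-library class mentioned in the answer.

```python
from collections import Counter


def frequencies(values):
    """Count occurrences with a plain dictionary."""
    counts = {}
    for value in values:
        counts[value] = counts.get(value, 0) + 1
    return counts


data = ['a', 'a', 'a', 'b', 'b', 'c']

manual = frequencies(data)   # {'a': 3, 'b': 2, 'c': 1}
concise = Counter(data)      # same counts, less code

# Print in the "value,count" form from the question, most frequent first.
for value, count in concise.most_common():
    print(f"{value},{count}")
# a,3
# b,2
# c,1
```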

A Data Engineering Analyst was asked

Q. What are case classes in Python?

Ans.

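Python has no built-in case classes; the term comes from Scala. The closest Python equivalents are frozen dataclasses and typing.NamedTuple, which create immutable objects for data modeling and structural pattern matching (Python 3.10+).

Case classes are typically used in functional programming to represent data structures.
They are immutable, meaning their values cannot be changed once they are created.
Scala case classes automatically define equality, hash code, and toString methods based on the class constructor arguments; a frozen Python dataclass likewise generates __eq__, __hash__, and __repr__ from its fields.
They are commonl...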
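
Python itself has no construct called a case class (the term is Scala's), so the sketch below uses a frozen dataclass from the standard library as the nearest analogue. The `Employee` class and its fields are hypothetical and only illustrate the immutability and generated equality described above.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Employee:
    """Immutable record; __init__, __eq__, __hash__ and __repr__ are generated."""
    name: str
    department: str
    salary: int


e1 = Employee("Asha", "ENG", 50000)
e2 = Employee("Asha", "ENG", 50000)

print(e1 == e2)              # True: equality compares field values
print(hash(e1) == hash(e2))  # True: equal instances hash alike
print(e1)                    # Employee(name='Asha', department='ENG', salary=50000)

try:
    e1.salary = 60000        # assignment is rejected on a frozen dataclass
except FrozenInstanceError:
    print("immutable")
```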

A Data Engineering Analyst was asked

Q. Given an Employee table with columns Employee name, Salary, and Department, write a PySpark query to find the name of the employee with the second highest salary in each department.

Ans.

Find the 2nd highest salary employee in each department using PySpark.

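Read the CSV file into a DataFrame using spark.read.csv() with header=True and inferSchema=True so that Salary is read as a number.
Define a window partitioned by 'Department' and ordered by 'Salary' descending, then apply the 'dense_rank()' function over it to rank salaries within each department.
Filter the DataFrame to get employees with a rank of 2.
Select the 'Employee name' and 'Department' columns for the final output.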
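
In PySpark this is usually written with `dense_rank()` over a window partitioned by department and ordered by salary descending. So that it runs without a Spark cluster, the sketch below mirrors that ranking logic in plain Python rather than calling the PySpark API; the sample rows and the function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: (employee name, salary, department)
EMPLOYEES = [
    ("Asha", 90000, "ENG"),
    ("Ravi", 80000, "ENG"),
    ("Meera", 80000, "ENG"),
    ("Kiran", 70000, "ENG"),
    ("Divya", 60000, "OPS"),
    ("Arjun", 55000, "OPS"),
    ("Neha", 50000, "HR"),
]


def second_highest_per_department(rows):
    """Return {department: [names]} for the employees at dense rank 2."""
    by_dept = defaultdict(list)
    for name, salary, dept in rows:          # "partition by Department"
        by_dept[dept].append((name, salary))

    result = {}
    for dept, members in by_dept.items():
        # Distinct salaries in descending order: index 1 is dense rank 2.
        distinct = sorted({salary for _, salary in members}, reverse=True)
        if len(distinct) < 2:
            continue                         # no second-highest salary here
        target = distinct[1]
        result[dept] = sorted(n for n, s in members if s == target)
    return result


print(second_highest_per_department(EMPLOYEES))
# {'ENG': ['Meera', 'Ravi'], 'OPS': ['Arjun']}
```

Dense ranking keeps ties together, so both employees earning 80000 are returned for ENG, and a department with a single distinct salary yields no row.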

A Data Engineering Analyst was asked

Q. Suppose you are adding a block that takes a significant amount of time. How would you start debugging it?

Ans.

To debug a slow block, start by identifying potential bottlenecks, analyzing logs, checking for errors, and profiling the code.

Identify potential bottlenecks in the code or system that could be causing the slow performance.
Analyze logs and error messages to pinpoint any issues or exceptions that may be occurring.
Use profiling tools to analyze the performance of the code and identify areas that need optimization.
Ch...
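
The profiling step mentioned above can be sketched with Python's built-in `cProfile` and `pstats` modules. The functions `slow_step`, `fast_step`, and `block` are hypothetical stand-ins for the code under investigation; in a Spark job the equivalent first look would normally be the stage and task timings in the Spark UI.

```python
import cProfile
import io
import pstats
import time


def slow_step():
    time.sleep(0.05)          # stands in for the expensive part of the block
    return sum(range(1000))


def fast_step():
    return sum(range(1000))


def block():
    return slow_step() + fast_step()


profiler = cProfile.Profile()
profiler.enable()
block()
profiler.disable()

# Sort by cumulative time so the slowest call paths appear first.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

Reading the report from the top usually shows which call dominates the runtime, which narrows down where to look in logs and code next.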

A Data Engineering Analyst was asked

Q. You have 200 Petabytes of data to load. How will you decide the number of executors required, considering the data is out of cache?

Ans.

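The number of executors required to load 200 Petabytes of data depends on how many partitions the data splits into, how many cores and how much memory each executor has, and how many tasks the cluster can run in parallel.

Estimate the number of partitions from the data size and the target partition size.
Size each executor (cores and memory) based on the resources available on the worker nodes.
Because the data is not in cache, every partition must be read from storage, so plan for the tasks to run in several waves rather than all at once.
Determine the number of executors from the number of tasks that must run concurrently divided by the cores per executor.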
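
A back-of-the-envelope version of this sizing can be written as simple arithmetic. Everything numeric below is an assumption for illustration, not a recommendation: 128 MB partitions (Spark's default `spark.sql.files.maxPartitionBytes`), 5 cores per executor (a common rule of thumb), and a hypothetical budget of 10,000 task waves. The function name is also hypothetical.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return (a + b - 1) // b


def estimate_executors(data_bytes, partition_bytes, cores_per_executor, waves):
    """Return (partitions, executors) for a simple wave-based estimate."""
    partitions = ceil_div(data_bytes, partition_bytes)   # one task per partition
    concurrent_tasks = ceil_div(partitions, waves)       # tasks running at once
    executors = ceil_div(concurrent_tasks, cores_per_executor)
    return partitions, executors


PB = 1024 ** 5
MB = 1024 ** 2

# Assumed inputs: 200 PB of data, 128 MB partitions, 5 cores per executor,
# and tasks spread over 10,000 waves.
partitions, executors = estimate_executors(200 * PB, 128 * MB, 5, 10_000)
print(partitions)  # 1677721600
print(executors)   # 33555
```

Changing the wave budget or the partition size shifts the result by orders of magnitude, which is why the answer depends on cluster resources and the time allowed rather than on data volume alone.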

A Data Engineering Analyst was asked

Q. Define RDD Lineage and its process.

Ans.

RDD Lineage is the record of transformations applied to an RDD and the dependencies between RDDs.

RDD Lineage tracks the sequence of transformations applied to an RDD from its source data.
It helps in fault tolerance by allowing RDDs to be reconstructed in case of data loss.
Spark's scheduler uses the lineage graph (a DAG) to plan execution and, after a failure, to recompute only the lost partitions instead of the whole dataset.
Example: If an RDD is created from a text fi...

Accenture Data Engineering Analyst Interview Experiences

14 interviews found

Data Engineering Analyst Interview Questions & Answers

Anonymous

posted on 10 Nov 2024

Interview experience

Average

Difficulty level

Process Duration

Result

Round 1 - Coding Test

SQL, Python, Azure Databricks, Azure Data Factory

Data Engineering Analyst Interview Questions & Answers

Anonymous

posted on 26 Sep 2023

Interview experience

Good

Difficulty level

Moderate

Process Duration

Less than 2 weeks

Result

Selected

I applied via Referral and was interviewed in Aug 2023. There were 2 interview rounds.

Round 1 - Resume Shortlist

Pro Tip by AmbitionBox:

Keep your resume crisp and to the point. A recruiter looks at your resume for an average of 6 seconds; make sure to leave the best impression.

Round 2 - Technical

(15 Questions)

Q1. Introduce yourself and explain your project and your role.

Q2. Explain Airflow and its internal architecture.

Q3. What is RDD in Spark?

Ans.

RDD stands for Resilient Distributed Dataset in Spark, which is an immutable distributed collection of objects.

RDD is the fundamental data structure in Spark, representing a collection of elements that can be operated on in parallel.
RDDs are fault-tolerant, meaning they can automatically recover from failures.
RDDs support two types of operations: transformations (creating a new RDD from an existing one) and actions (tr...

Answered by AI

Q4. Define RDD Lineage and its process.

Ans.

RDD Lineage is the record of transformations applied to an RDD and the dependencies between RDDs.

RDD Lineage tracks the sequence of transformations applied to an RDD from its source data.
It helps in fault tolerance by allowing RDDs to be reconstructed in case of data loss.
Spark's scheduler uses the lineage graph (a DAG) to plan execution and, after a failure, to recompute only the lost partitions instead of the whole dataset.
Example: If an RDD is created from a text file an...

Answered by AI

Q5. What do you mean by broadcast variables?

Ans.

Broadcast Variables are read-only shared variables that are cached on each machine in a Spark cluster rather than being sent with tasks.

Broadcast Variables are used to efficiently distribute large read-only datasets to all worker nodes in a Spark cluster.
They are useful for tasks that require the same data to be shared across multiple stages of a job.
Broadcast Variables are created using the broadcast() method in Spark...

Answered by AI

Q6. What is broadcasting? Are you using it, and what are its limitations?

Ans.

Broadcasting is a technique used in Apache Spark to optimize data transfer by sending smaller data to all nodes in a cluster.

Broadcasting is used to efficiently distribute read-only data to all nodes in a cluster to avoid unnecessary data shuffling.
It is commonly used when joining large datasets with smaller lookup tables.
Broadcast variables are cached in memory and reused across multiple stages of a Spark job.
The limi...

Answered by AI

Q7. Are you using accumulators? Explain the Catalyst optimizer.

Ans.

Accumulators are used for aggregating values across tasks, while Catalyst optimizer is a query optimizer for Apache Spark.

Accumulators are variables that are only added to through an associative and commutative operation and can be used to implement counters or sums.
Catalyst is Spark SQL's query optimizer; it applies rule-based and cost-based optimizations and leverages advanced programming language features to build an extensible query optimizer.
Catalyst op...

Answered by AI

Q8. Suppose you are adding a block that takes a significant amount of time. How would you start debugging it?

Ans.

To debug a slow block, start by identifying potential bottlenecks, analyzing logs, checking for errors, and profiling the code.

Identify potential bottlenecks in the code or system that could be causing the slow performance.
Analyze logs and error messages to pinpoint any issues or exceptions that may be occurring.
Use profiling tools to analyze the performance of the code and identify areas that need optimization.
Check f...

Answered by AI

Q9. You have 200 Petabytes of data to load. How will you decide the number of executors required, considering the data is out of cache?

Ans.

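The number of executors required to load 200 Petabytes of data depends on how many partitions the data splits into, how many cores and how much memory each executor has, and how many tasks the cluster can run in parallel.

Estimate the number of partitions from the data size and the target partition size.
Size each executor (cores and memory) based on the resources available on the worker nodes.
Because the data is not in cache, every partition must be read from storage, so plan for the tasks to run in several waves rather than all at once.
Determine the number of executors from the number of tasks that must run concurrently divided by the cores per executor.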

Answered by AI

Q10. What is prepartition?

Q11. SQL query: given a table named Employee with columns Employee name, Salary, and Department, first read this CSV file and then write the query in PySpark to find the name of the employee whose salary is 2nd highest in eac...

Ans.

Find the 2nd highest salary employee in each department using PySpark.

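Read the CSV file into a DataFrame using spark.read.csv() with header=True and inferSchema=True so that Salary is read as a number.
Define a window partitioned by 'Department' and ordered by 'Salary' descending, then apply the 'dense_rank()' function over it to rank salaries within each department.
Filter the DataFrame to get employees with a rank of 2.
Select the 'Employee name' and 'Department' columns for the final output.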

Answered by AI

Q12. Suppose you have a list of string values and need to find the frequency of each value. For example, for the input ['a', 'a', 'a', 'b', 'b', 'c'] the output is a,3 b,2 c,1.

Ans.

Calculate the frequency of each unique string in an array and display the results.

Use a dictionary to count occurrences: {'a': 3, 'b': 2, 'c': 1}.
Iterate through the list and update the count for each string.
Example: For input ['a', 'a', 'b'], output should be 'a,2' and 'b,1'.
Utilize collections.Counter for a more concise solution.

Answered by AI

Q13. What are case classes in Python?

Ans.

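Python has no built-in case classes; the term comes from Scala. The closest Python equivalents are frozen dataclasses and typing.NamedTuple, which create immutable objects for data modeling and structural pattern matching (Python 3.10+).

Case classes are typically used in functional programming to represent data structures.
They are immutable, meaning their values cannot be changed once they are created.
Scala case classes automatically define equality, hash code, and toString methods based on the class constructor arguments; a frozen Python dataclass likewise generates __eq__, __hash__, and __repr__ from its fields.
They are commonly use...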

Answered by AI

Q14. Suppose there are 100 columns in a file and you want to load only 10 of them. How would you approach this?

Ans.

To load specific columns from a file, use data processing tools to filter the required columns efficiently.

Use libraries like Pandas in Python: `df = pd.read_csv('file.csv', usecols=['col1', 'col2', ...])`.
In SQL, you can specify columns in your SELECT statement: `SELECT col1, col2 FROM table_name;`.
For CSV files, tools like awk can be used: `awk -F, '{print $1,$2,...}' file.csv`.
In ETL processes, configure the extract...

Answered by AI

Q15. What are Lambda Architecture and a lambda function?

Ans.

Lambda Architecture is a data processing architecture designed to handle massive quantities of data by taking advantage of both batch and stream processing methods. Lambda function is a small anonymous function that can take any number of arguments, but can only have one expression.

Lambda Architecture combines batch processing and stream processing to handle large amounts of data efficiently.
Batch layer stores and proc...
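
For the lambda function half of the answer, a short Python illustration is given below. The sample data and variable names are hypothetical; the point is only that a lambda is an anonymous function limited to a single expression and is typically passed inline to another function.

```python
# Hypothetical (name, salary) pairs.
employees = [("Asha", 90000), ("Ravi", 80000), ("Divya", 60000)]

# A lambda as a sort key: order by the salary field.
by_salary = sorted(employees, key=lambda pair: pair[1])
print(by_salary)
# [('Divya', 60000), ('Ravi', 80000), ('Asha', 90000)]

# A lambda may take several arguments but holds only one expression.
add = lambda x, y: x + y
print(add(2, 3))  # 5

# A lambda passed to map: apply a 10% raise using integer arithmetic.
raised = list(map(lambda pair: (pair[0], pair[1] * 110 // 100), employees))
print(raised)
# [('Asha', 99000), ('Ravi', 88000), ('Divya', 66000)]
```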

Answered by AI

Interview Preparation Tips

Interview preparation tips for other job seekers - Prepare more around PySpark and SQL

Data Engineering Analyst Interview Questions & Answers

Anonymous

posted on 29 Dec 2024

Interview experience

Bad

Difficulty level

Process Duration

Result

Round 1 - Coding Test

Coding in Python using tools such as scikit-learn, plus dashboarding with Tableau. Additionally, I am skilled in ML.

Interview Preparation Tips

Interview preparation tips for other job seekers - Practice your skills and do better for a better tomorrow

Data Engineering Analyst Interview Questions & Answers

Rajiv Kumar

posted on 7 Aug 2025

Interview experience

Excellent

Difficulty level

Moderate

Process Duration

Less than 2 weeks

Result

Selected

I appeared for an interview before Aug 2024, where I was asked the following questions.

Q1. What are Parquet files?

Ans.

Parquet files are columnar storage files optimized for big data processing and analytics.

Columnar storage format, allowing efficient data compression and encoding.
Designed for use with big data processing frameworks like Apache Hadoop and Apache Spark.
Supports complex nested data structures, making it suitable for various data types.
Parquet files can significantly reduce storage costs and improve query performance.
Exam...

Answered by AI

Q2. What are Delta Live Tables?

Ans.

Delta Live Tables are a framework for building reliable data pipelines in Databricks, enabling real-time data processing.

Delta Live Tables simplify ETL processes by automating data pipeline management.
They support incremental data processing, allowing for real-time updates.
Users can define data transformations using SQL or Python, making it accessible.
Example: A retail company can use Delta Live Tables to continuously ...

Answered by AI

Data Engineering Analyst Interview Questions & Answers

Anonymous

posted on 8 Feb 2024

Interview experience

Average

Difficulty level

Process Duration

More than 8 weeks

Result

Selected

I applied via Naukri.com and was interviewed before Feb 2023. There were 2 interview rounds.

Round 1 - Technical

(1 Question)

Q1. There was only one technical round; the questions were on SQL (a number of join questions) and tool-related topics.

Round 2 - HR

(1 Question)

Q1. CTC discussion with HR; the offer letter was released after submitting all documents.

Data Engineering Analyst Interview Questions & Answers

Anonymous

posted on 24 Sep 2024

Interview experience

Average

Difficulty level

Moderate

Process Duration

4-6 weeks

Result

Selected

I applied via Company Website and was interviewed before Sep 2023. There were 2 interview rounds.

Round 1 - Aptitude Test

Reasoning, logical, grammatical

Round 2 - Technical

(2 Questions)

Q1. Self introduction

Q2. Academic Project explanation

Ans.

Developed a data analysis tool to predict customer churn using machine learning algorithms.

Used Python for data preprocessing and model building
Implemented logistic regression and random forest algorithms
Evaluated model performance using metrics like accuracy, precision, and recall

Answered by AI

Data Engineering Analyst Interview Questions & Answers

Anonymous

posted on 24 Dec 2022

Interview experience

Average

Difficulty level

Process Duration

Result

Round 1 - Resume Shortlist

Round 2 - Coding Test

Two questions on the basics of data structures and algorithms, at easy and medium levels.

Round 3 - Technical

(2 Questions)

Q1. Basic concepts of OOP; data types in Python and C

Q2. Explain your personal project briefly

Interview Preparation Tips

Interview preparation tips for other job seekers - DA, algo basics, along with the personal project, are enough

Data Engineering Analyst Interview Questions & Answers

Anonymous

posted on 4 Apr 2024

Interview experience

Excellent

Difficulty level

Moderate

Process Duration

Less than 2 weeks

Result

Selected

I applied via Campus Placement and was interviewed before Apr 2023. There were 2 interview rounds.

Round 1 - Aptitude Test

Aptitude questions, verbal test and pseudocode.

Round 2 - HR

(5 Questions)

Q1. About myself and my life

Q2. About college project and academics

Q3. About leadership skills in college

Q4. About my skill sets

Q5. About my career aspirations

Interview Preparation Tips

Interview preparation tips for other job seekers - It's easy to crack with moderate preparation

Data Engineering Analyst Interview Questions & Answers

Anonymous

posted on 3 May 2024

Interview experience

Excellent

Difficulty level

Easy

Process Duration

Less than 2 weeks

Result

Selected

I appeared for an interview before May 2023.

Round 1 - Assignment

Basic aptitude questions and a couple of coding problems

Round 2 - Technical

(1 Question)

Q1. Details about project in college

Data Engineering Analyst Interview Questions & Answers

Sourabh Mallav

posted on 28 Mar 2024

Interview experience

Excellent

Difficulty level

Moderate

Process Duration

Less than 2 weeks

Result

Selected

I applied via LinkedIn and was interviewed before Mar 2023. There were 3 interview rounds.

Round 1 - Aptitude Test

That was great and easy

Round 2 - Coding Test

Two coding problems were given.
The difficulty level was medium.

Round 3 - Technical

(1 Question)

Q1. Asked about the project

Accenture Interview FAQs

How many rounds are there in Accenture Data Engineering Analyst interview?

The Accenture interview process usually has 2-3 rounds. The most common rounds in the Accenture interview process are Technical, Aptitude Test and Resume Shortlist.

What are the top questions asked in Accenture Data Engineering Analyst interview?

Some of the top questions asked at the Accenture Data Engineering Analyst interview:

Given an Employee table with columns Employee name, Salary, and Department, write a PySpark query to find the name of the employee with the second highest salary in each department.
You have 200 Petabytes of data to load. How will you decide the number of executors required, considering the data is out of cache?
Suppose there is a file with 100 columns, and you only want to load 10 specific columns. How would you approach this?

How long is the Accenture Data Engineering Analyst interview process?

The duration of the Accenture Data Engineering Analyst interview process can vary, but it typically takes less than 2 weeks to complete.

3.9/5

based on 14 interview experiences

Difficulty level

Easy 25%

Moderate 75%

Duration

Less than 2 weeks 67%

4-6 weeks 22%

More than 8 weeks 11%

Accenture Data Engineering Analyst Salary

based on 2.9k salaries

₹5 L/yr - ₹10 L/yr

10% less than the average Data Engineering Analyst Salary in India

Accenture Salaries in India

Application Development Analyst 39.3k salaries	₹4.8 L/yr - ₹11 L/yr
Application Development - Senior Analyst 27.7k salaries	₹8.1 L/yr - ₹16.1 L/yr
Team Lead 27.2k salaries	₹12.7 L/yr - ₹22.7 L/yr
Senior Analyst 20.3k salaries	₹9.1 L/yr - ₹15.7 L/yr
Associate Manager 18.6k salaries	₹20.6 L/yr - ₹36 L/yr