Infosys
ParDo in Dataflow refers to a parallel processing transform that optimizes performance and resource utilization.
ParDo stands for 'Parallel Do', enabling distributed processing of data across multiple nodes.
It allows for efficient handling of large datasets by breaking them into smaller bundles that are processed independently.
For example, in Apache Beam, ParDo can be used to apply a function to each element in a collection in parallel.
This model ...
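A minimal sketch of what this looks like in the Apache Beam Python SDK; the DoFn, element values, and pipeline here are illustrative, not part of the original answer:

import apache_beam as beam

class SplitWords(beam.DoFn):
    # The DoFn holds the per-element logic that ParDo runs in parallel.
    def process(self, element):
        for word in element.split():
            yield word  # a DoFn may emit zero, one, or many outputs per input

with beam.Pipeline() as pipeline:
    (pipeline
     | beam.Create(["hello world", "parallel do"])  # small in-memory PCollection
     | beam.ParDo(SplitWords())                     # apply the DoFn to each element
     | beam.Map(print))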
Calculate trailing zeros in a factorial using Python by counting factors of 5 in the numbers leading to n.
Trailing zeros in n! are produced by factors of 10, which are made from pairs of 2 and 5.
Since there are usually more factors of 2 than 5, we only need to count the factors of 5.
The formula to calculate trailing zeros is: n // 5 + n // 25 + n // 125 + ... until n // 5^k is 0.
Example: For 100!, trailing zeros = 100 // 5 + 100 // 25 = 20 + 4 = 24.
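A short Python implementation of the counting approach described above:

def trailing_zeros(n: int) -> int:
    # Sum n // 5 + n // 25 + n // 125 + ... until the power of 5 exceeds n.
    count = 0
    power = 5
    while power <= n:
        count += n // power
        power *= 5
    return count

print(trailing_zeros(100))  # 24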
Efficiently joining large and small datasets requires strategic approaches to optimize performance and resource usage.
Use a distributed computing framework like Apache Spark to handle large datasets efficiently.
Consider filtering the larger dataset before the join to reduce its size, e.g., using a WHERE clause.
Leverage indexing on the join keys to speed up the join operation.
Use a broadcast join if the smaller dataset fits in memory, so it can be shipped to every worker.
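A brief PySpark sketch of the broadcast-join idea; the paths, table names, and join key are assumptions for illustration:

from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("broadcast-join-demo").getOrCreate()

orders = spark.read.parquet("/data/orders")                     # large dataset (illustrative path)
countries = spark.read.csv("/data/countries.csv", header=True)  # small lookup table

# broadcast() ships the small table to every executor, so the large
# table can be joined locally without shuffling it across the cluster.
joined = orders.join(broadcast(countries), on="country_code")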
Incremental data loading is a process of updating a database with only new or changed data since the last load.
Reduces data transfer time by only loading new or modified records.
Commonly used in ETL (Extract, Transform, Load) processes.
Example: Loading only new customer records added since the last update.
Helps in maintaining data consistency and reducing redundancy.
Can be implemented using timestamps or change data capture (CDC).
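A minimal timestamp-based sketch in Python; the table, column names, and watermark value are illustrative assumptions:

import sqlite3

conn = sqlite3.connect("source.db")

# In practice the watermark would be read from, and written back to,
# durable storage after each successful load.
last_load_time = "2024-01-01 00:00:00"

# Extract only rows created or modified since the previous load
# (assumes updated_at is stored as an ISO-formatted string).
new_rows = conn.execute(
    "SELECT id, name, updated_at FROM customers WHERE updated_at > ?",
    (last_load_time,),
).fetchall()

# new_rows would then be loaded into the target, and the watermark
# advanced to the max(updated_at) seen in this batch.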
A palindrome is a word, phrase, number, or other sequence of characters that reads the same forward and backward.
Check if the string is equal to its reverse to determine if it's a palindrome.
Ignore spaces and punctuation when checking for palindromes.
Convert the string to lowercase before checking for palindromes.
Examples: 'racecar', 'A man, a plan, a canal, Panama'
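A small Python helper that applies all three rules:

def is_palindrome(text: str) -> bool:
    # Drop spaces and punctuation, lowercase, then compare with the reverse.
    cleaned = "".join(ch.lower() for ch in text if ch.isalnum())
    return cleaned == cleaned[::-1]

print(is_palindrome("racecar"))                         # True
print(is_palindrome("A man, a plan, a canal, Panama"))  # True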
Developed a data pipeline to ingest, process, and analyze customer feedback data for product improvement.
Designed and implemented ETL processes to extract data from various sources
Utilized Apache Spark for data processing and analysis
Built data visualizations to present insights to stakeholders
I appeared for an interview in May 2025, where I was asked the following questions.
Developed a data pipeline for processing and analyzing large datasets from various sources to support business intelligence.
Designed ETL processes to extract data from APIs and databases, ensuring data integrity.
Utilized Apache Spark for distributed data processing, improving performance by 30%.
Implemented data warehousing solutions using Amazon Redshift for efficient querying.
Created dashboards in Tableau for visualization.
SQL query to compare today's sales with yesterday's sales using aggregation and date functions.
Use a table with sales data that includes a date column.
Aggregate sales by date using SUM() function.
Use a Common Table Expression (CTE) or subquery to get sales for today and yesterday.
Calculate the difference between today's and yesterday's sales.
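One possible shape of the query, sketched here with Python's sqlite3; the sales table, its sale_date and amount columns, and ISO-formatted dates are assumptions:

import sqlite3

conn = sqlite3.connect("sales.db")

# The CTE aggregates sales per day; the outer query joins today's total
# to yesterday's and computes the difference.
query = """
WITH daily AS (
    SELECT sale_date, SUM(amount) AS total
    FROM sales
    GROUP BY sale_date
)
SELECT t.total           AS today_sales,
       y.total           AS yesterday_sales,
       t.total - y.total AS difference
FROM daily AS t
JOIN daily AS y ON y.sale_date = DATE(t.sale_date, '-1 day')
WHERE t.sale_date = DATE('now')
"""
print(conn.execute(query).fetchone())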
I applied via Naukri.com
I appeared for an interview in May 2025, where I was asked the following questions.
Identify the city with the highest revenue by analyzing data from various regions.
Aggregate revenue data from all regions within the city.
Use SQL queries like 'SELECT city, SUM(revenue) FROM sales GROUP BY city ORDER BY SUM(revenue) DESC LIMIT 1;'
Consider factors like population, economic activity, and industry presence in each region.
Example: If Region A has $1M and Region B has $2M, the total for the city is $3M.
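The query from above, checked end-to-end on toy data with sqlite3 (the cities and figures are illustrative):

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (city TEXT, region TEXT, revenue INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("Pune", "Region A", 1_000_000),
        ("Pune", "Region B", 2_000_000),
        ("Delhi", "Region C", 2_500_000),
    ],
)
top = conn.execute(
    "SELECT city, SUM(revenue) FROM sales "
    "GROUP BY city ORDER BY SUM(revenue) DESC LIMIT 1"
).fetchone()
print(top)  # ('Pune', 3000000) - regions A and B sum to 3M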
I applied via Naukri.com and was interviewed in Oct 2024. There was 1 interview round.
A DSA (data structures and algorithms) question was asked.
I appeared for an interview in Feb 2025, where I was asked the following questions.
DROP removes a table permanently; TRUNCATE deletes all rows but retains the table structure.
DROP TABLE table_name; - Completely removes the table and its data.
TRUNCATE TABLE table_name; - Deletes all rows but keeps the table structure.
Rollback behavior is DBMS-specific: MySQL and Oracle implicitly commit both DROP and TRUNCATE, so neither can be rolled back, while PostgreSQL and SQL Server allow both to be rolled back inside a transaction.
TRUNCATE is usually faster than DELETE because it doesn't log individual row deletions.
I appeared for an interview in Dec 2024, where I was asked the following questions.
Spark optimization techniques enhance performance and resource utilization in distributed data processing tasks.
Use DataFrames and Datasets for optimized execution plans.
Leverage Catalyst Optimizer for query optimization.
Apply Tungsten for memory management and code generation.
Utilize partitioning to minimize data shuffling, e.g., using 'repartition' or 'coalesce'.
Cache intermediate results with 'persist()' to avoid recomputation.
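A brief PySpark sketch touching a few of these techniques; the input path and column names are illustrative assumptions:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("spark-opt-demo").getOrCreate()

# DataFrame API: Catalyst optimizes the plan, Tungsten handles memory/codegen.
events = spark.read.parquet("/data/events")

daily = (
    events.groupBy("event_date")
          .agg(F.count("*").alias("n"))
          .coalesce(8)  # shrink partition count without a full shuffle
)

daily.persist()  # cache the reused intermediate result
daily.count()    # first action materializes the cache
daily.show()     # subsequent actions reuse it instead of recomputing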
ETL is a data integration process that involves extracting data, transforming it for analysis, and loading it into a target system.
Extract: Gather data from various sources like databases, APIs, or flat files. Example: Pulling customer data from a CRM system.
Transform: Clean and format the data to meet business requirements. Example: Converting date formats or aggregating sales data.
Load: Insert the transformed data into the target system, such as a data warehouse.
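A toy end-to-end ETL sketch in Python; the file name, columns, and date format are assumptions for illustration:

import csv
import sqlite3
from datetime import datetime

# Extract: read raw customer rows from a flat file
# (assumed to have 'name' and 'signup_date' columns).
with open("customers.csv", newline="") as f:
    rows = list(csv.DictReader(f))

# Transform: normalize dates from DD/MM/YYYY to ISO 8601.
for row in rows:
    row["signup_date"] = datetime.strptime(
        row["signup_date"], "%d/%m/%Y"
    ).date().isoformat()

# Load: insert the cleaned rows into the target table.
conn = sqlite3.connect("warehouse.db")
conn.execute("CREATE TABLE IF NOT EXISTS customers (name TEXT, signup_date TEXT)")
conn.executemany(
    "INSERT INTO customers (name, signup_date) VALUES (:name, :signup_date)",
    rows,
)
conn.commit()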
I applied via Company Website and was interviewed in Mar 2024. There were 3 interview rounds.
It went fine and was interactive.
The duration of the Infosys Data Engineer interview process can vary, but it typically takes about 2-4 weeks to complete.
Role                        | Salaries reported | Salary range
Technology Analyst          | 55.1k             | ₹4.8 L/yr - ₹10 L/yr
Senior Systems Engineer     | 54.4k             | ₹2.5 L/yr - ₹6.3 L/yr
Technical Lead              | 35.4k             | ₹9.5 L/yr - ₹16.5 L/yr
System Engineer             | 32.6k             | ₹2.4 L/yr - ₹6 L/yr
Senior Associate Consultant | 32.4k             | ₹8.3 L/yr - ₹15 L/yr
TCS
Wipro
Cognizant
Accenture