Infosys Data Engineer Interview Questions and Answers

Updated 22 Jul 2025

19 Interview questions

A Data Engineer was asked 2w ago
Q. What is the concept of ParDo in Dataflow?
Ans. 

ParDo is the core element-wise transform in Apache Beam, the programming model that Google Cloud Dataflow executes. It applies a user-defined function (a DoFn) to every element of an input collection and can emit zero, one, or many outputs per element.

  • ParDo is short for 'Parallel Do': the runner distributes elements across multiple workers and processes them independently.

  • It handles large datasets efficiently because the input is split into smaller chunks that workers process in parallel.

  • For example, in Apache Beam, ParDo can be used to filter, reformat, or extract fields from each element in a collection in parallel.

  • This model ...
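
The element-wise behaviour described above can be imitated in plain Python. This is only a sketch of the semantics, not the Apache Beam API: `par_do` and `split_words` are hypothetical names, a thread pool stands in for distributed workers, and the output order is preserved here purely for readability (a real runner makes no such promise).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def par_do(elements: Iterable[T], do_fn: Callable[[T], Iterable[U]], workers: int = 4) -> List[U]:
    """Apply do_fn to every element independently and flatten the outputs.

    Hypothetical helper that imitates ParDo semantics; it is not Beam code.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each element is processed on its own; do_fn may emit 0..n outputs.
        per_element_outputs = pool.map(lambda element: list(do_fn(element)), elements)
        return [out for outputs in per_element_outputs for out in outputs]


def split_words(line: str) -> Iterable[str]:
    """DoFn-style function: one line in, zero or more words out."""
    return line.split()


words = par_do(["hello world", "", "apache beam"], split_words)
print(words)
```

The empty input line produces no output at all, which is the "zero or more outputs per element" property that separates ParDo from a plain one-to-one map.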

A Data Engineer was asked 2w ago
Q. What is the Python program to calculate the number of trailing zeros in a factorial?
Ans. 

Calculate trailing zeros in a factorial using Python by counting the factors of 5 in the numbers from 1 to n.

  • Trailing zeros in n! are produced by factors of 10, which are made from pairs of 2 and 5.

  • n! always contains at least as many factors of 2 as factors of 5, so we only need to count the factors of 5.

  • The formula to calculate trailing zeros is: n // 5 + n // 25 + n // 125 + ... until n // 5^k is 0.

  • Example: For 100!, trailing zeros =...
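
The formula above can be written as a short loop. This is a minimal sketch; the function name `trailing_zeros` is an illustrative choice.

```python
def trailing_zeros(n: int) -> int:
    """Count trailing zeros of n! by summing n // 5 + n // 25 + n // 125 + ..."""
    if n < 0:
        raise ValueError("factorial is undefined for negative numbers")
    count = 0
    power = 5
    while power <= n:
        # Each multiple of `power` contributes one more factor of 5.
        count += n // power
        power *= 5
    return count


print(trailing_zeros(100))  # 100 // 5 + 100 // 25 = 20 + 4 = 24
```

The loop runs about log base 5 of n times, so it never has to compute the factorial itself.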

A Data Engineer was asked 1mo ago
Q. How would you find the city with the highest revenue from multiple regions?
Ans. 

Identify the city with the highest revenue by analyzing data from various regions.

  • Sum the revenue across all regions within each city.

  • Use a SQL query such as 'SELECT city, SUM(revenue) AS total_revenue FROM sales GROUP BY city ORDER BY total_revenue DESC LIMIT 1;' (LIMIT is MySQL/PostgreSQL/SQLite syntax; SQL Server uses TOP 1 instead).

  • Consider factors like population, economic activity, and industry presence in each region.

  • Example: If Region A has $1M and Region B has $2M, the total for the city is $3M...
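
The query above can be tried end to end with Python's built-in sqlite3 module. This is a sketch under assumptions: the `sales` table layout, the city names, and the `top_city` helper are made up for illustration, and the first two rows reuse the $1M and $2M regions from the example.

```python
import sqlite3
from typing import Iterable, Optional, Tuple


def top_city(rows: Iterable[Tuple[str, str, int]]) -> Optional[Tuple[str, int]]:
    """Return (city, total_revenue) for the city with the highest summed revenue."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (city TEXT, region TEXT, revenue INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT city, SUM(revenue) AS total_revenue
        FROM sales
        GROUP BY city
        ORDER BY total_revenue DESC
        LIMIT 1
        """
    ).fetchone()
    conn.close()
    return result


sample = [
    ("Pune", "Region A", 1_000_000),
    ("Pune", "Region B", 2_000_000),
    ("Mysuru", "Region A", 2_500_000),
]
print(top_city(sample))
```

Note that LIMIT 1 returns a single row even when two cities tie for the top total, so a tie-breaking rule would need to be agreed with the interviewer.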

A Data Engineer was asked 1mo ago
Q. What is the SQL query to find the differences between the current day's sales and the previous day's sales?
Ans. 

SQL query to compare today's sales with yesterday's sales using aggregation and date functions.

  • Use a table with sales data that includes a date column.

  • Aggregate sales by date using SUM() function.

  • Use a Common Table Expression (CTE) or subquery to get sales for today and yesterday.

  • Calculate the difference between today's and yesterday's sales.
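
One way to put these steps together, sketched with Python's built-in sqlite3 module. The `sales(sale_date, amount)` schema and the sample rows are assumptions for illustration; the CTE aggregates per day and a self-join pairs each day with the day before it. On databases with window functions, LAG() over the daily totals is a common alternative.

```python
import sqlite3
from typing import Iterable, List, Tuple

DAILY_DIFF_SQL = """
WITH daily AS (
    SELECT sale_date, SUM(amount) AS total
    FROM sales
    GROUP BY sale_date
)
SELECT cur.sale_date,
       cur.total,
       prev.total AS previous_total,
       cur.total - prev.total AS difference
FROM daily AS cur
LEFT JOIN daily AS prev
       ON prev.sale_date = date(cur.sale_date, '-1 day')
ORDER BY cur.sale_date
"""


def daily_sales_diff(rows: Iterable[Tuple[str, int]]) -> List[tuple]:
    """Return (date, total, previous_total, difference) for each day with sales."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (sale_date TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    result = conn.execute(DAILY_DIFF_SQL).fetchall()
    conn.close()
    return result


sample_sales = [
    ("2025-05-01", 100),
    ("2025-05-01", 50),
    ("2025-05-02", 120),
    ("2025-05-03", 200),
]
for row in daily_sales_diff(sample_sales):
    print(row)
```

The first day has no previous day, so its previous total and difference come back as NULL (None in Python) rather than zero.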

A Data Engineer was asked 2mo ago
Q. What optimization techniques are used in Spark?
Ans. 

Spark optimization techniques enhance performance and resource utilization in distributed data processing tasks.

  • Use DataFrames and Datasets for optimized execution plans.

  • Leverage Catalyst Optimizer for query optimization.

  • Apply Tungsten for memory management and code generation.

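  • Control partitioning to limit data shuffling: 'repartition' performs a full shuffle to rebalance data or partition it by key, while 'coalesce' reduces the number of partitions without a full shuffle.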

  • Cache intermediate results with 'persist()' to avo...

What are the roles & responsibilities of a Data Engineer at Infosys?

Data Pipeline Development

  • Design and build modern data pipelines and data streams
  • Develop and maintain ETL processes
  • Move/Transform data across layers using ADF and PySpark


A Data Engineer was asked 4mo ago
Q. What is the process of performing ETL (Extract, Transform, Load), and can you provide an example?
Ans. 

ETL is a data integration process that involves extracting data, transforming it for analysis, and loading it into a target system.

  • Extract: Gather data from various sources like databases, APIs, or flat files. Example: Pulling customer data from a CRM system.

  • Transform: Clean and format the data to meet business requirements. Example: Converting date formats or aggregating sales data.

  • Load: Insert the transformed da...
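
The three stages can be sketched in a few lines using only the Python standard library. Everything here is an illustrative assumption: the inline CSV stands in for a CRM export, the `customers` table is hypothetical, and an in-memory SQLite database plays the role of the target system.

```python
import csv
import io
import sqlite3
from datetime import datetime
from typing import Dict, List, Tuple

# Made-up sample data standing in for a CRM export.
RAW_CSV = """customer_id,name,signup_date
1, Asha ,15/01/2024
2,Ravi,03/02/2024
"""


def extract(text: str) -> List[Dict[str, str]]:
    """Extract: read raw rows from the source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows: List[Dict[str, str]]) -> List[Tuple[int, str, str]]:
    """Transform: trim names and convert DD/MM/YYYY dates to ISO format."""
    cleaned = []
    for row in rows:
        iso_date = datetime.strptime(row["signup_date"].strip(), "%d/%m/%Y").date().isoformat()
        cleaned.append((int(row["customer_id"]), row["name"].strip(), iso_date))
    return cleaned


def load(conn: sqlite3.Connection, rows: List[Tuple[int, str, str]]) -> None:
    """Load: insert the transformed rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(customer_id INTEGER PRIMARY KEY, name TEXT, signup_date TEXT)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
loaded = conn.execute("SELECT * FROM customers ORDER BY customer_id").fetchall()
print(loaded)
```

Keeping each stage in its own function makes it easy to test the transformation rules without touching the source or the target.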

A Data Engineer was asked 4mo ago
Q. How do you join two datasets when one is significantly larger than the other?
Ans. 

When one dataset is much smaller than the other, the main goal is to avoid moving (shuffling) the large dataset across the cluster.

  • Use a distributed computing framework like Apache Spark to handle large datasets efficiently.

  • Consider filtering the larger dataset before the join to reduce its size, e.g., using a WHERE clause.

  • In a relational database, index the join keys to speed up the join; Spark has no indexes, so partition or bucket both datasets on the join key instead.

  • Use a broadcast join if the smaller dat...
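
The idea behind a broadcast join can be shown without a cluster. The sketch below is plain Python, not Spark code, and `broadcast_hash_join` is a hypothetical name: the small dataset is turned into an in-memory lookup table (the part that would be copied to every worker), and the large dataset is streamed past it once.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Iterator, List, Tuple


def broadcast_hash_join(
    large: Iterable[Tuple[Hashable, object]],
    small: Iterable[Tuple[Hashable, object]],
) -> Iterator[Tuple[Hashable, object, object]]:
    """Inner-join two (key, value) datasets by hashing the small one."""
    # Build phase: the small side must fit in memory.
    lookup: Dict[Hashable, List[object]] = defaultdict(list)
    for key, value in small:
        lookup[key].append(value)

    # Probe phase: stream the large side; it is never sorted or shuffled.
    for key, large_value in large:
        for small_value in lookup.get(key, []):
            yield key, large_value, small_value


orders = [(1, "order-a"), (2, "order-b"), (1, "order-c"), (3, "order-d")]
countries = [(1, "IN"), (2, "US")]
print(list(broadcast_hash_join(orders, countries)))
```

Customer 3 has no match in the small dataset, so its order is dropped, as expected for an inner join.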

A Data Engineer was asked 4mo ago
Q. What is incremental data loading?
Ans. 

Incremental data loading is a process of updating a database with only new or changed data since the last load.

  • Reduces data transfer time by only loading new or modified records.

  • Commonly used in ETL (Extract, Transform, Load) processes.

  • Example: Loading only new customer records added since the last update.

  • Helps in maintaining data consistency and reducing redundancy.

  • Can be implemented using timestamps or change da...
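
A timestamp-based version can be sketched as follows. The names are illustrative assumptions: each source row carries an `updated_at` column, the target is modelled as a dictionary keyed by `id`, and the "watermark" is the latest timestamp already loaded.

```python
from datetime import datetime
from typing import Dict, Iterable, Tuple


def incremental_load(
    source_rows: Iterable[dict],
    target: Dict[int, dict],
    last_loaded_at: datetime,
) -> Tuple[datetime, int]:
    """Upsert only rows changed after last_loaded_at.

    Returns the new watermark and the number of rows loaded.
    """
    new_watermark = last_loaded_at
    loaded = 0
    for row in source_rows:
        if row["updated_at"] > last_loaded_at:
            target[row["id"]] = row  # insert new rows, overwrite changed ones
            new_watermark = max(new_watermark, row["updated_at"])
            loaded += 1
    return new_watermark, loaded


source = [
    {"id": 1, "name": "Asha", "updated_at": datetime(2025, 1, 1)},
    {"id": 2, "name": "Ravi", "updated_at": datetime(2025, 1, 2)},
]
target: Dict[int, dict] = {}

# First run: nothing loaded yet, so every row qualifies.
watermark, first_count = incremental_load(source, target, datetime.min)

# A new customer arrives; the second run picks up only that row.
source.append({"id": 3, "name": "Meera", "updated_at": datetime(2025, 1, 3)})
watermark, second_count = incremental_load(source, target, watermark)
print(first_count, second_count, watermark)
```

In a real pipeline the watermark would be stored durably between runs, and the filter would be pushed into the source query instead of scanning every row.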

A Data Engineer was asked 10mo ago
Q. Write code to determine if a given string is a palindrome.
Ans. 

A palindrome is a word, phrase, number, or other sequence of characters that reads the same forward and backward.

  • Check if the string is equal to its reverse to determine if it's a palindrome.

  • Ignore spaces and punctuation when checking for palindromes.

  • Convert the string to lowercase before checking for palindromes.

  • Examples: 'racecar', 'A man, a plan, a canal, Panama'
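
The points above translate into a short Python function. This is a minimal sketch; `is_palindrome` is an illustrative name, and "ignore spaces and punctuation" is interpreted as keeping only letters and digits.

```python
def is_palindrome(text: str) -> bool:
    """Return True if text reads the same both ways, ignoring case and punctuation."""
    # Keep only letters and digits, lowercased.
    cleaned = [ch.lower() for ch in text if ch.isalnum()]
    # A palindrome equals its own reverse.
    return cleaned == cleaned[::-1]


print(is_palindrome("racecar"))
print(is_palindrome("A man, a plan, a canal, Panama"))
print(is_palindrome("Infosys"))
```

If the interviewer asks for a version without extra memory, the same check can be done with two indexes moving inward from both ends of the string.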

A Data Engineer was asked
Q. Describe a project you handled in your last organization.
Ans. 

Developed a data pipeline to ingest, process, and analyze customer feedback data for product improvement.

  • Designed and implemented ETL processes to extract data from various sources

  • Utilized Apache Spark for data processing and analysis

  • Built data visualizations to present insights to stakeholders

Infosys Data Engineer Interview Experiences

31 interviews found

Data Engineer Interview Questions & Answers

Anonymous

posted on 11 Jun 2025

Interview experience: 5 (Excellent)
Difficulty level: Moderate
Process Duration: Less than 2 weeks
Result: No response

I appeared for an interview in May 2025, where I was asked the following questions.

  • Q1. Can you provide an explanation of your project?
  • Ans. 

    Developed a data pipeline for processing and analyzing large datasets from various sources to support business intelligence.

    • Designed ETL processes to extract data from APIs and databases, ensuring data integrity.

    • Utilized Apache Spark for distributed data processing, improving performance by 30%.

    • Implemented data warehousing solutions using Amazon Redshift for efficient querying.

    • Created dashboards in Tableau for visualiz...

  • Answered by AI
  • Q2. What is the SQL query to find the differences between the current day's sales and the previous day's sales?
  • Ans. 

    SQL query to compare today's sales with yesterday's sales using aggregation and date functions.

    • Use a table with sales data that includes a date column.

    • Aggregate sales by date using SUM() function.

    • Use a Common Table Expression (CTE) or subquery to get sales for today and yesterday.

    • Calculate the difference between today's and yesterday's sales.

  • Answered by AI

Data Engineer Interview Questions & Answers

Anonymous

posted on 19 Sep 2024

Interview experience: 4 (Good)
Difficulty level: Moderate
Process Duration: 2-4 weeks
Result: Selected

I applied via Naukri.com

Round 1 - Technical 

(2 Questions)

  • Q1. Basics of ADF (Azure Data Factory) and ADB (Azure Databricks)
  • Q2. Code on Palindrome
Round 2 - HR 

(1 Question)

  • Q1. About current role

Skills evaluated in this interview

Data Engineer Interview Questions & Answers

Anonymous

posted on 19 Jun 2025

Interview experience: 5 (Excellent)
Difficulty level: Moderate
Process Duration: 2-4 weeks
Result: -

I appeared for an interview in May 2025, where I was asked the following questions.

  • Q1. Highest revenue collected in a city among multiple regions
  • Ans. 

    Identify the city with the highest revenue by analyzing data from various regions.

    • Sum the revenue across all regions within each city.

    • Use a SQL query such as 'SELECT city, SUM(revenue) AS total_revenue FROM sales GROUP BY city ORDER BY total_revenue DESC LIMIT 1;' (LIMIT is MySQL/PostgreSQL/SQLite syntax; SQL Server uses TOP 1 instead).

    • Consider factors like population, economic activity, and industry presence in each region.

    • Example: If Region A has $1M and Region B has $2M, the total for the city is $3M.

  • Answered by AI
  • Q2. Spark, Cloudera, AutoSys, Kerberos

Data Engineer Interview Questions & Answers

Anonymous

posted on 20 Jul 2024

Interview experience: 3 (Average)
Difficulty level: -
Process Duration: -
Result: -
Round 1 - Technical 

(2 Questions)

  • Q1. Spark optimization techniques
  • Q2. Window functions in SQL
Round 2 - Technical 

(2 Questions)

  • Q1. Cloud-related questions
  • Q2. Project-related questions

Data Engineer Interview Questions & Answers

Anonymous

posted on 14 Nov 2024

Interview experience: 3 (Average)
Difficulty level: Moderate
Process Duration: Less than 2 weeks
Result: Not Selected

I applied via Naukri.com and was interviewed in Oct 2024. There was 1 interview round.

Round 1 - Aptitude Test 

A DSA (data structures and algorithms) question was asked.

Interview Preparation Tips

Interview preparation tips for other job seekers - Prepare DSA questions

Data Engineer Interview Questions & Answers

Anonymous

posted on 17 Dec 2024

Interview experience: 3 (Average)
Difficulty level: -
Process Duration: -
Result: -
Round 1 - Technical 

(1 Question)

  • Q1. Fact vs dimension table

Interview Preparation Tips

Interview preparation tips for other job seekers - Good luck! All the best! It's easy!

Data Engineer Interview Questions & Answers

Anonymous

posted on 23 Mar 2025

Interview experience: 4 (Good)
Difficulty level: Moderate
Process Duration: Less than 2 weeks
Result: -

I appeared for an interview in Feb 2025, where I was asked the following questions.

  • Q1. What is incremental data loading?
  • Q2. Drop vs truncate
  • Ans. 

    DROP removes a table permanently; TRUNCATE deletes all rows but retains the table structure.

    • DROP TABLE table_name; - Removes the table definition along with its data, indexes, and constraints.

    • TRUNCATE TABLE table_name; - Deletes all rows but keeps the table structure.

    • Whether either statement can be rolled back depends on the database: PostgreSQL and SQL Server allow both to be rolled back inside an explicit transaction, while Oracle and MySQL commit them implicitly.

    • TRUNCATE is usually faster than DELETE without a WHERE clause because it doesn't log individual row deletions.

  • Answered by AI

Data Engineer Interview Questions & Answers

Anonymous

posted on 10 Jun 2025

Interview experience: 2 (Poor)
Difficulty level: Moderate
Process Duration: 2-4 weeks
Result: Not Selected

I appeared for an interview in Dec 2024, where I was asked the following questions.

  • Q1. Introduce yourself
  • Q2. What optimization techniques are used in Spark?
  • Ans. 

    Spark optimization techniques enhance performance and resource utilization in distributed data processing tasks.

    • Use DataFrames and Datasets for optimized execution plans.

    • Leverage Catalyst Optimizer for query optimization.

    • Apply Tungsten for memory management and code generation.

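    • Control partitioning to limit data shuffling: 'repartition' performs a full shuffle to rebalance data or partition it by key, while 'coalesce' reduces the number of partitions without a full shuffle.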

    • Cache intermediate results with 'persist()' to avoid re...

  • Answered by AI

Data Engineer Interview Questions & Answers

Anonymous

posted on 26 Mar 2025

Interview experience: 5 (Excellent)
Difficulty level: -
Process Duration: -
Result: -
  • Q1. How to join 2 datasets if one is much larger than the other
  • Q2. What is the process of performing ETL (Extract, Transform, Load), and can you provide an example?
  • Ans. 

    ETL is a data integration process that involves extracting data, transforming it for analysis, and loading it into a target system.

    • Extract: Gather data from various sources like databases, APIs, or flat files. Example: Pulling customer data from a CRM system.

    • Transform: Clean and format the data to meet business requirements. Example: Converting date formats or aggregating sales data.

    • Load: Insert the transformed data in...

  • Answered by AI

Data Engineer Interview Questions & Answers

Anonymous

posted on 10 Sep 2024

Interview experience: 3 (Average)
Difficulty level: -
Process Duration: 2-4 weeks
Result: -

I applied via Company Website and was interviewed in Mar 2024. There were 3 interview rounds.

Round 1 - Coding Test 

It went fine and was interactive.

Round 2 - Technical 

(2 Questions)

  • Q1. Snowflake architecture
  • Q2. Caching, different data loading techniques
Round 3 - HR 

(2 Questions)

  • Q1. Salary expectations
  • Q2. Day-to-day work in my previous company


Infosys Interview FAQs

How many rounds are there in the Infosys Data Engineer interview?
The Infosys interview process usually has 1-2 rounds. The most common rounds are Technical, HR, and Coding Test.
How to prepare for the Infosys Data Engineer interview?
Go through your CV in detail and study all the technologies mentioned in it. Prepare at least two technologies or languages in depth if you are appearing for a technical interview at Infosys. The most common topics and skills that interviewers at Infosys expect are Spark, Python, Big Data, ETL, and Hive.
What are the top questions asked in the Infosys Data Engineer interview?

Some of the top questions asked at the Infosys Data Engineer interview:

  1. What is the SQL query to find the differences between the current day's sales a...read more
  2. Python dataframes and how we use them in project and where at t...read more
  3. What is the process of performing ETL (Extract, Transform, Load), and can you p...read more
How long is the Infosys Data Engineer interview process?

The Infosys Data Engineer interview process varies in length but typically takes about 2-4 weeks to complete.


Overall Interview Experience Rating: 3.7/5 (based on 29 interview experiences)

Difficulty level

  • Easy: 15%
  • Moderate: 85%

Duration

  • Less than 2 weeks: 46%
  • 2-4 weeks: 54%

Infosys Data Engineer Salary
based on 1.8k salaries
₹4.5 L/yr - ₹10.8 L/yr
36% less than the average Data Engineer Salary in India

Infosys Data Engineer Reviews and Ratings

based on 104 reviews

3.5/5

Rating in categories

  • Skill development: 3.5
  • Work-life balance: 3.6
  • Salary: 2.4
  • Job security: 3.9
  • Company culture: 3.5
  • Promotions: 2.4
  • Work satisfaction: 3.3
