Data Engineer Interview Questions

Updated 19 Apr 2024

1530 results found

Interview Questions

  • Q1. Do you have experience with AWS Glue? How would you use Glue for data migration?
  • Q2. How do you select the unique customers from the last 3 months of sales?
  • Q3. 1. What are partitioning and bucketing? 2. Difference between UNION and UNION ALL. 3. Spark architecture. 4. Managed and external tables in Hive and the difference between them. 5. Basic SQL and Python problems.
  • Q4. How do you migrate data from a local server to AWS Redshift?
  • Q5. How do you ingest data into your pipeline?
  • Q6. How do you migrate data from a local server to AWS Redshift?
  • Q7. How can we join a table without any identity columns?
  • Q8. What is the MERGE statement used for?
  • Q9. How do you add a column to a DataFrame (df)?
  • Q10. I had a bad experience with TCS; I was not expecting this kind of interview.
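
One possible way to approach the "unique customers in the last 3 months" question is a `SELECT DISTINCT` with a date filter. The sketch below uses Python's built-in `sqlite3` module; the `sales` table, its columns, the sample rows, and the `unique_recent_customers` helper are all assumptions made up for illustration, and a real warehouse may use different date functions.

```python
import sqlite3

# Hypothetical schema and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer_id INTEGER, sale_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        (1, "2024-04-10", 120.0),
        (1, "2024-03-02", 80.0),   # same customer again, should be counted once
        (2, "2024-02-15", 45.0),
        (3, "2023-12-30", 60.0),   # older than three months before the cutoff
    ],
)


def unique_recent_customers(conn, as_of):
    """Return distinct customer ids with a sale in the 3 months up to as_of."""
    rows = conn.execute(
        """
        SELECT DISTINCT customer_id
        FROM sales
        WHERE sale_date >= date(?, '-3 months')
          AND sale_date <= ?
        ORDER BY customer_id
        """,
        (as_of, as_of),
    )
    return [row[0] for row in rows]


# A fixed reference date keeps the example reproducible.
recent = unique_recent_customers(conn, "2024-04-19")
print(recent)
```

Passing the reference date as a parameter, rather than relying on the current date, makes the query easier to test.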

Interview Questions

  • Q1. 1. What are columnar storage, Parquet, and Delta? Why are they used?
  • Q2. 4. How do you connect SQL Server to Databricks?
  • Q3. 2. List, tuple, and set in Python. 3. SQL GROUP BY, window functions, and UNION.
  • Q4. How do you handle a changing schema from the source? What are the common issues faced in Hadoop, and how did you resolve them?
  • Q5. Write PySpark code to read a CSV file and show the top 10 records.
  • Q6. RDDs vs DataFrames: which is better, and why?
  • Q7. 3. Explain the project architecture in detail.
  • Q8. A question on collect list: it should have been asked directly, but instead it was posed vaguely, in terms of comma-separated values.
  • Q9. What optimization techniques are applied in PySpark code?
  • Q10. Write a function to check whether a number is an Armstrong number.
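
For the Armstrong-number question, a minimal sketch might look like the following. It assumes the usual definition (a number equal to the sum of its digits, each raised to the power of the digit count); the function name `is_armstrong` is just an illustrative choice.

```python
def is_armstrong(n: int) -> bool:
    """True if n equals the sum of its digits, each raised to the digit count."""
    if n < 0:
        return False  # assumption: negative numbers are not considered
    digits = [int(ch) for ch in str(n)]
    power = len(digits)
    return n == sum(d ** power for d in digits)


# 153 = 1**3 + 5**3 + 3**3, so it qualifies; 154 does not.
print(is_armstrong(153), is_armstrong(154))

# All Armstrong numbers below 1000 (every single-digit number qualifies).
armstrong_under_1000 = [n for n in range(1, 1000) if is_armstrong(n)]
print(armstrong_under_1000)
```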

Interview Questions

  • Q1. 1. About HDFS, Spark context, and Databricks. 2. About projects. 3. SQL queries. 4. Python questions.
  • Q2. 1) How do you handle data skew in Spark?
  • Q3. 5) How do you create a Kafka topic with a replication factor of 2?
  • Q4. 1) Project architecture 2) Complex jobs handled in the project 3) Types of lookup 4) SCD Type 2 implementation in DataStage 5) SQL: analytical functions, scenario-based questions 6) Unix: sed/grep commands
  • Q5. What do you know about Forms and Templates and their use in workflow and webreports?
  • Q6. 4) How do you read JSON data using Spark?
  • Q7. 2) Difference between partitioning and bucketing
  • Q8. Asked technology-related questions on Spark and Scala
  • Q9. 3) Difference between cache and persistent storage
  • Q10. 1. About HDFS 2. SQL 3. Python
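
One commonly cited answer to the data-skew question is key salting with a two-stage aggregation. The sketch below illustrates only the idea in plain Python, not Spark code: the sample rows, `NUM_SALTS`, and the deterministic `i % NUM_SALTS` salt are assumptions for reproducibility (a Spark job would more typically draw a random salt).

```python
from collections import Counter

NUM_SALTS = 4  # assumed number of salt values; tune to the degree of skew

# Skewed input: one "hot" key dominates (made-up data).
rows = [("hot", 1)] * 1000 + [("cold", 1)] * 10

# Stage 1: aggregate on (key, salt) so the hot key is split into
# NUM_SALTS smaller groups that could be processed by different tasks.
partial = Counter()
for i, (key, value) in enumerate(rows):
    salted_key = (key, i % NUM_SALTS)
    partial[salted_key] += value

# Stage 2: drop the salt and combine the partial results per original key.
final = Counter()
for (key, _salt), subtotal in partial.items():
    final[key] += subtotal

largest_group_before = max(Counter(key for key, _ in rows).values())
largest_group_after = max(partial.values())
print(largest_group_before, largest_group_after, dict(final))
```

The final totals are unchanged, but the largest single group shrinks by roughly a factor of `NUM_SALTS`, which is the property that relieves the overloaded task.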

Interview Questions

  • Q1. HR asked how the last interview round went and gave me feedback.
  • Q2. Questions on AWS or Azure, depending on the platform you work on, plus more conceptual questions on data engineering.
  • Q3. Basic SQL questions and window functions
  • Q4. R2: This was the toughest round. It lasted 40 minutes with more than 30 questions, and I was lucky enough to answer all of them.
  • Q5. R1 was about 40 minutes and a bit tough. Asked more about ADF, ADB, SQL, and SQL DW.
  • Q6. The interviewer was really nice, and it went smoothly for me. 1. Asked questions about IC engines and the like 😄, as I was from mechanical engineering. 2. In coding, sorting and OOP concepts were asked. 3. Guesstimates. I can't recall more, as it has been a year and a half since it happened.
  • Q7. Round 1: Coding round. Round 2: Basic and conceptual questions on Spark and Hive. Round 3: In-depth questions on Spark and Hive; write code in Spark and Hive.
  • Q8. Joins in SQL; the modelling and visualization parts of Power BI
  • Q9. Cumulative sum and rank functions in Spark
  • Q10. Basic Spark questions, such as optimization techniques
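
The cumulative-sum and rank question is usually about window functions. As a rough illustration, the sketch below runs the equivalent SQL through Python's built-in `sqlite3` module rather than Spark; it assumes a SQLite build with window-function support (3.25 or later), and the `daily_sales` table and its rows are made up. Spark SQL offers window functions with the same names, though the surrounding API differs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sales (day TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO daily_sales VALUES (?, ?)",
    [
        ("2024-01-01", 100),
        ("2024-01-02", 300),
        ("2024-01-03", 300),  # tie with the previous day
        ("2024-01-04", 50),
    ],
)

rows = conn.execute(
    """
    SELECT day,
           amount,
           -- running total ordered by day
           SUM(amount) OVER (
               ORDER BY day
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS running_total,
           -- RANK leaves gaps after ties; DENSE_RANK does not
           RANK() OVER (ORDER BY amount DESC) AS rnk,
           DENSE_RANK() OVER (ORDER BY amount DESC) AS dense_rnk
    FROM daily_sales
    ORDER BY day
    """
).fetchall()

for row in rows:
    print(row)
```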

Interview Questions

  • Q1. Assume we have a pan-India retail store with two customer tables in the backend: a customer profile table and a customer transaction table, linked by customer ID. What approach would you take to find the names of customers who have made a transaction? Which is more computationally efficient for this: a left join or a subquery?
  • Q2. Case study: how would you develop a Dream11-like cricket app as a team of Data Engineer, Data Analyst, Data Scientist, Data Architect, and ETL Developer? What kinds of tables would you use for the app's structured database? What datasets would you have to create? How would you find the probability of a team winning or losing? What is the streaming table in this app? (These are real-time scenarios you have to look out for as a Data Engineer.)
  • Q3. What is the difference between transformations and actions in PySpark? Give examples.
  • Q4. What is a Common Table Expression (CTE)? How is a CTE different from a stored procedure?
  • Q5. How would you find the second-highest transacting member in each city?
  • Q6. What is normalization in SQL? Explain 1NF, 2NF, and 3NF.
  • Q7. Design a business case that uses a self join. Condition: do not use a hierarchical use case such as teacher/student, employee/manager, or father/grandfather.
  • Q8. How does the subquery in the above question work in the backend?
  • Q9. Have you worked with Lambda functions? Explain.
  • Q10. What is the difference between ALTER and UPDATE?
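
For the "second-highest transacting member in each city" question, one possible approach combines a CTE with a correlated subquery. The sketch below uses Python's built-in `sqlite3` module; the `transactions` table, its columns, and the sample rows are assumptions for illustration, and "transacting" is interpreted here as total amount per member. A window function such as `DENSE_RANK()` would be an equally valid route.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (member TEXT, city TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        ("asha", "Pune", 500),
        ("asha", "Pune", 200),
        ("ravi", "Pune", 600),
        ("meena", "Pune", 100),
        ("john", "Delhi", 900),
        ("sara", "Delhi", 400),
    ],
)

query = """
WITH totals AS (
    -- total amount per member within each city
    SELECT city, member, SUM(amount) AS total
    FROM transactions
    GROUP BY city, member
)
SELECT t.city, t.member, t.total
FROM totals AS t
WHERE (
    -- exactly one distinct total in the same city is higher
    SELECT COUNT(DISTINCT u.total)
    FROM totals AS u
    WHERE u.city = t.city AND u.total > t.total
) = 1
ORDER BY t.city
"""

second_highest = conn.execute(query).fetchall()
print(second_highest)
```

Counting distinct higher totals means that members tied for second place are all returned, which may or may not be what the interviewer wants.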

Interview Questions

  • Q1. What Spark configuration would you use to process 2 GB of data?
  • Q2. How do you run a child notebook from a parent notebook using a dbutils command?
  • Q3. Basic questions on PySpark and Spark architecture, plus 2 coding questions: one SQL, one PySpark.
  • Q4. 1. Partitioning and bucketing 2. A few SQL questions 3. ADF scenario questions
  • Q5. What is lazy evaluation in Spark?
  • Q6. Difference between left join and inner join
  • Q7. What is BQ, and what are its advantages?
  • Q8. Git and Jenkins: Git commands, and how do you deploy your code using Jenkins?
  • Q9. Questions on GCP, Python, SQL, and GitHub
  • Q10. SQL joins and SQL SELECT queries
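
Lazy evaluation can be illustrated, by analogy only, with Python generators: building the pipeline does no work, and the data is read only when a final step asks for results. This is not Spark code; the `read_rows` function and the `log` list are assumptions invented to make the deferred execution visible.

```python
log = []  # records when the "source" is actually read


def read_rows():
    """Hypothetical data source that logs every row it produces."""
    for n in range(1, 6):
        log.append(f"read {n}")
        yield n


# "Transformations": these lines only describe the pipeline.
doubled = (n * 2 for n in read_rows())
large = (n for n in doubled if n > 4)

# Nothing has been read yet.
work_before_action = list(log)

# "Action": materialising the result finally pulls rows through the pipeline.
result = list(large)

print(work_before_action, result, len(log))
```

In Spark the same split exists between transformations (such as `filter` or `select`), which build a plan, and actions (such as `count` or `collect`), which trigger execution.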

Interview Questions

  • Q1. SQL: given 2 tables with some NULLs, give the count of rows returned for each type of join.
  • Q2. Types of joins; explain cross join.
  • Q3. This round was about SQL and Python. The interviewer gave a problem and asked me to write a SQL query, testing my SQL skills on basic concepts.
  • Q4. The interviewer asked me to write Python code for basic problems. This round lasted about 40-45 minutes.
  • Q5. This round again focused on my SQL skills; the interviewer asked me to write queries on medium-to-hard SQL concepts. It lasted 40-45 minutes.
  • Q6. This was the final round. The interviewer asked about everything I mentioned in my resume, plus behavioural and situational questions.
  • Q7. Scenario-based questions
  • Q8. Study Spark, SQL, and data warehousing
  • Q9. Real-life examples
  • Q10. Study DSA, SQL, and Python
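
The join-count question with NULLs can be worked through on a small example. The sketch below uses Python's built-in `sqlite3` module with two made-up tables, `a` and `b`; since older SQLite builds lack `RIGHT` and `FULL` joins, the right join is emulated by swapping the tables and the full-join count is derived arithmetically. The key point it tries to show is that `NULL = NULL` is not true, so NULL keys never match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (id INTEGER)")
conn.execute("CREATE TABLE b (id INTEGER)")
conn.executemany("INSERT INTO a VALUES (?)", [(1,), (1,), (None,), (2,)])
conn.executemany("INSERT INTO b VALUES (?)", [(1,), (None,), (3,)])


def count_rows(from_clause):
    """Count the rows produced by the given FROM clause."""
    return conn.execute(f"SELECT COUNT(*) FROM {from_clause}").fetchone()[0]


inner = count_rows("a JOIN b ON a.id = b.id")
left = count_rows("a LEFT JOIN b ON a.id = b.id")
# Right join emulated as a left join with the tables swapped.
right = count_rows("b LEFT JOIN a ON a.id = b.id")
cross = count_rows("a CROSS JOIN b")
# Full outer = matched rows + unmatched rows from each side.
full = left + right - inner

join_counts = {
    "inner": inner,
    "left": left,
    "right": right,
    "full": full,
    "cross": cross,
}
print(join_counts)
```

Here the two `1` rows in `a` each match the single `1` in `b`, while the NULL rows survive only in the outer joins as unmatched rows.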
