Cognizant

3.7 based on 54.7k Reviews

Cognizant Pyspark Developer Interview Questions and Answers

Updated 30 Dec 2024

8 Interview questions

A Pyspark Developer was asked 5mo ago

Q. What is the difference between coalesce and repartition in data processing?

Ans.

Coalesce reduces the number of partitions without shuffling data, while repartition reshuffles data to create a specific number of partitions.

Coalesce is used to reduce the number of partitions without shuffling data
Repartition is used to increase or decrease the number of partitions by shuffling data
Coalesce is more efficient when reducing partitions as it avoids shuffling
Repartition is useful when you need to ex...
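
As a rough illustration of the distinction above, here is a pure-Python toy model (not Spark code). Partitions are plain lists, and the function names `coalesce` and `repartition` only mirror the Spark method names. The model assumes that coalesce merges whole neighbouring partitions and that repartition redistributes records round-robin; Spark's actual placement logic is more involved.

```python
from typing import List

Partitions = List[List[int]]


def coalesce(partitions: Partitions, n: int) -> Partitions:
    """Merge whole existing partitions into at most n groups (assumes n >= 1).

    Records are never split out of their original partition, which models
    the absence of a full shuffle.
    """
    n = min(n, len(partitions))  # this model cannot increase the partition count
    merged: Partitions = [[] for _ in range(n)]
    for i, part in enumerate(partitions):
        merged[i * n // len(partitions)].extend(part)
    return merged


def repartition(partitions: Partitions, n: int) -> Partitions:
    """Redistribute every record individually, which models a full shuffle."""
    out: Partitions = [[] for _ in range(n)]
    records = [r for part in partitions for r in part]
    for i, record in enumerate(records):
        out[i % n].append(record)  # round-robin placement (an assumption)
    return out


# Hypothetical, deliberately skewed input: one large and three small partitions.
data = [[1, 2, 3, 4, 5, 6], [7], [8, 9], [10]]

print(coalesce(data, 2))     # skew survives: sizes 7 and 3
print(repartition(data, 2))  # balanced: sizes 5 and 5
```

In this model, coalescing is cheap but can leave the result uneven, while repartitioning touches every record and evens out the sizes.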

A Pyspark Developer was asked 5mo ago

Q. What is the difference between a DataFrame and an RDD (Resilient Distributed Dataset)?

Ans.

DataFrame is a higher-level abstraction built on top of RDD, providing more structure and optimization capabilities.

DataFrames are distributed collections of data organized into named columns, similar to tables in a relational database.
RDDs are lower-level abstractions representing a collection of objects distributed across a cluster, with no inherent structure.
DataFrames provide optimizations like query optimizat...

A Pyspark Developer was asked 6mo ago

Q. What is the SQL code for calculating year-on-year growth percentage with year-wise grouping?

Ans.

Year-on-year growth is calculated by aggregating the value per year and comparing each year's total with the previous year's total.

Group by year to get one total per year
Use the LAG window function, ordered by year, to get the previous year's value
Calculate the growth percentage using the formula: ((current year value - previous year value) / previous year value) * 100
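
The steps above can be sketched with Python's built-in `sqlite3` module. The table `sales` and its columns `sale_year` and `amount` are hypothetical, and the sketch assumes an SQLite build with window-function support (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: several rows per year.
conn.execute("CREATE TABLE sales (sale_year INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales (sale_year, amount) VALUES (?, ?)",
    [(2021, 60.0), (2021, 40.0), (2022, 120.0), (2023, 90.0)],
)

query = """
WITH yearly AS (
    -- one total per year
    SELECT sale_year, SUM(amount) AS total
    FROM sales
    GROUP BY sale_year
)
SELECT sale_year,
       total,
       -- (current - previous) / previous * 100
       (total - LAG(total) OVER (ORDER BY sale_year)) * 100.0
           / LAG(total) OVER (ORDER BY sale_year) AS yoy_growth_pct
FROM yearly
ORDER BY sale_year
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The first year has no previous value, so its growth comes back as NULL (`None` in Python) rather than a number.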

A Pyspark Developer was asked 6mo ago

Q. What is the SQL query to find the second highest rank in a dataset?

Ans.

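The second highest rank can be found by sorting the ranks in descending order and skipping the first row.

Use the ORDER BY clause to sort the ranks in descending order
Use the LIMIT and OFFSET clauses to skip the highest rank and retrieve the second highest rank
Use DISTINCT so that a tie for the highest rank is not returned as the second highest
Example: SELECT DISTINCT rank FROM dataset ORDER BY rank DESC LIMIT 1 OFFSET 1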
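
A minimal sketch of the LIMIT/OFFSET approach, run through Python's built-in `sqlite3` module with hypothetical sample values. It compares the query with and without DISTINCT when the highest rank is tied, and adds a subquery variant for dialects without LIMIT/OFFSET.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dataset (rank INTEGER)")
# Hypothetical data with a tie at the top.
conn.executemany("INSERT INTO dataset (rank) VALUES (?)", [(10,), (10,), (7,), (5,)])

# Without DISTINCT the tied top value is returned again.
plain = conn.execute(
    "SELECT rank FROM dataset ORDER BY rank DESC LIMIT 1 OFFSET 1"
).fetchone()[0]

# With DISTINCT the second highest distinct value is returned.
distinct = conn.execute(
    "SELECT DISTINCT rank FROM dataset ORDER BY rank DESC LIMIT 1 OFFSET 1"
).fetchone()[0]

# Alternative: the largest value strictly below the maximum.
subquery = conn.execute(
    "SELECT MAX(rank) FROM dataset WHERE rank < (SELECT MAX(rank) FROM dataset)"
).fetchone()[0]

print(plain, distinct, subquery)
```

Note that `rank` is a reserved word in some SQL dialects and may need quoting there; SQLite accepts it as a column name.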

A Pyspark Developer was asked 6mo ago

Q. What tools are used to connect Google Cloud Platform (GCP) with Apache Spark?

Ans.

To connect Google Cloud Platform with Apache Spark, tools like Dataproc, Cloud Storage, and BigQuery can be used.

Use Google Cloud Dataproc to create managed Spark and Hadoop clusters on GCP.
Store data in Google Cloud Storage and access it from Spark applications.
Utilize Google BigQuery for querying and analyzing large datasets directly from Spark.

A Pyspark Developer was asked 6mo ago

Q. What are the optimization techniques used in Apache Spark?

Ans.

Optimization techniques in Apache Spark mainly aim to reduce data movement, avoid recomputation, and read less data.

Partitioning data to distribute work evenly
Caching frequently accessed data in memory
Using broadcast variables for small lookup tables
Optimizing shuffle operations by reducing data movement
Applying predicate pushdown to filter data early
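
The broadcast-variable technique from the list above can be pictured with a pure-Python toy model (not Spark code). The names `country_names`, `orders`, and `join_partition` are hypothetical. The idea sketched here is that every partition of the large dataset gets its own copy of the small lookup table, so the join happens locally and the large dataset is never moved.

```python
from typing import Dict, List, Tuple

# Small lookup table: cheap to copy to every worker.
country_names: Dict[int, str] = {1: "India", 2: "Malaysia"}

# Large dataset, already split into partitions of (order_id, country_id).
orders: List[List[Tuple[int, int]]] = [
    [(101, 1), (102, 2)],
    [(103, 1), (104, 3)],
]


def join_partition(
    partition: List[Tuple[int, int]], lookup: Dict[int, str]
) -> List[Tuple[int, str]]:
    """Inner-join one partition against the lookup table with local dict lookups."""
    return [
        (order_id, lookup[country_id])
        for order_id, country_id in partition
        if country_id in lookup
    ]


# Each partition is processed independently; no records cross partitions.
joined = [join_partition(part, country_names) for part in orders]
print(joined)
```

Order 104 is dropped because its country id has no match in the lookup table, which is the usual inner-join behaviour.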

A Pyspark Developer was asked 6mo ago

Q. What is the process to orchestrate code in Google Cloud Platform (GCP)?

Ans.

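Orchestrating code in GCP typically relies on Cloud Composer (managed Apache Airflow) to schedule and manage workflows, with services such as Cloud Dataflow, Cloud Functions, and Cloud Scheduler running or triggering the individual tasks.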

Use Cloud Composer to create, schedule, and monitor workflows using Apache Airflow
Utilize Cloud Dataflow for real-time data processing and batch processing tasks
Use Cloud Functions for event-driven serverless functions
Leverage Cloud Scheduler for job scheduling
Integrate with other GCP services like...

A Pyspark Developer was asked 6mo ago

Q. What is the difference between coalesce and repartition, as well as between cache and persist?

Ans.

Coalesce reduces the number of partitions without a full shuffle, while repartition increases or decreases the number of partitions by shuffling data. Cache and persist both keep an RDD or DataFrame available for reuse: cache uses the default storage level, while persist accepts an explicit storage level.

Coalesce is used to reduce the number of partitions without shuffling data, while repartition is used to increase or decrease the number of partitions by shuffling data.
Coalesce is more efficient when reducing partitions as it avoids shuf...

Cognizant Pyspark Developer Interview Experiences

2 interviews found

Pyspark Developer Interview Questions & Answers

Anonymous

posted on 12 Dec 2024

Interview experience: Good
Difficulty level: Moderate
Process Duration: Less than 2 weeks
Result: No response

I applied via Walk-in and was interviewed in Nov 2024. There were 3 interview rounds.

Round 1 - One-on-one

(2 Questions)

Q1. What are the optimization techniques used in Apache Spark?

Ans.

Optimization techniques in Apache Spark mainly aim to reduce data movement, avoid recomputation, and read less data.

Partitioning data to distribute work evenly
Caching frequently accessed data in memory
Using broadcast variables for small lookup tables
Optimizing shuffle operations by reducing data movement
Applying predicate pushdown to filter data early

Answered by AI

Q2. What is the difference between coalesce and repartition, as well as between cache and persist?

Ans.

Coalesce reduces the number of partitions without a full shuffle, while repartition increases or decreases the number of partitions by shuffling data. Cache and persist both keep an RDD or DataFrame available for reuse: cache uses the default storage level, while persist accepts an explicit storage level.

Coalesce is used to reduce the number of partitions without shuffling data, while repartition is used to increase or decrease the number of partitions by shuffling data.
Coalesce is more efficient when reducing partitions as it avoids shuffling...

Answered by AI

Round 2 - One-on-one

(2 Questions)

Q1. What is the SQL query to find the second highest rank in a dataset?

Ans.

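The second highest rank can be found by sorting the ranks in descending order and skipping the first row.

Use the ORDER BY clause to sort the ranks in descending order
Use the LIMIT and OFFSET clauses to skip the highest rank and retrieve the second highest rank
Use DISTINCT so that a tie for the highest rank is not returned as the second highest
Example: SELECT DISTINCT rank FROM dataset ORDER BY rank DESC LIMIT 1 OFFSET 1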

Answered by AI

Q2. What is the SQL code for calculating year-on-year growth percentage with year-wise grouping?

Ans.

Year-on-year growth is calculated by aggregating the value per year and comparing each year's total with the previous year's total.

Group by year to get one total per year
Use the LAG window function, ordered by year, to get the previous year's value
Calculate the growth percentage using the formula: ((current year value - previous year value) / previous year value) * 100

Answered by AI

Round 3 - One-on-one

(2 Questions)

Q1. What tools are used to connect Google Cloud Platform (GCP) with Apache Spark?

Ans.

To connect Google Cloud Platform with Apache Spark, tools like Dataproc, Cloud Storage, and BigQuery can be used.

Use Google Cloud Dataproc to create managed Spark and Hadoop clusters on GCP.
Store data in Google Cloud Storage and access it from Spark applications.
Utilize Google BigQuery for querying and analyzing large datasets directly from Spark.

Answered by AI

Q2. What is the process to orchestrate code in Google Cloud Platform (GCP)?

Ans.

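Orchestrating code in GCP typically relies on Cloud Composer (managed Apache Airflow) to schedule and manage workflows, with services such as Cloud Dataflow, Cloud Functions, and Cloud Scheduler running or triggering the individual tasks.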

Use Cloud Composer to create, schedule, and monitor workflows using Apache Airflow
Utilize Cloud Dataflow for real-time data processing and batch processing tasks
Use Cloud Functions for event-driven serverless functions
Leverage Cloud Scheduler for job scheduling
Integrate with other GCP services like BigQ...

Answered by AI

Interview Preparation Tips

Topics to prepare for Cognizant Pyspark Developer interview:

sql
spark
python
Cloud

Interview preparation tips for other job seekers - It is essential to prepare thoroughly before the interview.

Pyspark Developer Interview Questions & Answers

Anonymous

posted on 30 Dec 2024

Interview experience: Excellent

Round 1 - Technical

(2 Questions)

Q1. What is the difference between coalesce and repartition in data processing?

Ans.

Coalesce reduces the number of partitions without shuffling data, while repartition reshuffles data to create a specific number of partitions.

Coalesce is used to reduce the number of partitions without shuffling data
Repartition is used to increase or decrease the number of partitions by shuffling data
Coalesce is more efficient when reducing partitions as it avoids shuffling
Repartition is useful when you need to explici...

Answered by AI

Q2. What is the difference between a DataFrame and an RDD (Resilient Distributed Dataset)?

Ans.

DataFrame is a higher-level abstraction built on top of RDD, providing more structure and optimization capabilities.

DataFrames are distributed collections of data organized into named columns, similar to tables in a relational database.
RDDs are lower-level abstractions representing a collection of objects distributed across a cluster, with no inherent structure.
DataFrames provide optimizations like query optimization a...

Answered by AI

Interview questions from similar companies

Software Engineer Interview Questions & Answers

Capgemini

Anonymous

posted on 23 Oct 2021

I applied via Company Website and was interviewed before Oct 2020. There were 3 interview rounds.

Interview Questionnaire

1 Question

Q1. Tell me about your experience

Interview Preparation Tips

Interview preparation tips for other job seekers - Be confident and clear when you answer

Software Engineer Interview Questions & Answers

Infosys

Anonymous

posted on 5 Feb 2021

I applied via Company Website and was interviewed before Feb 2020. There was 1 interview round.

Interview Questionnaire

2 Questions

Q1. They asked DBMS questions presented in table format.

Q2. They asked me to write code for a Python program.

Interview Preparation Tips

Interview preparation tips for other job seekers - First they conducted a computer-based technical exam; candidates who qualify go on to a face-to-face interview, and lastly an HR round is held.

Software Developer Interview Questions & Answers

Accenture

Anonymous

posted on 27 Jul 2022

I applied via LinkedIn and was interviewed before Jul 2021. There were 2 interview rounds.

Round 1 - Aptitude Test

Easy logical questions
basic quant

Round 2 - Coding Test

Easy level coding questions
Counting frequency of alphabets
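
For reference, a letter-frequency task like the one mentioned above could be approached as follows. This is only a sketch in Python, not the interviewer's expected solution; the function name `letter_frequency` is hypothetical, and it assumes that case is ignored and non-letters are skipped.

```python
from collections import Counter


def letter_frequency(text: str) -> Counter:
    """Count how often each letter occurs, ignoring case and non-letters."""
    return Counter(ch for ch in text.lower() if ch.isalpha())


counts = letter_frequency("Hello, World")
print(counts.most_common(3))
```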

Interview Preparation Tips

Interview preparation tips for other job seekers - Just go through the basics of JavaScript, such as hoisting.

Software Developer Interview Questions & Answers

Capgemini

Siddhesh Hirulkar

posted on 10 Sep 2020

Interview Questionnaire

1 Question

Q1. How to use multiple dispatch in redux?

Ans.

Multiple dispatch is not a feature of Redux. It can be achieved using middleware or custom logic.

Middleware like redux-thunk or redux-saga can be used to dispatch multiple actions based on a single action.
Because every reducer receives every dispatched action, several reducers can each respond to the same action type and update their own slice of state.
For example, a single 'ADD_ITEM' action can trigger multiple actions like 'UPDATE_TOTAL', 'UPDATE_HISTORY', etc.
M...

Answered by AI

Software Developer Interview Questions & Answers

TCS

Anonymous

posted on 11 Jun 2021

I applied via Campus Placement and was interviewed before Jun 2020. There were 3 interview rounds.

Interview Questionnaire

2 Questions

Q1. Simple program

Q2. I wrote a simple program in C

Interview Preparation Tips

Interview preparation tips for other job seekers - Be bold and confident

Software Engineer Interview Questions & Answers

Wipro

Anonymous

posted on 15 Dec 2020

I applied via Job Portal and was interviewed before Dec 2019. There was 1 interview round.

Interview Questionnaire

1 Question

Q1. First they asked basic questions on HTML, SQL, and Java.

Interview Preparation Tips

Interview preparation tips for other job seekers - First learn basic programming concepts, then attend the interview with confidence and speak boldly.

Software Engineer Interview Questions & Answers

TCS

Anonymous

posted on 24 Jun 2021

I applied via Company Website and was interviewed before Jun 2020. There was 1 interview round.

Interview Questionnaire

3 Questions

Q1. By Rajkumar Bharathi, I stay at Trichy

Q2. I have completed my B.E from kalasalingam university in 2020, with a score of 6.33

Q3. I am a fresher and need this job

Interview Preparation Tips

Interview preparation tips for other job seekers - Dress for the job or company

Software Engineer Interview Questions & Answers

TCS

Anonymous

posted on 25 Mar 2021

I applied via Campus Placement and was interviewed before Mar 2020. There were 5 interview rounds.

Interview Questionnaire

1 Question

Q1. I was placed through campus. I did not have to take the aptitude / online exam as I was among the top 10 students from the college. In the interview panel, we had 3 people: one manager and two technical / staf...

Interview Preparation Tips

Interview preparation tips for other job seekers - Be confident and try to answer everything. If something is outside your scope and you don't know it, politely tell them you don't know. Keep yourself engaged with the panel. Talk to them in between; do not give long pauses and stares, as it makes things awkward. Hope it helps.

Cognizant Interview FAQs

How many rounds are there in Cognizant Pyspark Developer interview?

The Cognizant interview process usually has 2 rounds. The most common rounds are One-on-one and Technical.

How to prepare for Cognizant Pyspark Developer interview?

Go through your CV in detail and study all the technologies mentioned in your CV. Prepare at least two technologies or languages in depth if you are appearing for a technical interview at Cognizant. The most common topics and skills that interviewers at Cognizant expect are Pyspark, Python, Spark, Big Data and Life.

What are the top questions asked in Cognizant Pyspark Developer interview?

Some of the top questions asked at the Cognizant Pyspark Developer interview -

What is the SQL code for calculating year-on-year growth percentage with year-w...
What is the difference between coalesce and repartition, as well as between cac...
What is the SQL query to find the second highest rank in a datas...

4.5/5 based on 2 interview experiences

Difficulty level: Moderate 100%

Duration: Less than 2 weeks 100%

TCS Interview Questions: 3.6 • 11.1k Interviews
Accenture Interview Questions: 3.8 • 8.6k Interviews
Infosys Interview Questions: 3.6 • 7.9k Interviews
Wipro Interview Questions: 3.7 • 6k Interviews
Capgemini Interview Questions: 3.7 • 5.1k Interviews
Tech Mahindra Interview Questions: 3.5 • 4.1k Interviews
HCLTech Interview Questions: 3.5 • 4.1k Interviews
Genpact Interview Questions: 3.8 • 3.4k Interviews
IBM Interview Questions: 4.0 • 2.5k Interviews
DXC Technology Interview Questions: 3.7 • 837 Interviews

Cognizant Pyspark Developer Salary

based on 30 salaries

₹4.6 L/yr - ₹11 L/yr

6% less than the average Pyspark Developer Salary in India

Cognizant Salaries in India

Associate 73.1k salaries	₹5.3 L/yr - ₹12.5 L/yr
Programmer Analyst 56.2k salaries	₹3.5 L/yr - ₹7.3 L/yr
Senior Associate 52.9k salaries	₹10.5 L/yr - ₹23.5 L/yr
Senior Processing Executive 29.8k salaries	₹2.2 L/yr - ₹6.5 L/yr
Technical Lead 19k salaries	₹6 L/yr - ₹21.3 L/yr