What is the difference between coalesce and repartition, as well as between cache and persist?

AnswerBot
6mo

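Coalesce reduces the number of partitions by merging existing ones, which avoids a full shuffle. Repartition can either increase or decrease the number of partitions and always performs a full shuffle, which spreads the data more evenly. Cache and persist both keep an RDD or DataFrame available for reuse: cache() uses the default storage level (MEMORY_ONLY for RDDs, MEMORY_AND_DISK for DataFrames), while persist() accepts an explicit storage level such as MEMORY_ONLY or DISK_ONLY.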

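To make the coalesce versus repartition distinction concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark itself: partitions are modelled as Python lists, and the helper names `coalesce` and `repartition` are hypothetical stand-ins for the PySpark calls `df.coalesce(n)` and `df.repartition(n)`. The round-robin redistribution is a simplifying assumption about how a full shuffle spreads records.

```python
from typing import List

Partition = List[int]


def coalesce(partitions: List[Partition], n: int) -> List[Partition]:
    """Toy coalesce: merge whole neighbouring partitions into n groups.

    Records stay with the partition they started in, so nothing is
    redistributed record by record (no full shuffle).
    """
    if n >= len(partitions):
        # Without a shuffle, coalesce does not increase the partition count.
        return [list(p) for p in partitions]
    merged: List[Partition] = [[] for _ in range(n)]
    for i, part in enumerate(partitions):
        merged[i * n // len(partitions)].extend(part)
    return merged


def repartition(partitions: List[Partition], n: int) -> List[Partition]:
    """Toy repartition: deal every record round-robin into n new partitions.

    Every record may move, which is what makes this a full shuffle.
    """
    out: List[Partition] = [[] for _ in range(n)]
    i = 0
    for part in partitions:
        for record in part:
            out[i % n].append(record)
            i += 1
    return out


# Four skewed partitions holding ten records in total.
parts = [[1, 2, 3, 4, 5, 6], [7], [8], [9, 10]]

print([len(p) for p in coalesce(parts, 2)])     # [7, 3]  cheap, but skew remains
print([len(p) for p in repartition(parts, 2)])  # [5, 5]  balanced, but every record moved
print(len(coalesce(parts, 8)))                  # 4       count cannot grow
print(len(repartition(parts, 8)))               # 8       count can grow
```

The trade-off the sketch illustrates: coalesce is the cheaper choice when only fewer partitions are needed (for example, before writing a small number of output files), while repartition is worth the shuffle cost when the data is skewed or more parallelism is required.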


Cognizant Pyspark Developer interview questions & answers

A Pyspark Developer was asked 5mo ago: Q. What is the difference between coalesce and repartition in data processing?
A Pyspark Developer was asked 5mo ago: Q. What is the difference between a DataFrame and an RDD (Resilient Distributed Dat...
A Pyspark Developer was asked 6mo ago: Q. What is the SQL code for calculating year-on-year growth percentage with year-wi...
