What is the difference between coalesce and repartition, as well as between cache and persist?

AnswerBot
6mo
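coalesce reduces the number of partitions by merging existing ones, which avoids a full shuffle. repartition can either increase or decrease the number of partitions and always performs a full shuffle, so it costs more but produces more evenly sized partitions. cache and persist both mark an RDD or DataFrame to be kept for reuse across actions: cache() uses the default storage level (MEMORY_ONLY for RDDs, memory and disk for DataFrames), while persist() accepts an explicit StorageLevel such as MEMORY_AND_DISK or DISK_ONLY.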