
I'm using Spark 3 and I'm observing that the cached partitions are getting dropped from memory. This is what I'm doing:

  1. caching a df (df1)
  2. applying a filter to the cached df, assigning the result to a new df (df2), and caching that as well

```python
from pyspark.sql.functions import col

df1 = spark.table("table1").cache()                    # fraction cached reaches 100% and then drops down
df2 = df1.filter(col("id").isin(1, 2, 3, 4)).cache()   # fraction cached slowly climbs back up
```

There is sufficient memory for both execution and storage: no spills observed, no evictions observed. Still, after the cached fraction of df1 reaches 98-100%, it starts dropping, and the percentage slowly picks up again as the second df is being cached. It looks as if Spark is recomputing the cache of the first df.
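For reference, this is roughly how I am materializing and inspecting the caches between the two steps (the `count()` actions and the table/column names are placeholders, not my exact production code):

```python
from pyspark.sql.functions import col

# Hypothetical sketch of the workflow: table/column names and the count()
# actions are illustrative only.
df1 = spark.table("table1").cache()
df1.count()              # action that fully materializes df1's cached partitions

print(df1.storageLevel)  # confirms df1 is persisted (cache() defaults to memory + disk)

df2 = df1.filter(col("id").isin(1, 2, 3, 4)).cache()
df2.count()              # materializes df2; I expected this to read from df1's cache
```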

Why are the cached partitions getting dropped? Why is it recomputing again? Any ideas?
