I am using the h2o-3 Java repo to load the frame shown below, but I keep running into memory issues with constant GC pressure.

The frame itself is 3.31 GB according to the H2O logs, but peak JVM usage reaches about 60G while the frame is being read.

Loaded H2O frame: Frame key: TestData.csv
   cols: 64
   rows: 16375903
 chunks: 64
   size: 3552977559
[4572.372s][info   ][gc,heap        ] GC(751) PSYoungGen: 37597471K(37605888K)->133505K(37607424K) Eden: 37463552K(37463552K)->0K(37465600K) From: 133919K(142336K)->133505K(141824K)
[4572.372s][info   ][gc,heap        ] GC(751) ParOldGen: 37023812K(75497472K)->37023812K(75497472K)
[4572.372s][info   ][gc,metaspace   ] GC(751) Metaspace: 149267K(150464K)->149267K(150464K) NonClass: 128211K(128896K)->128211K(128896K) Class: 21056K(21568K)->21056K(21568K)
[4572.372s][info   ][gc             ] GC(751) Pause Young (Allocation Failure) 72872M->36286M(110454M) 34.508ms
[4572.372s][info   ][gc,cpu         ] GC(751) User=0.41s Sys=0.01s Real=0.04s
[4573.512s][info   ][gc,start       ] GC(752) Pause Young (Allocation Failure)
[4573.547s][info   ][gc,heap        ] GC(752) PSYoungGen: 37599105K(37607424K)->134243K(37606912K) Eden: 37465600K(37465600K)->0K(37465600K) From: 133505K(141824K)->134243K(141312K)
[4573.547s][info   ][gc,heap        ] GC(752) ParOldGen: 37023812K(75497472K)->37025404K(75497472K)
[4573.547s][info   ][gc,metaspace   ] GC(752) Metaspace: 149267K(150464K)->149267K(150464K) NonClass: 128211K(128896K)->128211K(128896K) Class: 21056K(21568K)->21056K(21568K)
[4573.547s][info   ][gc             ] GC(752) Pause Young (Allocation Failure) 72873M->36288M(110453M) 34.940ms
[4573.547s][info   ][gc,cpu         ] GC(752) User=0.42s Sys=0.01s Real=0.04s
[4574.694s][info   ][gc,start       ] GC(753) Pause Young (Allocation Failure)
[4574.730s][info   ][gc,heap        ] GC(753) PSYoungGen: 37599843K(37606912K)->140646K(37603840K) Eden: 37465600K(37465600K)->0K(37463040K) From: 134243K(141312K)->140646K(140800K)
[4574.730s][info   ][gc,heap        ] GC(753) ParOldGen: 37025404K(75497472K)->37030581K(75497472K)
[4574.730s][info   ][gc,metaspace   ] GC(753) Metaspace: 149268K(150464K)->149268K(150464K) NonClass: 128211K(128896K)->128211K(128896K) Class: 21056K(21568K)->21056K(21568K)
[4574.730s][info   ][gc             ] GC(753) Pause Young (Allocation Failure) 72876M->36300M(110450M) 36.442ms
[4574.730s][info   ][gc,cpu         ] GC(753) User=0.45s Sys=0.00s Real=0.03s
[4575.808s][info   ][gc,start       ] GC(754) Pause Young (Allocation Failure)
[4575.844s][info   ][gc,heap        ] GC(754) PSYoungGen: 37603686K(37603840K)->135682K(37605888K) Eden: 37463040K(37463040K)->0K(37463040K) From: 140646K(140800K)->135682K(142848K)
[4575.844s][info   ][gc,heap        ] GC(754) ParOldGen: 37030581K(75497472K)->37038668K(75497472K)
[4575.844s][info   ][gc,metaspace   ] GC(754) Metaspace: 149268K(150464K)->149268K(150464K) NonClass: 128211K(128896K)->128211K(128896K) Class: 21056K(21568K)->21056K(21568K)
[4575.844s][info   ][gc             ] GC(754) Pause Young (Allocation Failure) 72885M->36303M(110452M) 36.014ms
[4575.844s][info   ][gc,cpu         ] GC(754) User=0.44s Sys=0.01s Real=0.03s
[4576.901s][info   ][gc,start       ] GC(755) Pause Young (Allocation Failure)
[4576.936s][info   ][gc,heap        ] GC(755) PSYoungGen: 37598450K(37605888K)->134747K(37606912K) Eden: 37462768K(37463040K)->0K(37464576K) From: 135682K(142848K)->134747K(142336K)
[4576.936s][info   ][gc,heap        ] GC(755) ParOldGen: 37038668K(75497472K)->37040005K(75497472K)
[4576.936s][info   ][gc,metaspace   ] GC(755) Metaspace: 149268K(150464K)->149268K(150464K) NonClass: 128211K(128896K)->128211K(128896K) Class: 21056K(21568K)->21056K(21568K)
[4576.936s][info   ][gc             ] GC(755) Pause Young (Allocation Failure) 72887M->36303M(110453M) 34.523ms
[4576.936s][info   ][gc,cpu         ] GC(755) User=0.42s Sys=0.00s Real=0.04s
[4577.987s][info   ][gc,start       ] GC(756) Pause Young (Allocation Failure)
[4578.021s][info   ][gc,heap        ] GC(756) PSYoungGen: 37599323K(37606912K)->135186K(37606400K) Eden: 37464576K(37464576K)->0K(37464576K) From: 134747K(142336K)->135186K(141824K)
[4578.021s][info   ][gc,heap        ] GC(756) ParOldGen: 37040005K(75497472K)->37040005K(75497472K)
[4578.021s][info   ][gc,metaspace   ] GC(756) Metaspace: 149268K(150464K)->149268K(150464K) NonClass: 128211K(128896K)->128211K(128896K) Class: 21056K(21568K)->21056K(21568K)
[4578.021s][info   ][gc             ] GC(756) Pause Young (Allocation Failure) 72889M->36303M(110453M) 34.372ms
[4578.021s][info   ][gc,cpu         ] GC(756) User=0.42s Sys=0.01s Real=0.03s
[4579.074s][info   ][gc,start       ] GC(757) Pause Young (Allocation Failure)
[4579.109s][info   ][gc,heap        ] GC(757) PSYoungGen: 37599762K(37606400K)->136628K(37607424K) Eden: 37464576K(37464576K)->0K(37466112K) From: 135186K(141824K)->136628K(141312K)
[4579.109s][info   ][gc,heap        ] GC(757) ParOldGen: 37040005K(75497472K)->37040005K(75497472K)
[4579.109s][info   ][gc,metaspace   ] GC(757) Metaspace: 149268K(150464K)->149268K(150464K) NonClass: 128211K(128896K)->128211K(128896K) Class: 21056K(21568K)->21056K(21568K)

Given the logs above, what configuration can I use to relieve the memory pressure on the JVM? I am parsing this large data frame in H2O on EMR Serverless, with 108G of memory and 16 cores per executor, but I still run into memory issues while parsing this single large file.
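
To make the numbers in the GC log above easier to compare with the reported frame size, here is a minimal sketch that parses two of the `Pause Young` summary lines and derives the heap retained after collection and the approximate allocation volume between pauses. The class and method names (`GcPauseSummary`, `parse`, `allocatedBetweenMb`, `allocationRateMbPerSec`) are hypothetical helpers and not part of h2o-3 or the JDK. The sketch assumes that `M` in the log means MiB and that the timestamp on each summary line is a good enough approximation of when the pause happened.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading "Pause Young" summary lines from the log above.
class GcPauseSummary {
    // Matches lines such as:
    // [4572.372s][info   ][gc  ] GC(751) Pause Young (Allocation Failure) 72872M->36286M(110454M) 34.508ms
    // The "gc,start" lines carry no sizes and therefore do not match.
    private static final Pattern PAUSE = Pattern.compile(
        "\\[(\\d+\\.\\d+)s\\].*GC\\((\\d+)\\) Pause Young.* (\\d+)M->(\\d+)M\\((\\d+)M\\) (\\d+\\.\\d+)ms");

    static final class Pause {
        final double uptimeSec;  // JVM uptime printed at the start of the line
        final int gcId;          // number inside GC(...)
        final long beforeMb;     // heap used before the collection
        final long afterMb;      // heap used after the collection
        final long heapMb;       // current heap capacity
        final double pauseMs;    // pause duration

        Pause(double uptimeSec, int gcId, long beforeMb, long afterMb, long heapMb, double pauseMs) {
            this.uptimeSec = uptimeSec;
            this.gcId = gcId;
            this.beforeMb = beforeMb;
            this.afterMb = afterMb;
            this.heapMb = heapMb;
            this.pauseMs = pauseMs;
        }
    }

    // Returns null when the line is not a "Pause Young" summary line.
    static Pause parse(String line) {
        Matcher m = PAUSE.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new Pause(
            Double.parseDouble(m.group(1)),
            Integer.parseInt(m.group(2)),
            Long.parseLong(m.group(3)),
            Long.parseLong(m.group(4)),
            Long.parseLong(m.group(5)),
            Double.parseDouble(m.group(6)));
    }

    // Heap allocated between the end of one pause and the start of the next.
    static long allocatedBetweenMb(Pause prev, Pause next) {
        return next.beforeMb - prev.afterMb;
    }

    // Approximate allocation rate, using the summary-line timestamps.
    static double allocationRateMbPerSec(Pause prev, Pause next) {
        return allocatedBetweenMb(prev, next) / (next.uptimeSec - prev.uptimeSec);
    }

    public static void main(String[] args) {
        // Two consecutive summary lines copied from the log above.
        Pause a = parse("[4572.372s][info   ][gc             ] GC(751) Pause Young (Allocation Failure) 72872M->36286M(110454M) 34.508ms");
        Pause b = parse("[4573.547s][info   ][gc             ] GC(752) Pause Young (Allocation Failure) 72873M->36288M(110453M) 34.940ms");

        long frameBytes = 3552977559L;                 // "size" from the frame summary
        double frameMb = frameBytes / (1024.0 * 1024.0);

        System.out.printf("Frame size: %.0f MiB%n", frameMb);
        System.out.printf("Heap retained after GC(%d): %d MiB (%.1fx the frame size)%n",
            b.gcId, b.afterMb, b.afterMb / frameMb);
        System.out.printf("Allocated between GC(%d) and GC(%d): %d MiB (about %.0f MiB/s)%n",
            a.gcId, b.gcId, allocatedBetweenMb(a, b), allocationRateMbPerSec(a, b));
    }
}
```

Read this way, the log shows roughly 36,300M still in use after every young collection, almost all of it in ParOldGen, while Eden refills and is emptied about once per second. That suggests two separate effects: a retained set much larger than the 3.31 GB frame, and a high rate of short-lived allocation on top of it.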

  • Welcome to Stack Overflow! Please clarify your specific problem or add additional details to highlight exactly what you need. As it's currently written, it’s hard to tell exactly what you're asking. See How to Ask for help clarifying this question. Include the desired behavior, a specific problem or error, and the shortest code necessary to reproduce the issue. Please edit your question to add a minimal reproducible example. Commented Sep 29 at 19:51
