Skip to content

ESQL: Reserve memory TopN (#134235)#134331

Merged
nik9000 merged 2 commits intoelastic:8.19from
nik9000:esql_reserve_memory_for_lucene_topn_8_19
Sep 9, 2025
Merged

ESQL: Reserve memory TopN (#134235)#134331
nik9000 merged 2 commits intoelastic:8.19from
nik9000:esql_reserve_memory_for_lucene_topn_8_19

Conversation

@nik9000
Copy link
Member

@nik9000 nik9000 commented Sep 8, 2025

Tracks the more memory that's involved in topn.

Lucene doesn't track memory usage for TopN and can use a fair bit of it. Try this query:

FROM big_table
| SORT a, b, c, d, e
| LIMIT 1000000
| STATS MAX(a)

We attempt to return all million documents from lucene. Is we did this with the compute engine we're track all of the memory usage. With lucene we have to reserve it.

In the case of the query above the sort keys weight 8 bytes each. 40 bytes total. Plus another 72 for Lucene's FieldDoc. And another 40 at least for copying to the values to FieldDoc. That totals something like 152 bytes a piece. That's 145mb. Worth tracking!

Esql Engine TopN

Esql does track memory for topn, but it doesn't track the memory used by the min heap itself. It's just a big array of pointers. But it can get very big!

Tracks the more memory that's involved in topn.

Lucene doesn't track memory usage for TopN and can use a fair bit of it.
Try this query:
```
FROM big_table
| SORT a, b, c, d, e
| LIMIT 1000000
| STATS MAX(a)
```

We attempt to return all million documents from lucene. Is we did this
with the compute engine we're track all of the memory usage. With lucene
we have to reserve it.

In the case of the query above the sort keys weight 8 bytes each. 40
bytes total. Plus another 72 for Lucene's `FieldDoc`. And another 40 at
least for copying to the values to `FieldDoc`. That totals something
like 152 bytes a piece. That's 145mb. Worth tracking!

 ## Esql Engine TopN

Esql *does* track memory for topn, but it doesn't track the memory used by the min heap itself. It's just a big array of pointers. But it can get very big!
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Sep 8, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@nik9000 nik9000 disabled auto-merge September 9, 2025 17:00
@nik9000 nik9000 merged commit 5bb46e6 into elastic:8.19 Sep 9, 2025
23 checks passed
sarog pushed a commit to portsbuild/elasticsearch that referenced this pull request Sep 11, 2025
Tracks the more memory that's involved in topn.

Lucene doesn't track memory usage for TopN and can use a fair bit of it.
Try this query:
```
FROM big_table
| SORT a, b, c, d, e
| LIMIT 1000000
| STATS MAX(a)
```

We attempt to return all million documents from lucene. Is we did this
with the compute engine we're track all of the memory usage. With lucene
we have to reserve it.

In the case of the query above the sort keys weight 8 bytes each. 40
bytes total. Plus another 72 for Lucene's `FieldDoc`. And another 40 at
least for copying to the values to `FieldDoc`. That totals something
like 152 bytes a piece. That's 145mb. Worth tracking!

 ## Esql Engine TopN

Esql *does* track memory for topn, but it doesn't track the memory used by the min heap itself. It's just a big array of pointers. But it can get very big!
sarog pushed a commit to portsbuild/elasticsearch that referenced this pull request Sep 19, 2025
Tracks the more memory that's involved in topn.

Lucene doesn't track memory usage for TopN and can use a fair bit of it.
Try this query:
```
FROM big_table
| SORT a, b, c, d, e
| LIMIT 1000000
| STATS MAX(a)
```

We attempt to return all million documents from lucene. Is we did this
with the compute engine we're track all of the memory usage. With lucene
we have to reserve it.

In the case of the query above the sort keys weight 8 bytes each. 40
bytes total. Plus another 72 for Lucene's `FieldDoc`. And another 40 at
least for copying to the values to `FieldDoc`. That totals something
like 152 bytes a piece. That's 145mb. Worth tracking!

 ## Esql Engine TopN

Esql *does* track memory for topn, but it doesn't track the memory used by the min heap itself. It's just a big array of pointers. But it can get very big!
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/ES|QL AKA ESQL backport >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.19.4

2 participants