ESQL: Track memory in evaluators by nik9000 · Pull Request #133392 · elastic/elasticsearch

nik9000 · 2025-08-22T13:09:43Z

If you write very very large ESQL queries you can spend a lot of memory on the expression evaluators themselves. You can certainly do it in real life, but our tests do something like:

FROM foo
| EVAL a0001 = n + 1
| EVAL a0002 = a0001 + 1
| EVAL a0003 = a0002 + 1
...
| EVAL a5000 = a4999 + 1
| STATS MAX(a5000)

Each evaluator costs roughly 200 bytes. For thousands of evaluators this adds up. So! We have to track it. This prevents OOMs in these semi-degenerate cases; the query fails with a `CircuitBreakerException` instead.
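To make the scale concrete, here is a minimal sketch that builds the chained-`EVAL` query shown above and multiplies out the rough per-evaluator cost. The class `DegenerateQuery` and its methods are hypothetical helpers for illustration, not part of Elasticsearch or its test suite, and the estimate assumes one ~200 byte evaluator per `EVAL`, which is a simplification.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not part of Elasticsearch.
final class DegenerateQuery {
    // Rough per-evaluator figure quoted in the PR description (an estimate, not a measurement).
    static final long APPROX_BYTES_PER_EVALUATOR = 200;

    // Builds "FROM foo | EVAL a0001 = n + 1 | EVAL a0002 = a0001 + 1 | ... | STATS MAX(aNNNN)".
    static String build(int evals) {
        StringBuilder q = new StringBuilder("FROM foo\n");
        q.append("| EVAL a0001 = n + 1\n");
        for (int i = 2; i <= evals; i++) {
            // Each EVAL depends on the column produced by the previous one.
            q.append(String.format(Locale.ROOT, "| EVAL a%04d = a%04d + 1\n", i, i - 1));
        }
        q.append(String.format(Locale.ROOT, "| STATS MAX(a%04d)", evals));
        return q.toString();
    }

    // Assumption: one evaluator of roughly 200 bytes per EVAL.
    static long estimatedEvaluatorBytes(int evals) {
        return evals * APPROX_BYTES_PER_EVALUATOR;
    }

    public static void main(String[] args) {
        System.out.println(build(3));
        System.out.println(estimatedEvaluatorBytes(5000) + " bytes for 5000 EVALs (rough estimate)");
    }
}
```

Under that assumption, 5000 chained `EVAL`s come to about 1 MB of evaluator objects per built chain, which is small on its own but was previously invisible to the circuit breaker.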

Nhat had suggested charging a flat 200 bytes per evaluator. I thought about it and decided that it'd be pretty easy to get the actual size. Most of the evaluators are generated and it's a fairly small change to the generated code to pick that up. So I did.

We do build the evaluators before we cost them, but that's fine because they are very very small. So long as we account for them, I think it's safe.
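The build-then-account order described above can be sketched with toy types. Everything here is an assumption made for illustration: `ToyBreaker`, `ToyBreakerException`, `ToyEvaluator`, and the byte sizes are hypothetical stand-ins, not the Elasticsearch `CircuitBreaker` API or the code this PR generates. The method name `baseRamBytesUsed` is also just a placeholder for "ask the evaluator how big it is".

```java
// A minimal sketch under stated assumptions; none of these types exist in Elasticsearch.
final class EvaluatorAccountingSketch {
    // Hypothetical stand-in for the CircuitBreakerException mentioned in the description.
    static final class ToyBreakerException extends RuntimeException {
        ToyBreakerException(String message) {
            super(message);
        }
    }

    // Hypothetical breaker: tracks charged bytes and refuses to go over a fixed limit.
    static final class ToyBreaker {
        private final long limitBytes;
        private long usedBytes;

        ToyBreaker(long limitBytes) {
            this.limitBytes = limitBytes;
        }

        void charge(long bytes) {
            if (usedBytes + bytes > limitBytes) {
                throw new ToyBreakerException("would use " + (usedBytes + bytes) + " bytes, limit is " + limitBytes);
            }
            usedBytes += bytes;
        }

        long used() {
            return usedBytes;
        }
    }

    // Each evaluator reports its own size instead of being charged a flat amount.
    interface ToyEvaluator {
        long baseRamBytesUsed();
    }

    static final class Literal implements ToyEvaluator {
        static final long SHALLOW_BYTES = 24; // made-up size for the sketch

        public long baseRamBytesUsed() {
            return SHALLOW_BYTES;
        }
    }

    static final class Add implements ToyEvaluator {
        static final long SHALLOW_BYTES = 32; // made-up size for the sketch
        private final ToyEvaluator lhs;
        private final ToyEvaluator rhs;
        private final long ramBytes;

        Add(ToyEvaluator lhs, ToyEvaluator rhs) {
            this.lhs = lhs;
            this.rhs = rhs;
            // Computed once at construction so deep chains need no recursion later.
            this.ramBytes = SHALLOW_BYTES + lhs.baseRamBytesUsed() + rhs.baseRamBytesUsed();
        }

        public long baseRamBytesUsed() {
            return ramBytes;
        }
    }

    // Build the (small) evaluator chain first, then charge the breaker for its reported size.
    static ToyEvaluator buildAndCharge(int adds, ToyBreaker breaker) {
        ToyEvaluator e = new Literal();
        for (int i = 0; i < adds; i++) {
            e = new Add(e, new Literal());
        }
        breaker.charge(e.baseRamBytesUsed());
        return e;
    }

    public static void main(String[] args) {
        ToyBreaker roomy = new ToyBreaker(1_000);
        buildAndCharge(10, roomy);
        System.out.println("charged " + roomy.used() + " bytes");
        try {
            buildAndCharge(5000, new ToyBreaker(100_000));
        } catch (ToyBreakerException ex) {
            System.out.println("breaker tripped: " + ex.getMessage());
        }
    }
}
```

The point of the design is that the objects exist briefly before they are accounted for. That is tolerable only because each one is tiny; what matters is that the total is eventually charged so a degenerate query trips the breaker rather than exhausting the heap.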

elasticsearchmachine · 2025-08-22T13:10:07Z

Pinging @elastic/es-analytical-engine (Team:Analytics)

elasticsearchmachine · 2025-08-22T13:10:08Z

Hi @nik9000, I've created a changelog YAML for you.

nik9000 · 2025-08-22T13:29:30Z

If you are reading this for the first time, start here: https://github.com/elastic/elasticsearch/pull/133392/files#diff-411bc9dd7ffd062b664e3e2dc83482512d03a14cf15cb281f0f1897276fbffe5R68

dnhatn

Thank you for fixing this. I see you labeled this as a bug; should we also backport it to 9.1 and 8.19?

nik9000 · 2025-08-22T16:14:50Z

> I see you labeled this as a bug; should we also backport it to 9.1 and 8.19?

Probably. I think I can do that.

nik9000 · 2025-08-26T15:54:05Z

Thanks friends!

nik9000 · 2025-08-26T15:54:46Z

I'll backport this by hand.

When we compile the code for `CONTAINS` we generate an evaluator Java class and commit that, as is our ancient custom. But because elastic#133016 didn't see elastic#133392, we committed out-of-date code. That's fine because we regenerate the code on every compile. But it's annoying because every clone is out of date. This updates the generated file. You may be asking "why do you commit the generated code if you just generate it at compile time?" That's a good question! It's a grand tradition, one that we will probably one day leave behind. But let's celebrate it today by committing more code.

ESQL: Track memory in evaluators (elastic#133392) was merged to main at the same time as Add MV_CONTAINS function (elastic#133099), which caused a compile error, and the merge was reverted. This commit addresses the compile error.

nik9000 requested a review from dnhatn August 22, 2025 13:09

nik9000 added the >bug, :Analytics/ES|QL, and v9.2.0 labels Aug 22, 2025

elasticsearchmachine added the Team:Analytics label Aug 22, 2025

Update docs/changelog/133392.yaml

5d8ed41

nik9000 added 2 commits August 22, 2025 09:10

Remove extra

8345edc

Merge remote-tracking branch 'nik9000/esql_track_evaluators' into esql_track_evaluators

1ec34c7

dnhatn approved these changes Aug 22, 2025

nik9000 added v9.1.0 v8.19.3 labels Aug 22, 2025

nik9000 and others added 6 commits August 22, 2025 12:28

Merge branch 'main' into esql_track_evaluators

3906ac7

Fix

b6faf68

[CI] Auto commit changes from spotless

6de8e82

Format!

19d2815

Merge remote-tracking branch 'nik9000/esql_track_evaluators' into esql_track_evaluators

a1b2639

Merge branch 'main' into esql_track_evaluators

af207f4

ivancea approved these changes Aug 25, 2025

elasticsearchmachine added v8.19.4 and removed v8.19.3 labels Aug 26, 2025

nik9000 merged commit f32c348 into elastic:main Aug 26, 2025
33 checks passed

nik9000 mentioned this pull request Aug 26, 2025

ESQL: Update generated code #133594

Merged

mjmbischoff mentioned this pull request Aug 27, 2025

Esql mv_contains function #133636

Merged

nik9000 merged 10 commits into elastic:main from nik9000:esql_track_evaluators
