ES|QL - Avoid retrieving unnecessary fields on node-reduce phase by carlosdelest · Pull Request #137920 · elastic/elasticsearch

carlosdelest · 2025-11-11T16:41:42Z

Avoids retrieving unnecessary fields in FieldExtractorExec on the node-reduce phase.

This PR checks that the fields retrieved are either:

Present on the TopN operator on the node-reduce phase, or
Needed on the top level Project for the node-reduce phase

This way, a query like:

FROM test METADATA _score
| WHERE knn(vector, [0, 1, 2])
| SORT _score DESC
| LIMIT 10
| KEEP id, _score

will not retrieve the vector field on the node-reduce phase, so the field is never loaded from source.
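The selection rule described above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the planner code from this PR: plain field names stand in for ES|QL attributes, and the class and method names (`FieldPruningSketch`, `fieldsToKeep`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: plain strings stand in for ES|QL attributes.
final class FieldPruningSketch {

    // Keeps a field only if the TopN sort or the top-level projection needs it.
    static List<String> fieldsToKeep(List<String> available, Set<String> sortFields, Set<String> projectedFields) {
        List<String> kept = new ArrayList<>();
        for (String field : available) {
            if (sortFields.contains(field) || projectedFields.contains(field)) {
                kept.add(field);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Mirrors the example query: SORT _score DESC ... KEEP id, _score
        List<String> kept = fieldsToKeep(
            List.of("id", "vector", "_score"),
            Set.of("_score"),
            Set.of("id", "_score")
        );
        // vector is needed by neither the sort nor the projection, so it is dropped
        System.out.println(kept);
    }
}
```

In this model the vector field is still available to the knn filter earlier in the plan; it simply is not part of what the node-reduce phase has to extract.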


GalLalouche

Hey @carlosdelest, thanks for taking the time to fix this! I left a couple of small comments, but otherwise LGTM. You might want one of the planning folks to take a look, though :)

GalLalouche · 2025-11-12T11:58:42Z

...lugin/esql/src/main/java/org/elasticsearch/xpack/esql/plugin/LateMaterializationPlanner.java

            return Optional.empty();
        }

+        // Calculate the expected output attributes for the data driver plan.


Nit: redundant comment (You can rename the variable to expectedDataDriverOutputAttrs if you wanted to, or extract this code to a method if you think it warrants a header.)

Unaddressed.

Addressed as part of d033d14


carlosdelest · 2025-11-12T13:51:38Z

Thanks for reviewing @GalLalouche ! I'll definitely incorporate your suggestions, add some testing, and open up to more folks. 👍


elasticsearchmachine · 2025-11-21T06:33:13Z

Hi @carlosdelest, I've created a changelog YAML for you.

elasticsearchmachine · 2025-11-21T06:33:37Z

Pinging @elastic/es-analytical-engine (Team:Analytics)

elasticsearchmachine · 2025-11-21T06:33:38Z

Pinging @elastic/es-search-relevance (Team:Search Relevance)

carlosdelest · 2025-11-21T06:34:45Z

Hi @GalLalouche ! I've addressed your comments and added some testing. Can you please give this another review?

GalLalouche · 2025-11-24T14:15:38Z

...lugin/esql/src/main/java/org/elasticsearch/xpack/esql/plugin/LateMaterializationPlanner.java

            return Optional.empty();
        }

+        // Calculate the expected output attributes for the data driver plan.


Unaddressed.


GalLalouche · 2025-11-24T14:17:19Z

Hi @GalLalouche ! I've addressed your comments and added some testing. Can you please give this another review?

LGTM, but I think it might warrant a look from someone from planning as well (I know @alex-spies had a lot of great comments on the original PR!).

bpintea

Do we have any tests like suggested in the issue or the PR description?

bpintea · 2025-11-25T14:56:46Z

...lugin/esql/src/main/java/org/elasticsearch/xpack/esql/plugin/LateMaterializationPlanner.java

-        List<Attribute> expectedDataOutput = toPhysical(topN, context).output();
-        Attribute doc = expectedDataOutput.stream().filter(EsQueryExec::isDocAttribute).findFirst().orElse(null);
+
+        List<Attribute> physicalDataOutput = toPhysical(topN, context).output();


Nit/optional: no "data" is involved so far, just its characterisation.

Suggested change

-        List<Attribute> physicalDataOutput = toPhysical(topN, context).output();
+        List<Attribute> physicalPlanOutput = toPhysical(topN, context).output();

or something like "..dataNode..".

Addressed in d033d14

bpintea · 2025-11-25T15:04:18Z

...lugin/esql/src/main/java/org/elasticsearch/xpack/esql/plugin/LateMaterializationPlanner.java

+        AttributeSet.Builder expectedDataOutputAttrs = AttributeSet.builder();
        // We need to add the doc attribute to the project since otherwise when the fragment is converted to a physical plan for the data
        // driver, the resulting ProjectExec won't have the doc attribute in its output, which is needed by the reduce driver.
+        expectedDataOutputAttrs.add(doc);


I'd drop this line and add the isDocAttribute condition to the filter below, since the doc attribute is already present in physicalDataOutput. (This might also simplify the constructs here a bit.)

Makes sense, addressed in d033d14

bpintea · 2025-11-25T15:05:58Z

...lugin/esql/src/main/java/org/elasticsearch/xpack/esql/plugin/LateMaterializationPlanner.java

+        AttributeSet orderRefsSet = AttributeSet.of(topN.order().stream().flatMap(o -> o.references().stream()).toList());
+        // Get the output from the physical plan below the TopN, and filter it to only the attributes needed for the final output (either
+        // because they are in the top-level Project's output, or because they are needed for ordering)
+        expectedDataOutputAttrs.addAll(
+            physicalDataOutput.stream().filter(a -> topLevelProject.outputSet().contains(a) || orderRefsSet.contains(a)).toList()
+        );
+        List<Attribute> expectedDataOutput = expectedDataOutputAttrs.build().stream().toList();


Optional: this section is very streams-heavy and while they might make the code a bit more compact(?), for the few attributes that they'll likely usually handle, I'd just iterate on lists instead.

++, ab1c94b
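As a rough illustration of the two suggestions in this review (iterating over a list instead of chaining streams, and folding the doc-attribute check into the same condition), the logic could look something like the sketch below. This is not the code from ab1c94b or d033d14: `Attr` is a hypothetical stand-in for the planner's Attribute type, its `isDoc` flag stands in for EsQueryExec::isDocAttribute, and names are compared as plain strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: Attr is a stand-in for the planner's Attribute type.
final class DataOutputSketch {

    static final class Attr {
        final String name;
        final boolean isDoc; // stand-in for EsQueryExec::isDocAttribute

        Attr(String name, boolean isDoc) {
            this.name = name;
            this.isDoc = isDoc;
        }
    }

    // A plain loop instead of chained streams. The doc attribute is kept by the
    // same condition as the projected and ordering attributes.
    static List<String> expectedDataOutput(List<Attr> physicalPlanOutput, Set<String> projectOutput, Set<String> orderRefs) {
        List<String> expected = new ArrayList<>();
        for (Attr attr : physicalPlanOutput) {
            if (attr.isDoc || projectOutput.contains(attr.name) || orderRefs.contains(attr.name)) {
                expected.add(attr.name);
            }
        }
        return expected;
    }
}
```

Because the doc attribute already appears in the physical plan output, a single pass preserves its position and no separate builder step is needed to add it.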


carlosdelest · 2025-11-25T17:26:57Z

Do we have any tests like suggested in the issue or the PR description?

@bpintea we have tests like that:

CSV tests for the KNN function
The existing late materialization tests, with some more added

Are there any other tests you're missing?

carlosdelest · 2025-11-26T07:11:17Z

@elasticmachine run elasticsearch-ci/part-2

bpintea

Thanks Carlos.
Regarding tests, I was thinking of something like adding a filter on a field that is not projected (or is projected away), e.g. from test | where filtered > 0 | sort sorted | limit 42 | stats sum(read), and checking that in EsqlReductionLateMaterializationTestCase. If there's one like that already, all good.

alex-spies

This is great, thanks Carlos! It's essentially a specialized ProjectAwayColumns for the updated data node plan, which is nice and consistent with the planning that the coordinator does.

I agree with Bogdan: a test case or two in EsqlReductionLateMaterializationTestCase.java where only the updated data node plan requires a field, but it's correctly projected away at the end of the data node plan because the reduce plan doesn't need it - that'd be nice.


carlosdelest · 2025-11-26T12:16:53Z

Got it, thanks @bpintea and @alex-spies for explaining!

I added more tests in c387814, LMKWYT!


Late materialization retrieves field data just for ordering relations and original top level projection

698c705

carlosdelest added the Team:Analytics, :Analytics/ES|QL, Team:Search Relevance, and :Search Relevance/ES|QL labels Nov 11, 2025

elasticsearchmachine added the v9.3.0 label Nov 11, 2025

GalLalouche reviewed Nov 12, 2025


elasticsearchmachine added 2 commits November 18, 2025 07:43

Merge remote-tracking branch 'origin/main' into enhancement/esql-late-materialization-unnecessary-fields

55a1ec1

Code review comments

d62b4c1

carlosdelest changed the title ~~PoC - Avoid retrieving unnecessary fields on node-reduce phase~~ Nov 20, 2025

elasticsearchmachine and others added 4 commits November 20, 2025 18:51

Add integration tests

142d36b

Remove WIP test

f680062

Remove unnecessary change

2aca272

Merge branch 'main' into enhancement/esql-late-materialization-unnecessary-fields

bca4cbd

carlosdelest added the >enhancement label Nov 21, 2025

Update docs/changelog/137920.yaml

b181c68

carlosdelest marked this pull request as ready for review November 21, 2025 06:33

carlosdelest changed the title ~~Avoid retrieving unnecessary fields on node-reduce phase~~ Nov 21, 2025

carlosdelest requested a review from GalLalouche November 21, 2025 06:33

Fix changelog

d728df9

carlosdelest requested a review from a team November 24, 2025 08:19

GalLalouche approved these changes Nov 24, 2025


carlosdelest requested review from alex-spies and bpintea November 24, 2025 14:34

bpintea reviewed Nov 25, 2025


elasticsearchmachine added 5 commits November 25, 2025 18:13

PR comments

d033d14

Merge remote-tracking branch 'origin/main' into enhancement/esql-late-materialization-unnecessary-fields

3a4ea1e

PR comments

5d133ef

PR comments

ab1c94b

Merge remote-tracking branch 'carlosdelest/enhancement/esql-late-materialization-unnecessary-fields' into enhancement/esql-late-materialization-unnecessary-fields

ed44e71

carlosdelest requested a review from bpintea November 25, 2025 17:27

bpintea approved these changes Nov 26, 2025


alex-spies approved these changes Nov 26, 2025


elasticsearchmachine added 2 commits November 26, 2025 13:15

Merge remote-tracking branch 'origin/main' into enhancement/esql-late-materialization-unnecessary-fields

a8ca463

Add tests for fields that are needed in other places than sort

c387814

Merge branch 'main' into enhancement/esql-late-materialization-unnecessary-fields

2a9223d

carlosdelest merged commit 0955a82 into elastic:main Nov 27, 2025
34 checks passed

carlosdelest merged 17 commits into elastic:main from carlosdelest:enhancement/esql-late-materialization-unnecessary-fields


5 participants
