ESQL: Calculate concurrent node limit #124901
Conversation
```java
 * Used to avoid overloading the cluster with concurrent requests that may not be needed.
 * </p>
 *
 * @return Null if there should be no limit, otherwise, the maximum number of nodes that should be executed concurrently.
```
This could be -1 for no limit, to work like the pragma
I'd probably move this to the class javadoc.
```java
 * @return Null if there should be no limit, otherwise, the maximum number of nodes that should be executed concurrently.
 */
public Integer calculateNodesConcurrency(PhysicalPlan dataNodePlan, Configuration configuration) {
    // TODO: Request FoldContext or a context containing it
```
This is needed for the `limit.limit().fold(...)` call. But we can probably assert that it's a `Literal` and avoid folding.
```java
    // TODO: Do some conversion here
    return limit;
```
The logic we choose here may be quite arbitrary without some real statistics of the nodes/shards.
I would probably limit to 2 for everything up to 10. Or maybe something like `Math.max(2, log(limit))`.
```java
// # Negative cases
// - FROM | STATS: Fragment[EsRelation, Aggregate]
// - SORT: Fragment[EsRelation, TopN]
// - WHERE: Fragment[EsRelation, Filter]
```
When getting the LIMIT value:
- The WHERE is already taken into account explicitly.
- The STATS can't have a LIMIT on the data node side, so it's fine.
- The SORT shouldn't happen, as we look for a `Limit` after the `EsRelation`, and the `Limit` would be a `TopN` otherwise.

Those are mostly assumptions; there's still a lot of testing to do with different commands that could break them.
luigidellaquila left a comment
I had a first quick look and left a couple of comments.
In general, for now I think it's acceptable to have this component at this level, as it's simple enough; but on the other hand, it could benefit from some additional information (e.g. LocalPhysicalOptimizerContext and SearchStats) that is available at physical planning time.
More abstractly, this should be part of a cost-based execution planning process, but that's way too complicated a topic for now.
```java
} else if (relationFound.get() && filterFound.get() == false) {
    // We only care about the limit if there's a relation before it, and no filter in between
    if (node instanceof Limit limit) {
        assert limitValue.get() == null : "Multiple limits found in the same data node plan";
```
This could still happen, e.g. with `MV_EXPAND | LIMIT`, which becomes `LIMIT | MV_EXPAND | LIMIT`.
Removed that assertion to just use the first limit it finds, which is what makes sense in any case.
```java
logicalPlan.forEachUp(node -> {
    if (node instanceof EsRelation) {
        relationFound.set(true);
    } else if (node instanceof Filter) {
```
I'm not sure this blacklisting is safe in the long term.
I'd prefer a whitelist approach, i.e. a set of plan types that can be present after `EsRelation` and that we know are safe to ignore before a LIMIT.
Initially changed it to a whitelist, but after adding tests for every command, `Limit` is effectively pushed down always. So now it's just an "if not a relation or limit -> 💀"
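That "if not a relation or limit, bail" walk can be sketched in isolation. This is a hedged, self-contained toy: the `Node`/`EsRelation`/`Limit`/`Other` types below are stand-ins for illustration, not the real ESQL plan classes, and the real code walks a `LogicalPlan` with `forEachUp` rather than a list.

```java
import java.util.List;

// Toy model of the bottom-up plan walk: return the LIMIT value only when a
// Limit follows the EsRelation with nothing else in between; anything else
// after the relation means we can't safely bound node concurrency.
public class LimitFinder {
    sealed interface Node permits EsRelation, Limit, Other {}
    record EsRelation() implements Node {}
    record Limit(int value) implements Node {}
    record Other(String name) implements Node {} // e.g. Filter, Aggregate, TopN

    /** Returns the LIMIT value, or null if no safe limit was found. */
    static Integer findLimit(List<Node> bottomUp) {
        boolean relationFound = false;
        for (Node node : bottomUp) {
            if (node instanceof EsRelation) {
                relationFound = true;
            } else if (relationFound && node instanceof Limit limit) {
                return limit.value(); // first limit wins
            } else if (relationFound) {
                return null; // any other node between relation and limit -> bail
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // FROM | LIMIT 1024
        System.out.println(findLimit(List.of(new EsRelation(), new Limit(1024)))); // 1024
        // FROM | WHERE ... | LIMIT 10: the filter blocks the shortcut
        System.out.println(findLimit(List.of(new EsRelation(), new Other("Filter"), new Limit(10)))); // null
    }
}
```

Under this model, the `MV_EXPAND | LIMIT` case discussed above naturally takes the first (pushed-down) limit, matching the "first limit wins" behavior.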
```java
// 10     | 3
// 1000   | 9
// 100000 | 16
return Math.max(2, (int) (Math.log(limit) / Math.log(2)));
```
Let's not limit queries with limits higher than 1000 for now.
It might become slower when querying a lot of shards with a small number of concurrent nodes.
```java
// Limit  | Concurrency
// 1      | 2
// 10     | 3
// 1000   | 9
```

Above makes sense, but I would like to confirm with @costin about it.
I'm fine with this heuristic. You can always override it.
Do we get here with `| LIMIT 0`? Could you make sure we have tests for that?
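To trace the `| LIMIT 0` question concretely, here is the heuristic from the diff on its own (a standalone sketch; `concurrencyFor` is a made-up name): for a limit of 0, `Math.log(0)` is negative infinity, the `int` cast clamps that to `Integer.MIN_VALUE`, and the outer `max` still returns 2, assuming a 0 literal actually reaches this code.

```java
public class NodesConcurrency {
    // The heuristic from the diff above: max(2, floor(log2(limit)))
    static int concurrencyFor(int limit) {
        return Math.max(2, (int) (Math.log(limit) / Math.log(2)));
    }

    public static void main(String[] args) {
        System.out.println(concurrencyFor(1));      // 2
        System.out.println(concurrencyFor(10));     // 3
        System.out.println(concurrencyFor(1000));   // 9
        System.out.println(concurrencyFor(100000)); // 16
        // LIMIT 0: Math.log(0) is -Infinity; the int cast clamps to
        // Integer.MIN_VALUE, and max(2, ...) still yields 2.
        System.out.println(concurrencyFor(0));      // 2
    }
}
```

So the heuristic itself does not misbehave on 0; whether `LIMIT 0` queries should skip this path entirely is a separate question for the tests.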
```java
| LOOKUP JOIN languages_lookup on language_code
| LIMIT 1024
""", 10);
}
```
I wish we had JUnit 5 parametrization for this: https://junit.org/junit5/docs/current/user-guide/#writing-tests-parameterized-tests-sources-CsvSource
I was looking for that when doing it, but we only have class-level parameterized tests...
I was also checking assertAll(), but, of course, that's JUnit 5 too 💀
Luckily there aren't that many cases now; we can refactor them if we add more and they're similar.
Pinging @elastic/es-analytical-engine (Team:Analytics)

Hi @ivancea, I've created a changelog YAML for you.
```java
// 10     | 3
// 1000   | 9
// 100000 | 16
return Math.max(2, (int) (Math.log(limit) / Math.log(2)));
```
Super-duper driveby: maybe it's simpler to take `31 - Integer.numberOfLeadingZeros(limit)` to compute the log2.
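For the sample values documented in the diff, that bit trick agrees with the floating-point version; this quick check is a sketch (class and method names are made up, not from the PR). One caveat worth a test: `Integer.numberOfLeadingZeros(0)` is 32, so the bit trick yields -1 for a limit of 0, while the floating-point path goes through negative infinity; both are then clamped by the surrounding `Math.max(2, ...)`.

```java
public class Log2Compare {
    // Floating-point floor(log2(limit)), as in the current diff
    static int floatingLog2(int limit) {
        return (int) (Math.log(limit) / Math.log(2));
    }

    // The suggested bit trick: floor(log2(limit)) with no floating point
    static int bitLog2(int limit) {
        return 31 - Integer.numberOfLeadingZeros(limit);
    }

    public static void main(String[] args) {
        for (int limit : new int[] { 10, 1000, 100000 }) {
            // Both produce the same floor(log2) for these sample values: 3, 9, 16
            System.out.println(limit + " -> " + floatingLog2(limit) + " / " + bitLog2(limit));
        }
    }
}
```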
nik9000 left a comment
LGTM so long as we have tests for `| LIMIT 0` and we're sure this doesn't break them.
It'd be cool to see this on tests for bigger clusters. I bet it'll be compelling. I'm really curious to see a follow-up that lets us apply this to things like `FROM | SORT | LIMIT` - that's trickier, but it'll be lovely one day!
```java
// 1      | 2
// 10     | 3
// 1000   | 9
// 100000 | 16
```
This example would violate the above.
Calculate the maximum concurrent nodes for a query, based on whether the data node plan has a limit or not (and no other conditions/nodes before it). The concurrency limit is calculated as `log2(limit)`.
Also, changed the query pragma to not have an upper limit, allowing users to effectively override any calculation with a bigger limit.
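The pragma interaction described here can be sketched as follows. This is an illustration only: `effectiveConcurrency` and the null/negative-value conventions are assumptions for the sketch, not the real QueryPragmas API.

```java
public class PragmaOverride {
    // Sketch: an explicit pragma overrides the calculated concurrency; since
    // the pragma no longer has an upper bound, users can raise it past any
    // calculated value. A null result means "no concurrency limit".
    static Integer effectiveConcurrency(Integer pragma, Integer calculated) {
        if (pragma != null) {
            return pragma < 0 ? null : pragma; // negative pragma: treat as unlimited
        }
        return calculated; // may itself be null, i.e. unlimited
    }

    public static void main(String[] args) {
        System.out.println(effectiveConcurrency(null, 9)); // 9 (calculated wins)
        System.out.println(effectiveConcurrency(100, 9));  // 100 (pragma overrides)
        System.out.println(effectiveConcurrency(-1, 9));   // null (unlimited)
    }
}
```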