ESQL: GROUP BY ALL by leontyevdv · Pull Request #137367 · elastic/elasticsearch

leontyevdv · 2025-10-30T10:44:53Z

The final goal of task #136253 is to output a list of dimensions for the time-series group.

We split this work into three steps:

1. Support bare time-series aggregation functions and output the value of _tsid in the new _timeseries column. Example:

TS k8s
| STATS avg = avg_over_time(network.cost) BY tbucket = TBUCKET(1 hour)

avg:double         | tbucket:datetime         | _timeseries:keyword
7.262931034482759  | 2024-05-10T00:00:00.000Z | KBGrBhmEnziumfgfOq1dn9NyLSzHbdLj0kfy_m-tXh-yxVR6b3E17ss
6.870689655172414  | 2024-05-10T00:00:00.000Z | KBGrBhmEnziumfgfOq1dn9NyLSzHck3A2MQiFrxrlvTQFP7YxbJ0rYg
6.635416666666667  | 2024-05-10T00:00:00.000Z | KBGrBhmEnziumfgfOq1dn9NyLSzHmDfU38IhLHmDYPRNCcqXvvDolM4

2. Load a list of dimensions and output them in the _timeseries column instead of _tsid (here is a doc about it).
3. Optimize dimension loading so that only one doc is loaded per _tsid.

This PR covers step 1: it adds support for bare time-series aggregate functions and adds the _timeseries column to the output.
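To make the semantics of a bare over-time aggregation concrete, here is a minimal Java sketch of what the example query computes conceptually: samples are grouped by the hourly bucket plus the identity of the time series, and each group is averaged. The `Sample` and `Key` types and the `avgOverTime` method are hypothetical illustrations, not the ES|QL implementation.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class BareOverTimeSketch {
    /** One metric sample; "timeseries" stands in for the encoded _tsid (hypothetical model). */
    record Sample(String timeseries, Instant timestamp, double value) {}

    /** Grouping key: the explicit hourly bucket plus the implicit time-series identity. */
    record Key(Instant tbucket, String timeseries) implements Comparable<Key> {
        @Override
        public int compareTo(Key other) {
            int byBucket = tbucket.compareTo(other.tbucket);
            return byBucket != 0 ? byBucket : timeseries.compareTo(other.timeseries);
        }
    }

    /** Averages values per (hour bucket, time series), one output row per group. */
    static Map<Key, Double> avgOverTime(List<Sample> samples) {
        return samples.stream().collect(Collectors.groupingBy(
            s -> new Key(s.timestamp().truncatedTo(ChronoUnit.HOURS), s.timeseries()),
            TreeMap::new,
            Collectors.averagingDouble(Sample::value)));
    }

    public static void main(String[] args) {
        List<Sample> samples = List.of(
            new Sample("series-a", Instant.parse("2024-05-10T00:10:00Z"), 2.0),
            new Sample("series-a", Instant.parse("2024-05-10T00:50:00Z"), 4.0),
            new Sample("series-b", Instant.parse("2024-05-10T00:20:00Z"), 10.0));
        // Prints one line per (tbucket, time series) pair with its average.
        avgOverTime(samples).forEach((key, avg) ->
            System.out.println(avg + " | " + key.tbucket() + " | " + key.timeseries()));
    }
}
```

The point of the sketch is only that the time-series identity becomes part of the grouping key even though the query never names it.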

Part of #136253

elasticsearchmachine · 2025-11-19T17:23:23Z

Pinging @elastic/es-analytical-engine (Team:Analytics)

elasticsearchmachine · 2025-11-19T17:23:23Z

Pinging @elastic/es-storage-engine (Team:StorageEngine)

elasticsearchmachine · 2025-11-19T17:30:50Z

Hi @leontyevdv, I've updated the changelog YAML for you.

dnhatn

The approach looks good. Thanks Dima! I played with the PR and think we can simplify the translation and dimension loading. Can you take a look at this patch to see if I missed anything?

dnhatn · 2025-11-06T00:53:54Z

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/logical/Drop.java

+ * DROP commands are parsed into Drop objects, which the {@link org.elasticsearch.xpack.esql.analysis.Analyzer.ResolveRefs} rule then
+ * rewrites into {@link org.elasticsearch.xpack.esql.plan.logical.local.EsqlProject} plans, along with other projection-like commands.
+ * As such, Drop is neither serializable nor able to be mapped to a corresponding physical plan.
+ */


Can we revert this change? Adding Javadoc is great, but it should be done in a separate change.

dnhatn · 2025-11-19T18:13:19Z

.../src/main/java/org/elasticsearch/xpack/esql/optimizer/PostOptimizationPhasePlanVerifier.java

-            boolean ignoreError = hasProjectAwayColumns || hasLookupJoinExec || hasTextGroupingInTimeSeries;
+
+            // TranslateTimeSeriesAggregate may add a _timeseries attribute into the projection
+            boolean hasTimeseriesReplacingTsid = expectedOutputAttributes.stream()


Can we check TimeSeriesAggregate only? This could slow down simple queries like FROM x when the output has many fields.
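The concern here is doing a per-attribute scan for every query. A minimal sketch of the suggested shape, assuming hypothetical stand-in types (`Plan`, `Source`, `TimeSeriesAgg`) rather than the real ES|QL plan classes, is to test the plan type first and scan the output only when it matches:

```java
import java.util.List;

final class VerifierGuardSketch {
    /** Hypothetical stand-ins for plan nodes; not the real ES|QL classes. */
    interface Plan {
        List<String> output();
    }

    record Source(List<String> output) implements Plan {}

    record TimeSeriesAgg(List<String> output) implements Plan {}

    /** Scans the output columns only when the plan is a time-series aggregate. */
    static boolean hasTimeseriesColumn(Plan plan) {
        // The instanceof test is cheap and short-circuits, so a plain FROM query
        // with many fields never pays for the scan over its output.
        return plan instanceof TimeSeriesAgg && plan.output().contains("_timeseries");
    }

    public static void main(String[] args) {
        System.out.println(hasTimeseriesColumn(new Source(List.of("a", "b", "c"))));
        System.out.println(hasTimeseriesColumn(new TimeSeriesAgg(List.of("avg", "_timeseries"))));
    }
}
```

In a real plan tree the type check would have to look for the aggregate node somewhere in the tree rather than only at the root; the sketch keeps a single node for brevity.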

dnhatn · 2025-11-19T18:18:43Z

.../src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/convert/ToString.java

        Map.entry(GEOHEX, (source, fieldEval) -> new ToStringFromGeoGridEvaluator.Factory(source, fieldEval, GEOHEX)),
-        Map.entry(AGGREGATE_METRIC_DOUBLE, ToStringFromAggregateMetricDoubleEvaluator.Factory::new)
+        Map.entry(AGGREGATE_METRIC_DOUBLE, ToStringFromAggregateMetricDoubleEvaluator.Factory::new),
+        Map.entry(TSID_DATA_TYPE, ToStringFromTsidEvaluator.Factory::new)


Should we support _tsid in Base64 and surrogate this to Base64 instead?

Yes, this will be better. I reverted toString and adjusted Base64 to support _tsid. Thank you!
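For readers unfamiliar with the encoding, the following sketch shows how opaque bytes can be rendered as Base64 text using the JDK's `java.util.Base64`. The choice of the URL-safe, unpadded alphabet is an assumption based on the `_` and `-` characters visible in the sample output in the PR description; the class and method names are hypothetical and this is not the ES|QL function itself.

```java
import java.util.Arrays;
import java.util.Base64;

final class TsidBase64Sketch {
    /** Renders opaque _tsid-like bytes as URL-safe Base64 text without '=' padding. */
    static String render(byte[] tsid) {
        return Base64.getUrlEncoder().withoutPadding().encodeToString(tsid);
    }

    /** Parses the text form back into the original bytes. */
    static byte[] parse(String text) {
        return Base64.getUrlDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] fakeTsid = {0x28, 0x11, (byte) 0xab, 0x06}; // made-up bytes for illustration
        String text = render(fakeTsid);
        System.out.println(text);
        // The round trip recovers the exact bytes.
        System.out.println(Arrays.equals(fakeTsid, parse(text)));
    }
}
```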

dnhatn · 2025-11-19T18:34:46Z

...ck/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/TimeSeriesGroupByAll.java

+
+    public LogicalPlan rule(TimeSeriesAggregate aggregate) {
+        boolean hasTopLevelOverTimeAggs = false;
+        List<NamedExpression> newAggregateFunctions = new ArrayList<>();


nit: initialize the list with the size from aggregate.aggregates()
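A small sketch of what this nit amounts to, using a generic helper (the `copyPresized` name is hypothetical): when the final element count is known up front, it can be passed to the `ArrayList` constructor as the initial capacity.

```java
import java.util.ArrayList;
import java.util.List;

final class PresizedListSketch {
    /** Copies items into a list whose backing array is reserved up front. */
    static <T> List<T> copyPresized(List<T> aggregates) {
        // The known size is used as the initial capacity, so the backing array
        // does not need to grow while elements are added.
        List<T> result = new ArrayList<>(aggregates.size());
        for (T aggregate : aggregates) {
            result.add(aggregate);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(copyPresized(List.of("avg", "rate", "max")));
    }
}
```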

dnhatn · 2025-11-19T18:54:06Z

...ck/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/TimeSeriesGroupByAll.java

+                    newAggregateFunctions.add(new Alias(alias.source(), alias.name(), new Values(tsAgg.source(), tsAgg)));
+                } else {
+                    // Preserve non-time-series aggregates
+                    newAggregateFunctions.add(agg);


I think we should fail if we have mixed aggregates, some with and some without an outer aggregation.

I added a check for this below. If there are mixed aggs, we throw an exception now.
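A minimal sketch of such a check, under the assumption that each aggregate can be classified as bare (an over-time function with no outer aggregation) or wrapped. The `Agg` record, the method name, and the error message are hypothetical and do not reflect the PR's actual code.

```java
import java.util.List;

final class MixedAggregatesSketch {
    /** Hypothetical aggregate: bare means an over-time function with no outer aggregation. */
    record Agg(String name, boolean bare) {}

    /** Throws if bare and wrapped aggregates appear in the same STATS. */
    static void checkNotMixed(List<Agg> aggs) {
        boolean anyBare = aggs.stream().anyMatch(Agg::bare);
        boolean anyWrapped = aggs.stream().anyMatch(agg -> agg.bare() == false);
        if (anyBare && anyWrapped) {
            throw new IllegalArgumentException(
                "cannot mix time-series aggregates with and without an outer aggregation");
        }
    }

    public static void main(String[] args) {
        checkNotMixed(List.of(new Agg("avg_over_time(x)", true), new Agg("rate(y)", true)));
        try {
            checkNotMixed(List.of(new Agg("rate(y)", true), new Agg("max(rate(y))", false)));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```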

leontyevdv · 2025-11-21T11:48:36Z

.../src/main/java/org/elasticsearch/xpack/esql/optimizer/PostOptimizationPhasePlanVerifier.java

            );
-            boolean ignoreError = hasProjectAwayColumns || hasLookupJoinExec || hasTextGroupingInTimeSeries;
+
+            // TranslateTimeSeriesAggregate may add a _timeseries attribute into the projection


We need this because LogicalPlanOptimizer.optimize() (line 123) takes the expected output from the verified plan, where the _timeseries attribute is not yet present. The attribute only appears in the optimized plan.
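The situation described above can be sketched as an output comparison that tolerates exactly one known addition. This is an illustration under stated assumptions: column lists are modelled as plain strings, and `outputChangedUnexpectedly` is a hypothetical helper, not the verifier's real method.

```java
import java.util.ArrayList;
import java.util.List;

final class OutputCheckSketch {
    /**
     * Compares the columns expected before optimization with the optimized plan's columns,
     * ignoring a _timeseries column that a time-series aggregate may have added.
     */
    static boolean outputChangedUnexpectedly(List<String> expected, List<String> optimized, boolean timeSeriesAggregate) {
        if (expected.equals(optimized)) {
            return false;
        }
        if (timeSeriesAggregate) {
            // Drop the column added during optimization, then compare again.
            List<String> withoutTimeseries = new ArrayList<>(optimized);
            withoutTimeseries.remove("_timeseries");
            return expected.equals(withoutTimeseries) == false;
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> expected = List.of("avg", "tbucket");
        List<String> optimized = List.of("avg", "tbucket", "_timeseries");
        System.out.println(outputChangedUnexpectedly(expected, optimized, true));
        System.out.println(outputChangedUnexpectedly(expected, optimized, false));
    }
}
```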

dnhatn

Looks great. Thanks Dima!

dnhatn · 2025-11-22T19:28:32Z

...ck/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/TimeSeriesGroupByAll.java

+        );
+        List<Expression> groupings = new ArrayList<>();
+        groupings.add(timeSeries);
+        groupings.addAll(aggregate.groupings());


@kkrik-es Should we allow or reject TS .. | STATS rate(x) BY cluster? I am okay with either option.

My 2c is that this is a bit confusing. It suggests that we're just grouping by cluster but there's no outer aggregation for it. I understand that it groups by all and then also adds the cluster column but I don't think that's self-evident.

Maybe it's something we should leave out for now, it's something we can easily add later as well. Maybe it makes more sense in combination with an explicit BY DIMENSIONS(), cluster. But I'm skeptical if it's an important use case to support both an implicit and explicit grouping at the same time.

Right, I'd rather keep the semantics simple and reject this. We either provide no grouping attribute, and group by all, or provide one or more grouping attributes along with a reduction function (outer agg).

I added a check that rejects such cases. It allows grouping by a GroupingFunction (Bucket, TBucket, Categorize) but rejects attributes. This applies to bare time-series aggregates only.
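The rule agreed on in this thread can be sketched as follows. The `Grouping`, `GroupingFunction`, and `Attribute` types, the method name, and the error message are hypothetical stand-ins chosen for illustration; the real check lives in the PR's analyzer code and may differ.

```java
import java.util.List;

final class BareGroupingSketch {
    /** Hypothetical stand-ins for grouping expressions. */
    interface Grouping {
        String name();
    }

    record GroupingFunction(String name) implements Grouping {} // e.g. TBUCKET(1 hour)

    record Attribute(String name) implements Grouping {} // e.g. cluster

    /** For bare time-series aggregates, allows grouping functions but rejects attributes. */
    static void checkGroupings(boolean bareTimeSeriesAggregate, List<Grouping> groupings) {
        if (bareTimeSeriesAggregate == false) {
            return; // aggregates with an outer aggregation may group by attributes
        }
        for (Grouping grouping : groupings) {
            if (grouping instanceof Attribute attribute) {
                throw new IllegalArgumentException(
                    "grouping by [" + attribute.name() + "] requires an outer aggregation");
            }
        }
    }

    public static void main(String[] args) {
        checkGroupings(true, List.of(new GroupingFunction("TBUCKET(1 hour)")));
        try {
            checkGroupings(true, List.of(new Attribute("cluster")));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```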

Adds support for bare time-series aggregate functions and a _timeseries column in the output. The _timeseries column contains a Base64 representation of a _tsid value. Part of elastic#136253

Co-authored-by: Mark Tozzi <mark.tozzi@gmail.com>
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>

not-napoleon and others added 30 commits August 27, 2025 10:51

add capability and initial CSV test (WIP)

5b93ed9

tests (that don't pass)

7e15b58

Merge branch 'main' into esql-group-by-all-over-time

6ded62e

tests and tracing notes

90e8144

Merge branch 'main' into esql-group-by-all-over-time

a49979c

added rate agg to tests

60d8343

add test gates

611058f

detect when group by all should kick in

7504f02

add outline for the rest of the steps

9517ba3

why does this throw?

a953df9

load dimension and metric data in the tests

317da80

build the values aggs for the dimensions

b61fedb

work around output changed verification

ffd140d

notes for future me, and others

7733326

draft of analyzer rule

2ddc444

wire up the rule

9867913

many tests passing

f8f523e

Merge branch 'main' into esql-group-by-all-over-time

0a2104b

Conflicts: x-pack/plugin/esql/qa/testFixtures/src/main/resources/k8s-timeseries.csv-spec

failing for a new reason

0719784

[CI] Auto commit changes from spotless

80247dc

misc cleanups

0dea349

Merge branch 'main' into esql-group-by-all-over-time

77213a4

Conflicts: x-pack/plugin/esql/qa/testFixtures/src/main/java/org/elasticsearch/xpack/esql/LoadMapping.java x-pack/plugin/esql/qa/testFixtures/src/main/resources/k8s-timeseries.csv-spec

more testing

aabcf4c

Merge remote-tracking branch 'refs/remotes/not-napoleon/esql-group-by-all-over-time' into esql-group-by-all-over-time

1240a21

forgot to add this test yesterday

09b8df8

[CI] Auto commit changes from spotless

bd38eea

[CI] Update transport version definitions

22e0973

Merge branch 'main' into esql-group-by-all-over-time

97c7f60

Fix unit tests

5e5e915

leontyevdv added 2 commits November 19, 2025 17:30

ESQL: Add GROUP BY ALL

022e6f2

Cleanup Part of elastic#136253

ESQL: Add GROUP BY ALL

f536969

Ignore GroupByAll in GenerativeForkRestTest Part of elastic#136253

leontyevdv marked this pull request as ready for review November 19, 2025 17:22

Update docs/changelog/137367.yaml

93f9169

dnhatn reviewed Nov 20, 2025

dnhatn self-requested a review November 20, 2025 00:17

leontyevdv added 7 commits November 20, 2025 10:59

ESQL: Add GROUP BY ALL

4ce2242

Remove Drop JavaDoc Part of elastic#136253

ESQL: Add GROUP BY ALL

f1f62e3

Address the PR's comments Part of elastic#136253

ESQL: Add GROUP BY ALL

4507c6a

Address the PR's comments Part of elastic#136253

Merge branch 'main' into feature/esql-group-by-all

f426f12

# Conflicts: # x-pack/plugin/esql/qa/testFixtures/src/main/resources/k8s-timeseries.csv-spec

ESQL: Add GROUP BY ALL

c41f05b

Address the PR's comments Part of elastic#136253

ESQL: Add GROUP BY ALL

27848d9

Address the PR's comments Part of elastic#136253

ESQL: Add GROUP BY ALL

1e866d9

Address the PR's comments Part of elastic#136253

leontyevdv commented Nov 21, 2025

Merge branch 'main' into feature/esql-group-by-all

4ff067d

# Conflicts: # x-pack/plugin/esql/qa/server/src/main/java/org/elasticsearch/xpack/esql/qa/rest/generative/GenerativeForkRestTest.java

dnhatn approved these changes Nov 22, 2025

leontyevdv added 2 commits November 24, 2025 09:56

Merge branch 'main' into feature/esql-group-by-all

20952e5

ESQL: Add GROUP BY ALL

57f98f6

Address the PR's comments Part of elastic#136253

kkrik-es reviewed Nov 24, 2025

leontyevdv added 4 commits November 24, 2025 16:24

ESQL: Add GROUP BY ALL

bbddf8a

Remove support of the grouping attributes for the bare time-series aggregates. Part of elastic#136253

Merge branch 'main' into feature/esql-group-by-all

5f8b1dc

ESQL: Add GROUP BY ALL

deeee48

Fix merge. Part of elastic#136253

Merge branch 'main' into feature/esql-group-by-all

520e12b

leontyevdv merged commit ef8a008 into elastic:main Nov 25, 2025
34 checks passed

leontyevdv mentioned this pull request Nov 25, 2025

ESQL: GROUP BY ALL with the dimensions output #138595

Merged

not-napoleon mentioned this pull request Dec 18, 2025

Esql Group top level over time aggs by all dimensions #134397

Closed

leontyevdv merged 66 commits into elastic:main from leontyevdv:feature/esql-group-by-all

6 participants
