ES|QL: Push down COUNT(*) BY DATE_TRUNC by GalLalouche · Pull Request #138023 · elastic/elasticsearch

GalLalouche · 2025-11-13T14:56:38Z

Pushes down count(*) by round_to to Lucene.
Example query:

FROM employees
| STATS COUNT(*) BY DATE_TRUNC(1 YEAR, hire_date)

This is actually a culmination of several rules:

1. ReplaceDateTruncBucketWithRoundTo: replaces the DATE_TRUNC with a ROUND_TO.
2. ReplaceRoundToWithQueryAndTags: replaces the ROUND_TO with query and tags.
3. PushCountQueryAndTagsToSource (this PR): pushes the aggregation down to Lucene.

Note that a query with a filter is not yet supported; that will be done in a follow-up PR.

FROM employees
| STATS COUNT(*) WHERE hire_date > "1985-01-01" BY d=DATE_TRUNC(1 YEAR, hire_date)
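The three rewrites listed in the description can be pictured with a small, self-contained Java sketch. It is a conceptual model only, under stated assumptions: the class, record, and method names are hypothetical and do not come from the Elasticsearch code base, the year bounds are assumed to be known up front, and an in-memory array stands in for the per-query counts that Lucene would provide. Dropping empty ranges reflects how I read the "filter(>0)" mentioned later in the review.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Conceptual sketch only; every name here is hypothetical and not taken from Elasticsearch.
final class CountByDateTruncSketch {

    // A range [from, to) over epoch millis plus the bucket key ("tag") it stands for.
    record TaggedRange(long from, long to, long tag) {}

    static long millis(int year, int month, int day) {
        return LocalDate.of(year, month, day).atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli();
    }

    // Step 1: with known bounds, DATE_TRUNC(1 YEAR, x) has a finite set of results (rounding points).
    static List<Long> roundingPoints(int minYear, int maxYear) {
        List<Long> points = new ArrayList<>();
        for (int year = minYear; year <= maxYear; year++) {
            points.add(millis(year, 1, 1));
        }
        return points;
    }

    // Step 2: each rounding point becomes one range query tagged with that point.
    static List<TaggedRange> toTaggedRanges(List<Long> points) {
        List<TaggedRange> ranges = new ArrayList<>();
        for (int i = 0; i < points.size(); i++) {
            long to = i + 1 < points.size() ? points.get(i + 1) : Long.MAX_VALUE;
            ranges.add(new TaggedRange(points.get(i), to, points.get(i)));
        }
        return ranges;
    }

    // Step 3: one count per range; ranges that match nothing are dropped (count > 0),
    // because a grouped COUNT(*) does not emit a group that has no rows.
    static Map<Long, Long> countByTag(List<TaggedRange> ranges, long[] values) {
        Map<Long, Long> counts = new TreeMap<>();
        for (TaggedRange range : ranges) {
            long count = 0;
            for (long value : values) {
                if (value >= range.from() && value < range.to()) {
                    count++;
                }
            }
            if (count > 0) {
                counts.put(range.tag(), count);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        long[] hireDates = { millis(1985, 3, 1), millis(1985, 11, 20), millis(1987, 6, 15) };
        Map<Long, Long> counts = countByTag(toTaggedRanges(roundingPoints(1985, 1987)), hireDates);
        // Two buckets are printed (1985 and 1987); the empty 1986 bucket is dropped.
        counts.forEach((tag, count) -> System.out.println(tag + " -> " + count));
    }
}
```

The point of the sketch is that once the grouping key is reduced to a fixed set of tagged ranges, each bucket's count becomes an independent "how many documents match this query" question, which is the kind of question an index can answer without materializing rows.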

GalLalouche · 2025-11-13T14:57:51Z

...k/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/physical/EsStatsQueryExec.java

    private final Expression limit;
    private final List<Attribute> attrs;
-    private final List<Stat> stats;
+    private final Stat stat;


I've refactored this since we don't support multiple aggregates right now anyway. When we do, we can turn this back into a list.

GalLalouche · 2025-11-13T15:14:10Z

.../java/org/elasticsearch/xpack/esql/optimizer/rules/physical/local/SubtituteRoundToTests.java

-public class ReplaceRoundToWithQueryAndTagsTests extends AbstractLocalPhysicalPlanOptimizerTests {
-
-    public ReplaceRoundToWithQueryAndTagsTests(String name, Configuration config) {
+public class SubtituteRoundToTests extends AbstractLocalPhysicalPlanOptimizerTests {


Renamed since it now tests for both rewrites in the same rule batch.

I've also refactored this to reduce some of the duplication.

This could have really benefited from the planned golden test feature!

fang-xing-esql · 2025-11-18T23:16:40Z

Can we add some tests for INLINE STATS with count + date_histogram as well, if there aren't any yet? Having some CsvTests for them would be great, just to make sure the newly added filter (> 0) does not cause trouble for the inline join after the aggregation.

FORK and subqueries may also have aggregations inside their branches; some additional tests for them would give us extra confidence.

I've added 3 spec tests (inline stats, fork, subquery) and two plan tests to SubstituteRoundToTests (fork and inline stats; as discussed offline, we can't test inline stats right now).


elasticsearchmachine · 2025-11-14T13:26:58Z

Pinging @elastic/es-analytical-engine (Team:Analytics)

elasticsearchmachine · 2025-11-14T13:27:21Z

Hi @GalLalouche, I've created a changelog YAML for you.

fang-xing-esql

Thank you @GalLalouche, the new rule makes sense to me. I added comments about additional tests; they will give us extra confidence in this change.

I'm just curious if there are any early performance results of this change yet? It will be really exciting to see the improvements.

I'll leave the review of the changes in operators to Nik.

...ugin/esql/qa/server/src/main/java/org/elasticsearch/xpack/esql/qa/rest/EsqlSpecTestCase.java

...g/elasticsearch/xpack/esql/optimizer/rules/physical/local/PushCountQueryAndTagsToSource.java


fang-xing-esql · 2025-11-18T23:31:48Z

...k/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/physical/EsStatsQueryExec.java

+        public List<ElementType> tagTypes() {
+            return List.of(switch (queryBuilderAndTags.getFirst().tags().getFirst()) {
+                case Integer i -> ElementType.INT;
+                case Long l -> ElementType.LONG;


Is there a particular reason that double is not supported?

Because we only support COUNT pushdown at the moment. I could simplify this by removing StatsType altogether so that it's clearer, but I wanted to avoid overly complicating this PR.
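For readers unfamiliar with the pattern in the quoted hunk, here is a minimal sketch of deriving a single element type from the first tag. It is an illustration under assumptions, not the PR's code: `ElementKind` is a hypothetical stand-in for `ElementType`, the sketch uses `instanceof` checks instead of the pattern-matching switch shown in the diff, and rejecting other tag types with an exception is my assumption, since the rest of the switch is not visible in the excerpt.

```java
import java.util.List;

// Hypothetical sketch; ElementKind stands in for ElementType and is not the real enum.
final class TagTypeSketch {

    enum ElementKind { INT, LONG }

    // Looks only at the first tag, on the assumption that all tags share one type.
    static ElementKind tagType(List<?> tags) {
        if (tags.isEmpty()) {
            throw new IllegalArgumentException("no tags");
        }
        Object first = tags.get(0);
        if (first instanceof Integer) {
            return ElementKind.INT;
        }
        if (first instanceof Long) {
            return ElementKind.LONG;
        }
        // Assumption: anything else (for example Double) is treated as unsupported here.
        throw new IllegalArgumentException("unsupported tag type: " + first.getClass().getSimpleName());
    }

    public static void main(String[] args) {
        System.out.println(tagType(List.of(1L, 2L)));
        System.out.println(tagType(List.of(1, 2)));
    }
}
```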


GalLalouche · 2025-11-24T15:03:05Z

Thanks for the review @fang-xing-esql! I also added logic and a test to verify that we don't push down ES Filters either (or any filter, for that matter), after @nik9000 mentioned that it's another edge case we'd best avoid at the moment.

fang-xing-esql

Thank you @GalLalouche! LGTM, only a couple of minor comments.

...java/org/elasticsearch/xpack/esql/optimizer/rules/physical/local/SubstituteRoundToTests.java

...ugin/esql/qa/server/src/main/java/org/elasticsearch/xpack/esql/qa/rest/EsqlSpecTestCase.java


This PR adds filter support for pushing down `COUNT(*) BY DATE_TRUNC` introduced in #138023, i.e.,

FROM idx
| WHERE date > NOW() - 90 DAYS
| STATS COUNT(*) BY DATE_TRUNC(1 DAY, date)

Note that we still won't push down queries where the filter is *on* the `COUNT(*)`, since that case is a lot more complicated due to having to keep zeros for filtered buckets:

FROM idx
| STATS COUNT(*) WHERE date > NOW() - 90 DAYS BY DATE_TRUNC(1 DAY, date)
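The difference between the two query shapes above can be sketched in plain Java. This is a toy model under assumptions, not engine code: the class and method names are hypothetical, dates are small integers, and a bucket width of 10 stands in for `DATE_TRUNC(1 DAY, date)`. It follows the statement above that a filter on the `COUNT(*)` has to keep zeros for buckets whose rows are all filtered out.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model; names and the bucket width are hypothetical.
final class FilteredCountSketch {

    // Stand-in for DATE_TRUNC: round down to a multiple of 10.
    static long bucket(long date) {
        return Math.floorDiv(date, 10L) * 10L;
    }

    // Models: WHERE date > cutoff | STATS COUNT(*) BY bucket
    // Rows are removed first, so a bucket with no surviving rows never appears.
    static Map<Long, Long> filterThenCount(long[] dates, long cutoff) {
        Map<Long, Long> counts = new TreeMap<>();
        for (long date : dates) {
            if (date > cutoff) {
                counts.merge(bucket(date), 1L, Long::sum);
            }
        }
        return counts;
    }

    // Models: STATS COUNT(*) WHERE date > cutoff BY bucket
    // Every bucket present in the data is emitted; filtered-out rows contribute 0.
    static Map<Long, Long> countWithFilter(long[] dates, long cutoff) {
        Map<Long, Long> counts = new TreeMap<>();
        for (long date : dates) {
            counts.merge(bucket(date), date > cutoff ? 1L : 0L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        long[] dates = { 5, 8, 20, 25 };
        System.out.println(filterThenCount(dates, 10)); // prints {20=2}
        System.out.println(countWithFilter(dates, 10)); // prints {0=0, 20=2}
    }
}
```

In the first shape an empty bucket and a fully filtered bucket look the same (both are absent), so dropping zero counts is safe. In the second shape the bucket `0` must still be reported with a count of zero, which a plain "count per range, drop empties" pushdown cannot distinguish from a bucket that never existed.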

This PR adds golden tests to ES|QL!
● I added a GoldenTestsReadme.md with instructions on how to add new golden tests.
● I tried to keep this (relatively) minimal for the first PR. In the future we can add more features, configurations, stages (e.g., local node_reduce), etc.
● As a proof of concept, I've re-implemented (most) of the tests in SubstituteRoundToTests using the golden testing framework.
  ○ When I recently worked on these tests (#138023, #138765), the majority of my time was spent mechanically fixing the expected output in the tests.
  ○ This was a good use case since I needed to add more support for test nesting due to the loops in the original tests.
  ○ While more text is produced when using golden tests (70k lines!), fixing the expected output is a lot simpler, since you can just bulldoze the output! In this case, eyeballing the difference is enough to be convinced that the change is correct, so the maintenance cost goes down.
  ○ One of the tests (testRoundToWithTimeSeriesIndices) was skipped because its output isn't actually consistent across runs. The original test only verifies a very specific subset of the AST, so it's not a good candidate for golden tests.


GalLalouche added 11 commits November 3, 2025 20:44

TEMP

476ae33

Add spec test

41e45dd

TEMP

6575c6e

more temp

567f9b0

More temp, fix output type with hack

450837d

temp test passes kinda

5ba5ea5

All IT tests pass

d2dba94

Extract to another rewrite, fix tests

3a10875

added tests, fixed more tests

663a28c

Ready for draft PR!

bc2b7a1

[ESQL] Push count_by round to source

63dd34d

elasticsearchmachine added the v9.3.0 label Nov 13, 2025


GalLalouche added 2 commits November 13, 2025 21:28

Fix borken tests

f4478f7

Merge branch 'feature/count_by_trunc' of github.com:GalLalouche/elast…

3adf49a

…icsearch into feature/count_by_trunc

GalLalouche mentioned this pull request Nov 13, 2025

KMeansLocalTests.testComputeNeighbours: <0.48484848484848486> was less than <0.5> #138064

Closed

GalLalouche marked this pull request as ready for review November 14, 2025 13:13

GalLalouche requested a review from nik9000 November 14, 2025 13:13

elasticsearchmachine added the needs:triage label Nov 14, 2025

Merge branch 'main' into feature/count_by_trunc

53ac28e

GalLalouche added the >feature, Team:Analytics, and :Analytics/ES|QL labels Nov 14, 2025

elasticsearchmachine removed the needs:triage label Nov 14, 2025

Update docs/changelog/138023.yaml

ec245fc

Merge branch 'main' into feature/count_by_trunc

59e09f8

GalLalouche requested a review from fang-xing-esql November 17, 2025 14:27

Rename test class to fix typo

676a397

fang-xing-esql reviewed Nov 18, 2025


GalLalouche added 6 commits November 19, 2025 13:44

Fix EsFilter

1e3b053

TEMP

dd09bfa

Merge branch 'feature/count_by_trunc' of github.com:GalLalouche/elast…

55ee275

…icsearch into feature/count_by_trunc

Added some specs

d63c07c

Add more goldenish tests

25c46f4

Merge branch 'main' into feature/count_by_trunc

4ad4ead

GalLalouche requested a review from fang-xing-esql November 24, 2025 15:02

fang-xing-esql approved these changes Nov 24, 2025


GalLalouche added 2 commits November 25, 2025 11:29

Final CR fixes

e2017bb

Merge branch 'main' into feature/count_by_trunc

ee152bd

GalLalouche changed the title ~~ES|QL: Push down count~~ Nov 25, 2025

GalLalouche added 2 commits November 25, 2025 11:52

Update docs

66947bd

Merge branch 'feature/count_by_trunc' of github.com:GalLalouche/elast…

00ad7a4

…icsearch into feature/count_by_trunc

GalLalouche merged commit 96e799a into elastic:main Nov 25, 2025
34 checks passed

GalLalouche mentioned this pull request Nov 25, 2025

ESQL: Unmute and fix missing cabability in SubstituteRoundToTests #138616

Merged

GalLalouche added a commit that referenced this pull request Nov 26, 2025

ESQL: Unmute and fix missing cabability in SubstituteRoundToTests (#1…

9df6098

…38616) Add missing capability to test added in #138023. Resolves: #138601.

GalLalouche mentioned this pull request Dec 1, 2025

Feature/count by trunc with filter #138765

Merged

GalLalouche mentioned this pull request Dec 16, 2025

ESQL: Golden tests #139598

Merged

GalLalouche merged 27 commits into elastic:main from GalLalouche:feature/count_by_trunc
