Use a single array for buffering rate data points #140855
dnhatn merged 11 commits into elastic:main
Conversation
Buildkite benchmark this with tsdb-metricsgen-240m-highcardinality please
💚 Build Succeeded
This build ran two tsdb-metricsgen-240m-highcardinality benchmarks to evaluate the performance impact of this PR.
Hi @dnhatn, I've created a changelog YAML for you.
Pinging @elastic/es-storage-engine (Team:StorageEngine)
final void prepareSlicesOnly(int groupId, long firstTimestamp) {
    if (valueCount > 0 && sliceGroupIds.get(sliceCount - 1) == groupId) {
Naive question: is there an issue if sliceCount is 0?
I see, it gets incremented on the first value. Maybe an assert would help, just a nit.
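As a sketch of the invariant discussed above (hypothetical and simplified, not the real class: the names `sliceGroupIds`, `sliceCount`, and `valueCount` follow the snippet, but storage is reduced to a plain list):

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the slice bookkeeping: a new slice opens whenever the
// incoming group differs from the group of the most recent slice.
class SliceBuffer {
    final List<Integer> sliceGroupIds = new ArrayList<>();
    int sliceCount = 0;
    int valueCount = 0;

    void append(int groupId) {
        // valueCount > 0 implies sliceCount > 0, so get(sliceCount - 1) below
        // is safe; the assert documents that invariant, as suggested in review.
        assert valueCount == 0 || sliceCount > 0 : "values imply at least one slice";
        if (valueCount == 0 || sliceGroupIds.get(sliceCount - 1) != groupId) {
            sliceGroupIds.add(groupId); // first value, or group changed: new slice
            sliceCount++;
        }
        valueCount++; // timestamp/value storage omitted in this sketch
    }
}
```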
kkrik-es left a comment:
Good job, Nhat. This should be faster, and it'll show up more once we reduce the other overheads.
…pute/aggregation/AbstractRateGroupingFunction.java Co-authored-by: Kostas Krikellas <131142368+kkrik-es@users.noreply.github.com>
…pute/aggregation/AbstractRateGroupingFunction.java Co-authored-by: Jonas Kunz <j+github@kunzj.de>
@kkrik-es @JonasKunz Thanks for the review!
@kkrik-es @JonasKunz I did another round of benchmarking. I think this change reduced the query time of
This change switches to a single array for buffering data points in rate aggregation, providing several benefits:
1. Minimal memory waste: Previously, each group over-allocated 1/8 or an extra page of the required memory to minimize allocations. For example, with 10,000 time series and 30 time-buckets each, this could reserve an extra 10,000 * 30 * 16,384 = 4,915,200,000 bytes (~4.9GB), and with 8 concurrent drivers, up to 40GB of extra memory.
2. Fewer objects, making it more GC-friendly.
3. Easier future disk spill: With a single array, spilling the initial pages to disk is easier to implement.
This approach may result in more "slices" than before, since slices cannot be combined across segments. However, the slice merging process has been improved to reduce this overhead.
Performance tests with 100 hosts (270 million data points) show a 10% reduction in rate_1h response time (from 420ms to 380ms), though performance was not the primary goal of this change.
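The arithmetic in point 1 can be checked directly (a quick sketch; 16,384 bytes is the per-group extra-page figure taken from the description above):

```java
public class OverAllocationMath {
    public static void main(String[] args) {
        long series = 10_000;        // number of time series (groups)
        long buckets = 30;           // time-buckets per series
        long extraPerGroup = 16_384; // extra bytes over-allocated per group
        long extra = series * buckets * extraPerGroup;
        System.out.println(extra);     // 4915200000 bytes, ~4.9 GB
        System.out.println(extra * 8); // 39321600000 bytes across 8 drivers, ~40 GB
    }
}
```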
💚 All backports created successfully
Questions? Please refer to the Backport tool documentation
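To illustrate the single-array idea (a hypothetical simplification, not the actual Elasticsearch implementation): every group appends into one shared growable buffer, so growth over-allocation is paid once for the whole buffer rather than once per group:

```java
import java.util.Arrays;

// Hypothetical shared buffer: all groups' (groupId, timestamp, value) points
// go into parallel arrays that grow together, amortizing over-allocation.
class SharedRateBuffer {
    int[] groupIds = new int[16];
    long[] timestamps = new long[16];
    double[] values = new double[16];
    int count = 0;

    void append(int groupId, long timestamp, double value) {
        if (count == groupIds.length) {
            int newLen = groupIds.length + (groupIds.length >> 1); // grow ~1.5x
            groupIds = Arrays.copyOf(groupIds, newLen);
            timestamps = Arrays.copyOf(timestamps, newLen);
            values = Arrays.copyOf(values, newLen);
        }
        groupIds[count] = groupId;
        timestamps[count] = timestamp;
        values[count] = value;
        count++;
    }
}
```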