Ensure ordinal builder emit ordinal blocks by dnhatn · Pull Request #127949 · elastic/elasticsearch

dnhatn · 2025-05-09T02:37:03Z

Currently, if a field has high cardinality, we may mistakenly disable emitting ordinal blocks. For example, with 10,000 tsid values, we never emit ordinal blocks during reads, even though we could emit blocks for 10 tsid values across 1,000 positions. This bug disables optimizations for value aggregation and block hashing.

This change tracks the minimum and maximum seen ordinals and uses them as an estimate for the number of ordinals. However, if a page contains ord=1 and ord=9999, ordinal blocks still won't be emitted. Allocating a bitset or an array for value_count could track this more accurately but would require additional memory. I need to think about this trade off more before opening another PR to fix this issue completely.

This is a quick, contained fix that significantly speeds up time-series aggregation (and other queries too).

The execution time of this query is reduced from 3.4s to 1.9s with 11M documents.

POST /_query
{
    "profile": true,
    "query": "TS metrics-hostmetricsreceiver.otel-default | STATS cpu = avg(avg_over_time(`metrics.system.cpu.load_average.1m`)) BY host.name, BUCKET(@timestamp, 5 minute)"
}

"took": 3475,
"is_partial": false,
"documents_found": 11368089,
"values_loaded": 34248167

"took": 1965,
"is_partial": false,
"documents_found": 11368089,
"values_loaded": 34248167

elasticsearchmachine · 2025-05-09T04:03:09Z

Hi @dnhatn, I've created a changelog YAML for you.

elasticsearchmachine · 2025-05-09T04:05:12Z

Pinging @elastic/es-analytical-engine (Team:Analytics)

martijnvg

LGTM - I think keeping track of min and max ord is an improved heuristic whether to build ordinals block.

kkrik-es · 2025-05-09T08:54:40Z

...ugin/esql/compute/src/main/java/org/elasticsearch/compute/data/SingletonOrdinalsBuilder.java


    BytesRefBlock buildOrdinal() {
-        int valueCount = docValues.getValueCount();
+        int valueCount = maxOrd - minOrd + 1;


Should this be changed? The values are not guaranteed to be dense, iiuc?

Yes, this is the estimate. I will work on on a follow-up to resolve this completely.

elasticsearchmachine · 2025-05-09T16:05:24Z

💚 Backport successful

Status	Branch	Result
✅	8.19
✅	8.18

@timestamp

Currently, if a field has high cardinality, we may mistakenly disable emitting ordinal blocks. For example, with 10,000 `tsid` values, we never emit ordinal blocks during reads, even though we could emit blocks for 10 `tsid` values across 1,000 positions. This bug disables optimizations for value aggregation and block hashing. This change tracks the minimum and maximum seen ordinals and uses them as an estimate for the number of ordinals. However, if a page contains `ord=1` and `ord=9999`, ordinal blocks still won't be emitted. Allocating a bitset or an array for `value_count` could track this more accurately but would require additional memory. I need to think about this trade off more before opening another PR to fix this issue completely. This is a quick, contained fix that significantly speeds up time-series aggregation (and other queries too). The execution time of this query is reduced from 3.4s to 1.9s with 11M documents. ``` POST /_query { "profile": true, "query": "TS metrics-hostmetricsreceiver.otel-default | STATS cpu = avg(avg_over_time(`metrics.system.cpu.load_average.1m`)) BY host.name, BUCKET(@timestamp, 5 minute)" } ``` ``` "took": 3475, "is_partial": false, "documents_found": 11368089, "values_loaded": 34248167 ``` ``` "took": 1965, "is_partial": false, "documents_found": 11368089, "values_loaded": 34248167 ```

@timestamp

Currently, if a field has high cardinality, we may mistakenly disable emitting ordinal blocks. For example, with 10,000 `tsid` values, we never emit ordinal blocks during reads, even though we could emit blocks for 10 `tsid` values across 1,000 positions. This bug disables optimizations for value aggregation and block hashing. This change tracks the minimum and maximum seen ordinals and uses them as an estimate for the number of ordinals. However, if a page contains `ord=1` and `ord=9999`, ordinal blocks still won't be emitted. Allocating a bitset or an array for `value_count` could track this more accurately but would require additional memory. I need to think about this trade off more before opening another PR to fix this issue completely. This is a quick, contained fix that significantly speeds up time-series aggregation (and other queries too). The execution time of this query is reduced from 3.4s to 1.9s with 11M documents. ``` POST /_query { "profile": true, "query": "TS metrics-hostmetricsreceiver.otel-default | STATS cpu = avg(avg_over_time(`metrics.system.cpu.load_average.1m`)) BY host.name, BUCKET(@timestamp, 5 minute)" } ``` ``` "took": 3475, "is_partial": false, "documents_found": 11368089, "values_loaded": 34248167 ``` ``` "took": 1965, "is_partial": false, "documents_found": 11368089, "values_loaded": 34248167 ```

@timestamp

Currently, if a field has high cardinality, we may mistakenly disable emitting ordinal blocks. For example, with 10,000 `tsid` values, we never emit ordinal blocks during reads, even though we could emit blocks for 10 `tsid` values across 1,000 positions. This bug disables optimizations for value aggregation and block hashing. This change tracks the minimum and maximum seen ordinals and uses them as an estimate for the number of ordinals. However, if a page contains `ord=1` and `ord=9999`, ordinal blocks still won't be emitted. Allocating a bitset or an array for `value_count` could track this more accurately but would require additional memory. I need to think about this trade off more before opening another PR to fix this issue completely. This is a quick, contained fix that significantly speeds up time-series aggregation (and other queries too). The execution time of this query is reduced from 3.4s to 1.9s with 11M documents. ``` POST /_query { "profile": true, "query": "TS metrics-hostmetricsreceiver.otel-default | STATS cpu = avg(avg_over_time(`metrics.system.cpu.load_average.1m`)) BY host.name, BUCKET(@timestamp, 5 minute)" } ``` ``` "took": 3475, "is_partial": false, "documents_found": 11368089, "values_loaded": 34248167 ``` ``` "took": 1965, "is_partial": false, "documents_found": 11368089, "values_loaded": 34248167 ```

@timestamp

Currently, if a field has high cardinality, we may mistakenly disable emitting ordinal blocks. For example, with 10,000 `tsid` values, we never emit ordinal blocks during reads, even though we could emit blocks for 10 `tsid` values across 1,000 positions. This bug disables optimizations for value aggregation and block hashing. This change tracks the minimum and maximum seen ordinals and uses them as an estimate for the number of ordinals. However, if a page contains `ord=1` and `ord=9999`, ordinal blocks still won't be emitted. Allocating a bitset or an array for `value_count` could track this more accurately but would require additional memory. I need to think about this trade off more before opening another PR to fix this issue completely. This is a quick, contained fix that significantly speeds up time-series aggregation (and other queries too). The execution time of this query is reduced from 3.4s to 1.9s with 11M documents. ``` POST /_query { "profile": true, "query": "TS metrics-hostmetricsreceiver.otel-default | STATS cpu = avg(avg_over_time(`metrics.system.cpu.load_average.1m`)) BY host.name, BUCKET(@timestamp, 5 minute)" } ``` ``` "took": 3475, "is_partial": false, "documents_found": 11368089, "values_loaded": 34248167 ``` ``` "took": 1965, "is_partial": false, "documents_found": 11368089, "values_loaded": 34248167 ```

@timestamp

Currently, if a field has high cardinality, we may mistakenly disable emitting ordinal blocks. For example, with 10,000 `tsid` values, we never emit ordinal blocks during reads, even though we could emit blocks for 10 `tsid` values across 1,000 positions. This bug disables optimizations for value aggregation and block hashing. This change tracks the minimum and maximum seen ordinals and uses them as an estimate for the number of ordinals. However, if a page contains `ord=1` and `ord=9999`, ordinal blocks still won't be emitted. Allocating a bitset or an array for `value_count` could track this more accurately but would require additional memory. I need to think about this trade off more before opening another PR to fix this issue completely. This is a quick, contained fix that significantly speeds up time-series aggregation (and other queries too). The execution time of this query is reduced from 3.4s to 1.9s with 11M documents. ``` POST /_query { "profile": true, "query": "TS metrics-hostmetricsreceiver.otel-default | STATS cpu = avg(avg_over_time(`metrics.system.cpu.load_average.1m`)) BY host.name, BUCKET(@timestamp, 5 minute)" } ``` ``` "took": 3475, "is_partial": false, "documents_found": 11368089, "values_loaded": 34248167 ``` ``` "took": 1965, "is_partial": false, "documents_found": 11368089, "values_loaded": 34248167 ```

@timestamp

Currently, if a field has high cardinality, we may mistakenly disable emitting ordinal blocks. For example, with 10,000 `tsid` values, we never emit ordinal blocks during reads, even though we could emit blocks for 10 `tsid` values across 1,000 positions. This bug disables optimizations for value aggregation and block hashing. This change tracks the minimum and maximum seen ordinals and uses them as an estimate for the number of ordinals. However, if a page contains `ord=1` and `ord=9999`, ordinal blocks still won't be emitted. Allocating a bitset or an array for `value_count` could track this more accurately but would require additional memory. I need to think about this trade off more before opening another PR to fix this issue completely. This is a quick, contained fix that significantly speeds up time-series aggregation (and other queries too). The execution time of this query is reduced from 3.4s to 1.9s with 11M documents. ``` POST /_query { "profile": true, "query": "TS metrics-hostmetricsreceiver.otel-default | STATS cpu = avg(avg_over_time(`metrics.system.cpu.load_average.1m`)) BY host.name, BUCKET(@timestamp, 5 minute)" } ``` ``` "took": 3475, "is_partial": false, "documents_found": 11368089, "values_loaded": 34248167 ``` ``` "took": 1965, "is_partial": false, "documents_found": 11368089, "values_loaded": 34248167 ```

@timestamp

Currently, if a field has high cardinality, we may mistakenly disable emitting ordinal blocks. For example, with 10,000 `tsid` values, we never emit ordinal blocks during reads, even though we could emit blocks for 10 `tsid` values across 1,000 positions. This bug disables optimizations for value aggregation and block hashing. This change tracks the minimum and maximum seen ordinals and uses them as an estimate for the number of ordinals. However, if a page contains `ord=1` and `ord=9999`, ordinal blocks still won't be emitted. Allocating a bitset or an array for `value_count` could track this more accurately but would require additional memory. I need to think about this trade off more before opening another PR to fix this issue completely. This is a quick, contained fix that significantly speeds up time-series aggregation (and other queries too). The execution time of this query is reduced from 3.4s to 1.9s with 11M documents. ``` POST /_query { "profile": true, "query": "TS metrics-hostmetricsreceiver.otel-default | STATS cpu = avg(avg_over_time(`metrics.system.cpu.load_average.1m`)) BY host.name, BUCKET(@timestamp, 5 minute)" } ``` ``` "took": 3475, "is_partial": false, "documents_found": 11368089, "values_loaded": 34248167 ``` ``` "took": 1965, "is_partial": false, "documents_found": 11368089, "values_loaded": 34248167 ```

@timestamp

Currently, if a field has high cardinality, we may mistakenly disable emitting ordinal blocks. For example, with 10,000 `tsid` values, we never emit ordinal blocks during reads, even though we could emit blocks for 10 `tsid` values across 1,000 positions. This bug disables optimizations for value aggregation and block hashing. This change tracks the minimum and maximum seen ordinals and uses them as an estimate for the number of ordinals. However, if a page contains `ord=1` and `ord=9999`, ordinal blocks still won't be emitted. Allocating a bitset or an array for `value_count` could track this more accurately but would require additional memory. I need to think about this trade off more before opening another PR to fix this issue completely. This is a quick, contained fix that significantly speeds up time-series aggregation (and other queries too). The execution time of this query is reduced from 3.4s to 1.9s with 11M documents. ``` POST /_query { "profile": true, "query": "TS metrics-hostmetricsreceiver.otel-default | STATS cpu = avg(avg_over_time(`metrics.system.cpu.load_average.1m`)) BY host.name, BUCKET(@timestamp, 5 minute)" } ``` ``` "took": 3475, "is_partial": false, "documents_found": 11368089, "values_loaded": 34248167 ``` ``` "took": 1965, "is_partial": false, "documents_found": 11368089, "values_loaded": 34248167 ```

@timestamp

Currently, if a field has high cardinality, we may mistakenly disable emitting ordinal blocks. For example, with 10,000 `tsid` values, we never emit ordinal blocks during reads, even though we could emit blocks for 10 `tsid` values across 1,000 positions. This bug disables optimizations for value aggregation and block hashing. This change tracks the minimum and maximum seen ordinals and uses them as an estimate for the number of ordinals. However, if a page contains `ord=1` and `ord=9999`, ordinal blocks still won't be emitted. Allocating a bitset or an array for `value_count` could track this more accurately but would require additional memory. I need to think about this trade off more before opening another PR to fix this issue completely. This is a quick, contained fix that significantly speeds up time-series aggregation (and other queries too). The execution time of this query is reduced from 3.4s to 1.9s with 11M documents. ``` POST /_query { "profile": true, "query": "TS metrics-hostmetricsreceiver.otel-default | STATS cpu = avg(avg_over_time(`metrics.system.cpu.load_average.1m`)) BY host.name, BUCKET(@timestamp, 5 minute)" } ``` ``` "took": 3475, "is_partial": false, "documents_found": 11368089, "values_loaded": 34248167 ``` ``` "took": 1965, "is_partial": false, "documents_found": 11368089, "values_loaded": 34248167 ```

@timestamp

Currently, if a field has high cardinality, we may mistakenly disable emitting ordinal blocks. For example, with 10,000 `tsid` values, we never emit ordinal blocks during reads, even though we could emit blocks for 10 `tsid` values across 1,000 positions. This bug disables optimizations for value aggregation and block hashing. This change tracks the minimum and maximum seen ordinals and uses them as an estimate for the number of ordinals. However, if a page contains `ord=1` and `ord=9999`, ordinal blocks still won't be emitted. Allocating a bitset or an array for `value_count` could track this more accurately but would require additional memory. I need to think about this trade off more before opening another PR to fix this issue completely. This is a quick, contained fix that significantly speeds up time-series aggregation (and other queries too). The execution time of this query is reduced from 3.4s to 1.9s with 11M documents. ``` POST /_query { "profile": true, "query": "TS metrics-hostmetricsreceiver.otel-default | STATS cpu = avg(avg_over_time(`metrics.system.cpu.load_average.1m`)) BY host.name, BUCKET(@timestamp, 5 minute)" } ``` ``` "took": 3475, "is_partial": false, "documents_found": 11368089, "values_loaded": 34248167 ``` ``` "took": 1965, "is_partial": false, "documents_found": 11368089, "values_loaded": 34248167 ```

@timestamp

Currently, if a field has high cardinality, we may mistakenly disable emitting ordinal blocks. For example, with 10,000 `tsid` values, we never emit ordinal blocks during reads, even though we could emit blocks for 10 `tsid` values across 1,000 positions. This bug disables optimizations for value aggregation and block hashing. This change tracks the minimum and maximum seen ordinals and uses them as an estimate for the number of ordinals. However, if a page contains `ord=1` and `ord=9999`, ordinal blocks still won't be emitted. Allocating a bitset or an array for `value_count` could track this more accurately but would require additional memory. I need to think about this trade off more before opening another PR to fix this issue completely. This is a quick, contained fix that significantly speeds up time-series aggregation (and other queries too). The execution time of this query is reduced from 3.4s to 1.9s with 11M documents. ``` POST /_query { "profile": true, "query": "TS metrics-hostmetricsreceiver.otel-default | STATS cpu = avg(avg_over_time(`metrics.system.cpu.load_average.1m`)) BY host.name, BUCKET(@timestamp, 5 minute)" } ``` ``` "took": 3475, "is_partial": false, "documents_found": 11368089, "values_loaded": 34248167 ``` ``` "took": 1965, "is_partial": false, "documents_found": 11368089, "values_loaded": 34248167 ```

@timestamp

Currently, if a field has high cardinality, we may mistakenly disable emitting ordinal blocks. For example, with 10,000 `tsid` values, we never emit ordinal blocks during reads, even though we could emit blocks for 10 `tsid` values across 1,000 positions. This bug disables optimizations for value aggregation and block hashing. This change tracks the minimum and maximum seen ordinals and uses them as an estimate for the number of ordinals. However, if a page contains `ord=1` and `ord=9999`, ordinal blocks still won't be emitted. Allocating a bitset or an array for `value_count` could track this more accurately but would require additional memory. I need to think about this trade off more before opening another PR to fix this issue completely. This is a quick, contained fix that significantly speeds up time-series aggregation (and other queries too). The execution time of this query is reduced from 3.4s to 1.9s with 11M documents. ``` POST /_query { "profile": true, "query": "TS metrics-hostmetricsreceiver.otel-default | STATS cpu = avg(avg_over_time(`metrics.system.cpu.load_average.1m`)) BY host.name, BUCKET(@timestamp, 5 minute)" } ``` ``` "took": 3475, "is_partial": false, "documents_found": 11368089, "values_loaded": 34248167 ``` ``` "took": 1965, "is_partial": false, "documents_found": 11368089, "values_loaded": 34248167 ```

@timestamp

Currently, if a field has high cardinality, we may mistakenly disable emitting ordinal blocks. For example, with 10,000 `tsid` values, we never emit ordinal blocks during reads, even though we could emit blocks for 10 `tsid` values across 1,000 positions. This bug disables optimizations for value aggregation and block hashing. This change tracks the minimum and maximum seen ordinals and uses them as an estimate for the number of ordinals. However, if a page contains `ord=1` and `ord=9999`, ordinal blocks still won't be emitted. Allocating a bitset or an array for `value_count` could track this more accurately but would require additional memory. I need to think about this trade off more before opening another PR to fix this issue completely. This is a quick, contained fix that significantly speeds up time-series aggregation (and other queries too). The execution time of this query is reduced from 3.4s to 1.9s with 11M documents. ``` POST /_query { "profile": true, "query": "TS metrics-hostmetricsreceiver.otel-default | STATS cpu = avg(avg_over_time(`metrics.system.cpu.load_average.1m`)) BY host.name, BUCKET(@timestamp, 5 minute)" } ``` ``` "took": 3475, "is_partial": false, "documents_found": 11368089, "values_loaded": 34248167 ``` ``` "took": 1965, "is_partial": false, "documents_found": 11368089, "values_loaded": 34248167 ```

@timestamp

Currently, if a field has high cardinality, we may mistakenly disable emitting ordinal blocks. For example, with 10,000 `tsid` values, we never emit ordinal blocks during reads, even though we could emit blocks for 10 `tsid` values across 1,000 positions. This bug disables optimizations for value aggregation and block hashing. This change tracks the minimum and maximum seen ordinals and uses them as an estimate for the number of ordinals. However, if a page contains `ord=1` and `ord=9999`, ordinal blocks still won't be emitted. Allocating a bitset or an array for `value_count` could track this more accurately but would require additional memory. I need to think about this trade off more before opening another PR to fix this issue completely. This is a quick, contained fix that significantly speeds up time-series aggregation (and other queries too). The execution time of this query is reduced from 3.4s to 1.9s with 11M documents. ``` POST /_query { "profile": true, "query": "TS metrics-hostmetricsreceiver.otel-default | STATS cpu = avg(avg_over_time(`metrics.system.cpu.load_average.1m`)) BY host.name, BUCKET(@timestamp, 5 minute)" } ``` ``` "took": 3475, "is_partial": false, "documents_found": 11368089, "values_loaded": 34248167 ``` ``` "took": 1965, "is_partial": false, "documents_found": 11368089, "values_loaded": 34248167 ```

elasticsearchmachine added the v9.1.0 label May 9, 2025

Ensure ordinal builder emit ordinal blocks

b1a5eb1

dnhatn force-pushed the fix-ordinal-reader branch from 5e8a441 to b1a5eb1 Compare May 9, 2025 03:23

dnhatn added v8.19.0 v8.18.2 >bug :Analytics/ES|QL AKA ESQL labels May 9, 2025

dnhatn requested review from kkrik-es, martijnvg and nik9000 May 9, 2025 04:02

Update docs/changelog/127949.yaml

8e8a0a8

dnhatn marked this pull request as ready for review May 9, 2025 04:04

elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label May 9, 2025

dnhatn added 3 commits May 8, 2025 21:07

restore

182300b

fix

350f2b1

minimal

71dee84

martijnvg approved these changes May 9, 2025

View reviewed changes

kkrik-es reviewed May 9, 2025

View reviewed changes

kkrik-es approved these changes May 9, 2025

View reviewed changes

dnhatn mentioned this pull request May 9, 2025

Speed up time-series aggregation #127444

Closed

28 tasks

dnhatn added the auto-backport Automatically create backport pull requests when merged label May 9, 2025

dnhatn merged commit 2ea23da into elastic:main May 9, 2025
17 checks passed

dnhatn deleted the fix-ordinal-reader branch May 9, 2025 16:03

This was referenced May 9, 2025

[8.19] Ensure ordinal builder emit ordinal blocks (#127949) #127979

Merged

[8.18] Ensure ordinal builder emit ordinal blocks (#127949) #127980

Merged

dnhatn added v9.0.2 backport pending labels May 9, 2025

dnhatn mentioned this pull request May 20, 2025

Ensure ordinal builder emit ordinal blocks (#127949) #128168

Merged

dnhatn removed the backport pending label May 20, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Ensure ordinal builder emit ordinal blocks#127949

Ensure ordinal builder emit ordinal blocks#127949
dnhatn merged 5 commits intoelastic:mainfrom
dnhatn:fix-ordinal-reader

dnhatn commented May 9, 2025 •

edited

Loading

elasticsearchmachine commented May 9, 2025

elasticsearchmachine commented May 9, 2025

martijnvg left a comment

kkrik-es May 9, 2025

dnhatn May 9, 2025

Uh oh!

elasticsearchmachine commented May 9, 2025

Labels

4 participants

Conversation

dnhatn commented May 9, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

elasticsearchmachine commented May 9, 2025

elasticsearchmachine commented May 9, 2025

martijnvg left a comment

Choose a reason for hiding this comment

kkrik-es May 9, 2025

Choose a reason for hiding this comment

dnhatn May 9, 2025

Choose a reason for hiding this comment

Uh oh!

elasticsearchmachine commented May 9, 2025

💚 Backport successful

Labels

4 participants

dnhatn commented May 9, 2025 •

edited

Loading