Avoid rewrite round_to with expensive queries#135987
Conversation
| return Math.ceilDiv(threshold, clauses); | ||
| } | ||
|
|
||
| static int estimateQueryClauses(QueryBuilder q) { |
There was a problem hiding this comment.
This is a rough estimate - any suggestions are welcome.
There was a problem hiding this comment.
This looks good to me.
Would we also want to handle leaf query builders that target doc value only fields differently then if it targets an indexed field? I guess if that is the case, then that should be for another change.
There was a problem hiding this comment.
+1. We might also need to convert to queries, rewrite them, then estimate.
|
Pinging @elastic/es-analytical-engine (Team:Analytics) |
|
Hi @dnhatn, I've created a changelog YAML for you. |
martijnvg
left a comment
There was a problem hiding this comment.
I left a few questions, but this LGTM!
| if (q == null || q instanceof MatchAllQueryBuilder || q instanceof MatchNoneQueryBuilder) { | ||
| return 0; | ||
| } | ||
| if (q instanceof WildcardQueryBuilder || q instanceof RegexpQueryBuilder || q instanceof PrefixQueryBuilder) { |
There was a problem hiding this comment.
I think we want to add FuzzyQueryBuilder here as well?
Or maybe we should check for MultiTermQueryBuilder? (but that also includes range query builder, which if indexed should count as one, I think?)
| return Math.ceilDiv(threshold, clauses); | ||
| } | ||
|
|
||
| static int estimateQueryClauses(QueryBuilder q) { |
There was a problem hiding this comment.
This looks good to me.
Would we also want to handle leaf query builders that target doc value only fields differently then if it targets an indexed field? I guess if that is the case, then that should be for another change.
| } | ||
| if (q instanceof MultiTermQueryBuilder) { | ||
| return 3; | ||
| } |
There was a problem hiding this comment.
Should we also score phrase queries differently?
...ticsearch/xpack/esql/optimizer/rules/physical/local/ReplaceRoundToWithQueryAndTagsTests.java
Outdated
Show resolved
Hide resolved
… round_to_expensive_queries
|
@martijnvg Thank you so much for the quick review! |
| int clauses = estimateQueryClauses(stats, query) + 1; | ||
| if (indexMode == IndexMode.TIME_SERIES) { | ||
| // No doc partitioning for time_series sources; increase the threshold to trade overhead for parallelism. | ||
| threshold *= 2; |
There was a problem hiding this comment.
Super nit: conside adding constants for these numbers (2, 5 etc).
Today, we use a threshold (defaults to 128) to avoid generating too many sub-queries when replacing round_to with sub-queries. However, we do not account for cases where the main query is expensive. In such cases, running many expensive queries is slower and more costly than running a single query and then reading values and rounding. Our benchmark shows that this query takes 800ms with query-and-tags, but only 40ms without it. TS metric* | WHERE host.name LIKE \"host-*\" AND @timestamp >= \"2025-07-25T12:55:59.000Z\" AND @timestamp <= \"2025-07-25T17:25:59.000Z\" | STATS AVG(AVG_OVER_TIME(`metrics.system.cpu.load_average.1m`)) BY host.name, TBUCKET(5 minutes) And this query: TS new_metrics* | WHERE host.name IN("host-0", "host-1", "host-2") AND @timestamp >= "2025-07-25T12:55:59.000Z" AND @timestamp <= "2025-07-25T17:25:59.000Z" | STATS AVG(AVG_OVER_TIME(`metrics.system.cpu.load_average.1m`)) BY host.name, TBUCKET(5 minutes) reduces from 50ms to 10ms. This change proposes using the threshold as the number of query clauses and assigning higher weights to expensive queries, such as wildcard or prefix queries. This allows us to disable the rewrite when it is less efficient, while still enabling it if the number of sub-queries is small.
💚 Backport successful
|
Today, we use a threshold (defaults to 128) to avoid generating too many sub-queries when replacing round_to with sub-queries. However, we do not account for cases where the main query is expensive. In such cases, running many expensive queries is slower and more costly than running a single query and then reading values and rounding. Our benchmark shows that this query takes 800ms with query-and-tags, but only 40ms without it. TS metric* | WHERE host.name LIKE \"host-*\" AND @timestamp >= \"2025-07-25T12:55:59.000Z\" AND @timestamp <= \"2025-07-25T17:25:59.000Z\" | STATS AVG(AVG_OVER_TIME(`metrics.system.cpu.load_average.1m`)) BY host.name, TBUCKET(5 minutes) And this query: TS new_metrics* | WHERE host.name IN("host-0", "host-1", "host-2") AND @timestamp >= "2025-07-25T12:55:59.000Z" AND @timestamp <= "2025-07-25T17:25:59.000Z" | STATS AVG(AVG_OVER_TIME(`metrics.system.cpu.load_average.1m`)) BY host.name, TBUCKET(5 minutes) reduces from 50ms to 10ms. This change proposes using the threshold as the number of query clauses and assigning higher weights to expensive queries, such as wildcard or prefix queries. This allows us to disable the rewrite when it is less efficient, while still enabling it if the number of sub-queries is small.
Today, we use a threshold (defaults to 128) to avoid generating too many sub-queries when replacing
round_towith sub-queries. However, we do not account for cases where the main query is expensive. In such cases, running many expensive queries is slower and more costly than running a single query and then reading values and rounding. Our benchmark shows that this query takes 800ms with query-and-tags, but only 40ms without it.And this query:
reduces from 50ms to 10ms.
This change proposes using the threshold as the number of query clauses and assigning higher weights to expensive queries, such as wildcard or prefix queries. This allows us to disable the rewrite when it is less efficient, while still enabling it if the number of sub-queries is small.
I consider this a bug and will backport it to 9.2.1.