Conversation
Conflicts: x-pack/plugin/esql/qa/testFixtures/src/main/resources/k8s-timeseries.csv-spec
Conflicts: x-pack/plugin/esql/qa/testFixtures/src/main/java/org/elasticsearch/xpack/esql/LoadMapping.java x-pack/plugin/esql/qa/testFixtures/src/main/resources/k8s-timeseries.csv-spec
…-all-over-time' into esql-group-by-all-over-time
# Conflicts: # server/src/main/resources/transport/upper_bounds/8.18.csv # server/src/main/resources/transport/upper_bounds/8.19.csv # server/src/main/resources/transport/upper_bounds/9.0.csv # server/src/main/resources/transport/upper_bounds/9.1.csv # server/src/main/resources/transport/upper_bounds/9.2.csv # server/src/main/resources/transport/upper_bounds/9.3.csv # x-pack/plugin/esql/qa/testFixtures/src/main/resources/k8s-timeseries.csv-spec # x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/rules/logical/TranslateTimeSeriesAggregate.java # x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/logical/Drop.java
Cleanup Part of elastic#136253
Ignore GroupByAll in GenerativeForkRestTest Part of elastic#136253
|
Pinging @elastic/es-analytical-engine (Team:Analytics) |
|
Pinging @elastic/es-storage-engine (Team:StorageEngine) |
|
Hi @leontyevdv, I've updated the changelog YAML for you. |
| * DROP commands are parsed into Drop objects, which the {@link org.elasticsearch.xpack.esql.analysis.Analyzer.ResolveRefs} rule then | ||
| * rewrites into {@link org.elasticsearch.xpack.esql.plan.logical.local.EsqlProject} plans, along with other projection-like commands. | ||
| * As such, Drop is neither serializable nor able to be mapped to a corresponding physical plan. | ||
| */ |
There was a problem hiding this comment.
Can we revert this change? Adding Javadoc is great, but it should be done in a separate change.
| boolean ignoreError = hasProjectAwayColumns || hasLookupJoinExec || hasTextGroupingInTimeSeries; | ||
|
|
||
| // TranslateTimeSeriesAggregate may add a _timeseries attribute into the projection | ||
| boolean hasTimeseriesReplacingTsid = expectedOutputAttributes.stream() |
There was a problem hiding this comment.
Can we check TimeSeriesAggregate only? This could slow down simple queries like FROM x when the output has many fields.
| Map.entry(GEOHEX, (source, fieldEval) -> new ToStringFromGeoGridEvaluator.Factory(source, fieldEval, GEOHEX)), | ||
| Map.entry(AGGREGATE_METRIC_DOUBLE, ToStringFromAggregateMetricDoubleEvaluator.Factory::new) | ||
| Map.entry(AGGREGATE_METRIC_DOUBLE, ToStringFromAggregateMetricDoubleEvaluator.Factory::new), | ||
| Map.entry(TSID_DATA_TYPE, ToStringFromTsidEvaluator.Factory::new) |
There was a problem hiding this comment.
Should we support tsid in Base64 for and surrogate this to Base64 instead?
There was a problem hiding this comment.
Yes, this will be better. I reverted toString and adjusted Base64 to support _tsid. Thank you!
|
|
||
| public LogicalPlan rule(TimeSeriesAggregate aggregate) { | ||
| boolean hasTopLevelOverTimeAggs = false; | ||
| List<NamedExpression> newAggregateFunctions = new ArrayList<>(); |
There was a problem hiding this comment.
nit: init with the size form aggregate.aggregates()
| newAggregateFunctions.add(new Alias(alias.source(), alias.name(), new Values(tsAgg.source(), tsAgg))); | ||
| } else { | ||
| // Preserve non-time-series aggregates | ||
| newAggregateFunctions.add(agg); |
There was a problem hiding this comment.
I think we should fail if we have mixed aggregates: with and without outer aggregation
There was a problem hiding this comment.
I added a check for this below. If there are mixed aggs, we throw an exception now.
Remove Drop JavaDoc Part of elastic#136253
Address the PR's comments Part of elastic#136253
Address the PR's comments Part of elastic#136253
# Conflicts: # x-pack/plugin/esql/qa/testFixtures/src/main/resources/k8s-timeseries.csv-spec
Address the PR's comments Part of elastic#136253
Address the PR's comments Part of elastic#136253
Address the PR's comments Part of elastic#136253
| ); | ||
| boolean ignoreError = hasProjectAwayColumns || hasLookupJoinExec || hasTextGroupingInTimeSeries; | ||
|
|
||
| // TranslateTimeSeriesAggregate may add a _timeseries attribute into the projection |
There was a problem hiding this comment.
We need this because in the LogicalPlanOptimizer.optimize() line 123 we take an expected output from the verified plan and the _timeseries attribute is not there. It appears in the optimized plan.
# Conflicts: # x-pack/plugin/esql/qa/server/src/main/java/org/elasticsearch/xpack/esql/qa/rest/generative/GenerativeForkRestTest.java
x-pack/plugin/esql/qa/testFixtures/src/main/resources/k8s-timeseries.csv-spec
Show resolved
Hide resolved
| ); | ||
| List<Expression> groupings = new ArrayList<>(); | ||
| groupings.add(timeSeries); | ||
| groupings.addAll(aggregate.groupings()); |
There was a problem hiding this comment.
@kkrik-es Should we allow or reject TS .. | STATS rate(x) BY cluster. I am okay with either option.
There was a problem hiding this comment.
My 2c is that this is a bit confusing. It suggests that we're just grouping by cluster but there's no outer aggregation for it. I understand that it groups by all and then also adds the cluster column but I don't think that's self-evident.
Maybe it's something we should leave out for now, it's something we can easily add later as well. Maybe it makes more sense in combination with an explicit BY DIMENSIONS(), cluster. But I'm skeptical if it's an important use case to support both an implicit and explicit grouping at the same time.
There was a problem hiding this comment.
Right, I'd rather keep the semantics simple and reject this. We either provide no grouping attribute, and group by all, or provide one or more grouping attributes along with a reduction function (outer agg).
There was a problem hiding this comment.
I added a check that rejects such cases. It allows to use GroupingFunction (Bucket, TBucket, Categorize) but rejects attributes. This applies to the time-series aggregates only.
Address the PR's comments Part of elastic#136253
x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/analysis/VerifierTests.java
Show resolved
Hide resolved
Remove support of the grouping attributes for the bare time-series aggregates. Part of elastic#136253
Fix merge. Part of elastic#136253
Adds support for bare time-series aggregate functions and a _timeseries column into the output. The _timeseries column contain a BASE64 representation of a _tsid value. Part of elastic#136253 --------- Co-authored-by: Mark Tozzi <mark.tozzi@gmail.com> Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
The final goal of the task #136253 is to output a list of dimensions for the time-series group.
We split this work into the three steps:
Example:
Load a list of dimensions and output them in the _timeseries column instead of _tsid (here is a doc about it).
Make the optimization to load a doc per _tsid.
This PR adds support for bare time-series aggregate functions and the _timeseries column into the output (step 1).
Part of #136253