Skip to content

[ES|QL] Fixes in the bucket function signatures#222553

Merged
stratoula merged 11 commits intoelastic:mainfrom
stratoula:esql-automate-script-cleaunp-bucket
Jun 10, 2025
Merged

[ES|QL] Fixes in the bucket function signatures#222553
stratoula merged 11 commits intoelastic:mainfrom
stratoula:esql-automate-script-cleaunp-bucket

Conversation

@stratoula
Copy link
Contributor

@stratoula stratoula commented Jun 4, 2025

Summary

This PR:

  • Cleanups the buckets hardcoded values and relies on the ES ones
  • Merges the time_duration and time_literal. Actually time_literal is not returned from ES while time_duration is.
  • Fixes some wrong validations in the bucket function such as:
image

Checklist

testErrorsAndWarnings(`row 1 ${timeLiteral.name} + 1 year`, [
`Argument of [+] must be [date], found value [1 year] type [duration]`,
]);
testErrorsAndWarnings(`row 1 ${timeLiteral.name} + 1 year`, []);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the validation check we are sacrificing but it is not an important one

Copy link
Contributor Author

@stratoula stratoula Jun 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also want to raise the importance of these tests. I don't think it is so important to test all these scenarios so possibly a refactoring of the validation tests while re-writing the validation system makes sense @drewdaemon

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++ I've been thinking that getting the test suite in order should be a first step or at least a key outcome.

await expectErrors(
'from index | stats by bucket(dateField, 1 + 30 / 10, concat("", ""), "")',
['Argument of [bucket] must be [date], found value [concat("","")] type [keyword]']
[]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We were complaining wrongly here

@stratoula stratoula added v9.1.0 v8.19.0 backport:version Backport to applied version labels release_note:fix Feature:ES|QL ES|QL related features in Kibana Team:ESQL ES|QL related features in Kibana t// labels Jun 5, 2025
@stratoula stratoula changed the title [ES|QL] Simplifies the buckets signatures Jun 5, 2025
@stratoula stratoula marked this pull request as ready for review June 5, 2025 13:19
@stratoula stratoula requested a review from a team as a code owner June 5, 2025 13:19
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-esql (Team:ESQL)

@stratoula stratoula requested a review from drewdaemon June 5, 2025 13:20
@elastic elastic deleted a comment from elasticmachine Jun 5, 2025
@drewdaemon
Copy link
Contributor

drewdaemon commented Jun 5, 2025

Merges the time_duration and time_literal. Actually time_literal is not returned from ES while time_duration is

It is actually a little complex. Elasticsearch supports two datatypes of "timespan" things (link)

  1. time_duration — used for shorter spans (3 minutes for example)
  2. date_period — used for larger, calendar-aware spans (1 week for example)

Like you say, these are the types we see in the table columns from ES.

Then, there is a syntax in the ES|QL language of a "timespan literal" which is a language construct (e.g. 1 day, 3 hours) that gets translated on the ES side into one of those two types depending on how large the timespan it describes is.

So the question is: how do we deal with all this in our code?

First, we only had time_literal and I think it was roughly supposed to describe the language construct of “timespan literals” while glossing over the Elasticsearch types. This was before Elasticsearch was providing function metadata.

Then, when I looked into it and discovered these two types, I added them to our data types list (see

) because I wanted it to describe every type that we could possibly expect to see in the table from Elasticsearch (though I don't think they are "official" field types in the sense that they can be in index mappings... that's why they're not in my field types array, but instead in data types).

And, these two ES types are now in our function definitions because they are scraped from ES. You can see that the docs for bucket differentiate these

Screenshot 2025-06-05 at 11 47 45 AM

(I actually asked the ES team to explore merging these types. They discussed it and decided to keep them separate because they thought it was useful to be able to differentiate between calendar-aware timespans and simple timespans IIRC.)

So, today, we have three things floating around: time_duration, date_period, and time_literal which functions roughly as a union of the other two. And when we detect the type of timespan literal expressions, we don’t make an effort to differentiate between the two types a timespan literal could be (you can see I left it as an open question:

// TODO — consider whether we need to be worried about
// differentiating between time_duration, and date_period
// instead of just using time_literal
).

So hopefully that gives a bit of context on why things are like they are today.

I'm not really commenting yes or no on the PR, but I wanted to share some relevant context in case it changes anything.

@stratoula
Copy link
Contributor Author

stratoula commented Jun 6, 2025

Thanx Drew for the context! I admit that I found it complicated to navigate. Which means it needs improvement!

Looking at the codebase I realized we dont need all these types. I preferred to keep the ES types rather than 3. Using time_literal means extra mappings in the script and I dont think we need them. I am exploring further cleanups in another PR. The ES types (although not official ones) should be respected, as they are the source of truth for the signatures, otherwise you need hardcoded signatures and mappings to a kibana type. It also makes it hard to read and understand.

I think this change is towards the right direction. It simplified the code, no need for extra mappings, removes the hardcoded buckets and fixes bugs.

Should we merge them further in the future? Maybe! At some point we also introduce this timeInterval. So there is a meaning of unification at the code. But at this time is a bit of spaggheti. And this PR is moving forward I think. It doesn't introduce any bugs either. (unless I am missing something)

For me the most important thing is to remove the hardcoded buckets (and everything hardcoded in this script) while at the same time make our code simpler.

Copy link
Contributor

@drewdaemon drewdaemon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me, @stratoula.

I'm hearing that we'd like to accept the complexity of the two Elasticsearch types and fully remove the time_literal thing as a type. This may mean eventually differentiating the two Elasticsearch types in getExpressionType. As you say, everything doesn't have to be done in this PR.

I don't think time_duration is technically an adequate substitution for time_literal, since time_literal currently subs for both time_duration and date_period.

However, I can't find a regression! My guess is that the changes here work because every argument in the signatures from ES that accepts a time_duration accepts a date_period as well.

timeInterval is an AST node type, so, in my view, it's doing what is expected. It describes the language construct of timespan literals, not a data type.

I think if we remove time_literal, that will remove the extra complexity in our code, leaving only the complexity of the language. This PR is a good step!

@stratoula
Copy link
Contributor Author

Thanx Drew! My thoughts exactly!!

@drewdaemon
Copy link
Contributor

Follow-up task for cleaning up the rest of time_literal!

#223037

@stratoula stratoula enabled auto-merge (squash) June 10, 2025 06:48
@stratoula stratoula merged commit 61299c2 into elastic:main Jun 10, 2025
10 checks passed
@kibanamachine
Copy link
Contributor

Starting backport for target branches: 8.19

https://github.com/elastic/kibana/actions/runs/15553776957

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

Metrics [docs]

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id before after diff
kbnUiSharedDeps-srcJs 3.7MB 3.7MB +12.6KB
@kibanamachine
Copy link
Contributor

💔 All backports failed

Status Branch Result
8.19 Backport failed because of merge conflicts

You might need to backport the following PRs to 8.19:
- [ES|QL] Add validations and suggestions for COMPLETION (#222533)

Manual backport

To create the backport manually run:

node scripts/backport --pr 222553

Questions ?

Please refer to the Backport tool documentation

stratoula added a commit to stratoula/kibana that referenced this pull request Jun 10, 2025
## Summary

This PR:

- Cleanups the buckets hardcoded values and relies on the ES ones
- Merges the time_duration and time_literal. Actually time_literal is
not returned from ES while time_duration is.
- Fixes some wrong validations in the bucket function such as:

<img width="1262" alt="image"
src="https://github.com/user-attachments/assets/f08eaa27-296f-4428-b501-389435f41b79"
/>

### Checklist
- [ ] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios

(cherry picked from commit 61299c2)

# Conflicts:
#	src/platform/packages/shared/kbn-esql-validation-autocomplete/scripts/generate_function_definitions.ts
@stratoula
Copy link
Contributor Author

💚 All backports created successfully

Status Branch Result
8.19

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

stratoula added a commit that referenced this pull request Jun 10, 2025
…3223)

# Backport

This will backport the following commits from `main` to `8.19`:
- [[ES|QL] Fixes in the bucket function signatures
(#222553)](#222553)

<!--- Backport version: 10.0.0 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sorenlouv/backport)

<!--BACKPORT [{"author":{"name":"Stratoula
Kalafateli","email":"efstratia.kalafateli@elastic.co"},"sourceCommit":{"committedDate":"2025-06-10T07:53:11Z","message":"[ES|QL]
Fixes in the bucket function signatures (#222553)\n\n## Summary\n\nThis
PR:\n\n- Cleanups the buckets hardcoded values and relies on the ES
ones\n- Merges the time_duration and time_literal. Actually time_literal
is\nnot returned from ES while time_duration is.\n- Fixes some wrong
validations in the bucket function such as:\n\n<img width=\"1262\"
alt=\"image\"\nsrc=\"https://github.com/user-attachments/assets/f08eaa27-296f-4428-b501-389435f41b79\"\n/>\n\n\n###
Checklist\n- [ ] [Unit or
functional\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\nwere
updated or added to match the most common
scenarios","sha":"61299c2eb78cb0c587ffb490dd9558a450cde49f","branchLabelMapping":{"^v9.1.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:fix","Feature:ES|QL","Team:ESQL","backport:version","v9.1.0","v8.19.0"],"title":"[ES|QL]
Fixes in the bucket function
signatures","number":222553,"url":"https://github.com/elastic/kibana/pull/222553","mergeCommit":{"message":"[ES|QL]
Fixes in the bucket function signatures (#222553)\n\n## Summary\n\nThis
PR:\n\n- Cleanups the buckets hardcoded values and relies on the ES
ones\n- Merges the time_duration and time_literal. Actually time_literal
is\nnot returned from ES while time_duration is.\n- Fixes some wrong
validations in the bucket function such as:\n\n<img width=\"1262\"
alt=\"image\"\nsrc=\"https://github.com/user-attachments/assets/f08eaa27-296f-4428-b501-389435f41b79\"\n/>\n\n\n###
Checklist\n- [ ] [Unit or
functional\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\nwere
updated or added to match the most common
scenarios","sha":"61299c2eb78cb0c587ffb490dd9558a450cde49f"}},"sourceBranch":"main","suggestedTargetBranches":["8.19"],"targetPullRequestStates":[{"branch":"main","label":"v9.1.0","branchLabelMappingKey":"^v9.1.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/222553","number":222553,"mergeCommit":{"message":"[ES|QL]
Fixes in the bucket function signatures (#222553)\n\n## Summary\n\nThis
PR:\n\n- Cleanups the buckets hardcoded values and relies on the ES
ones\n- Merges the time_duration and time_literal. Actually time_literal
is\nnot returned from ES while time_duration is.\n- Fixes some wrong
validations in the bucket function such as:\n\n<img width=\"1262\"
alt=\"image\"\nsrc=\"https://github.com/user-attachments/assets/f08eaa27-296f-4428-b501-389435f41b79\"\n/>\n\n\n###
Checklist\n- [ ] [Unit or
functional\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\nwere
updated or added to match the most common
scenarios","sha":"61299c2eb78cb0c587ffb490dd9558a450cde49f"}},{"branch":"8.19","label":"v8.19.0","branchLabelMappingKey":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"}]}]
BACKPORT-->
pmuellr pushed a commit to pmuellr/kibana that referenced this pull request Jun 11, 2025
## Summary

This PR:

- Cleanups the buckets hardcoded values and relies on the ES ones
- Merges the time_duration and time_literal. Actually time_literal is
not returned from ES while time_duration is.
- Fixes some wrong validations in the bucket function such as:

<img width="1262" alt="image"
src="https://github.com/user-attachments/assets/f08eaa27-296f-4428-b501-389435f41b79"
/>


### Checklist
- [ ] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
nickpeihl pushed a commit to nickpeihl/kibana that referenced this pull request Jun 12, 2025
## Summary

This PR:

- Cleanups the buckets hardcoded values and relies on the ES ones
- Merges the time_duration and time_literal. Actually time_literal is
not returned from ES while time_duration is.
- Fixes some wrong validations in the bucket function such as:

<img width="1262" alt="image"
src="https://github.com/user-attachments/assets/f08eaa27-296f-4428-b501-389435f41b79"
/>


### Checklist
- [ ] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport:version Backport to applied version labels Feature:ES|QL ES|QL related features in Kibana release_note:fix Team:ESQL ES|QL related features in Kibana t// v8.19.0 v9.1.0

4 participants