[ES|QL] Making the Expression Suggestor More Semantically Intelligent #241081

Merged
bartoval merged 6 commits into elastic:main from bartoval:improve_suggestions
Oct 30, 2025

@bartoval bartoval commented Oct 29, 2025

Summary

#239507

This PR has multiple purposes:

  • Improve the quality of context-based suggestions, especially for functions with multiple, conditional, and homogeneous signatures (CASE, COALESCE, BUCKET).
  • Improve the prediction of upcoming fields based on signature combinations (already present, now improved).
  • Resolve ambiguities from earlier versions that caused errors in queries (such as suggesting integer fields where a constant integer is required).
  • Strengthen the suggest-for-expression mechanism.
  • Improve the quality of suggestion tests.

Important notes:

  • We've currently limited TEXT/KEYWORD suggestions (only inside functions) to operators like is (not) null, like, etc., removing comparison operators. This point needs some clarification, because in theory it does make sense to suggest comparison operators.
    • STATS doesn't have a location for those operators either, so, for example, STATS COUNT_DISTINCT(agent only shows the comma and not operators like is (not) null.
  • We now distinguish between numeric fields (long, double) and constants, which means that in some cases we don't suggest anything. For example, ROUND_TO has an optional precision parameter that must be a numeric constant. Previously, we also suggested numeric fields (like bytes), which made the query invalid. This case can work well with PR [ES|QL] show function signature while typing #180528.

Some examples

Constant-only and variadic tests
FROM kibana_sample_data_logs | EVAL ROUND_TO(bytes, // we suggest nothing because we expect a numeric constant
FROM kibana_sample_data_logs | EVAL ROUND_TO(bytes, 3 // we know that this function has at most 2 parameters, so we suggest only math operators, because they are the only ones that make sense
FROM kibana_sample_data_logs | EVAL CONCAT(agent, agent.keywords // CONCAT requires at least 2 parameters but can take more, so we suggest a comma
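The constant-only and variadic rules above could be sketched roughly as follows. This is only an illustration under assumptions: the `ParamDef`, `SignatureDef`, and `FieldDef` shapes, the `constantOnly` and `variadic` flags, and both helper functions are hypothetical names invented for this example and are not taken from the Kibana code base.

```typescript
// Hypothetical, simplified shapes; not the actual Kibana types.
type ValueType = 'numeric' | 'string';
interface ParamDef { type: ValueType; constantOnly?: boolean }
interface SignatureDef { params: ParamDef[]; variadic?: boolean }
interface FieldDef { name: string; type: ValueType }

// Fields to offer right after a comma, at the start of parameter `index`.
function suggestFieldsAt(sig: SignatureDef, index: number, fields: FieldDef[]): string[] {
  const last = sig.params.length - 1;
  // A variadic signature repeats its last parameter for any extra position.
  const param: ParamDef | undefined =
    index <= last ? sig.params[index] : sig.variadic ? sig.params[last] : undefined;
  if (param === undefined) return []; // past the last allowed parameter
  if (param.constantOnly) return []; // a literal is required, so no fields
  return fields.filter((f) => f.type === param.type).map((f) => f.name);
}

// Whether a comma makes sense after a complete argument at position `index`.
function canSuggestComma(sig: SignatureDef, index: number): boolean {
  return sig.variadic === true || index < sig.params.length - 1;
}

// Illustrative signatures modeled on the ROUND_TO and CONCAT examples above.
const roundLike: SignatureDef = {
  params: [{ type: 'numeric' }, { type: 'numeric', constantOnly: true }],
};
const concatLike: SignatureDef = {
  params: [{ type: 'string' }, { type: 'string' }],
  variadic: true,
};
const sampleFields: FieldDef[] = [
  { name: 'bytes', type: 'numeric' },
  { name: 'agent', type: 'string' },
];
```

With these definitions, `suggestFieldsAt(roundLike, 1, sampleFields)` returns an empty list (the second parameter accepts only a constant), `canSuggestComma(roundLike, 1)` is false, and `canSuggestComma(concatLike, 1)` is true.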

Homogeneity tests
FROM kibana_sample_data_logs | EVAL COALESCE(bytes // Here we suggest numeric operators, the comma, and comparison operators (because the expression can be turned into a boolean)

FROM kibana_sample_data_logs | EVAL COALESCE(bytes, // After the comma, we know that the subsequent arguments must be numeric, so we suggest only numeric values (and then the operators that make sense)
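One way to picture the homogeneity rule is a filter keyed on the type of the first argument. The sketch below assumes a hypothetical `EsqlType` union, a hypothetical `TypedField` shape, and a made-up helper name; it is not the implementation in this PR.

```typescript
// Hypothetical, simplified types for illustration only.
type EsqlType = 'numeric' | 'keyword' | 'date' | 'boolean';
interface TypedField { name: string; type: EsqlType }

// For a homogeneous function (COALESCE-like), the first argument fixes
// the type that every following argument must have.
function suggestNextHomogeneousArg(
  firstArgType: EsqlType | undefined,
  fields: TypedField[]
): string[] {
  const candidates =
    firstArgType === undefined
      ? fields // nothing typed yet, so every field is still possible
      : fields.filter((f) => f.type === firstArgType);
  return candidates.map((f) => f.name);
}

const logFields: TypedField[] = [
  { name: 'bytes', type: 'numeric' },
  { name: 'memory', type: 'numeric' },
  { name: 'agent', type: 'keyword' },
];
```

Here `suggestNextHomogeneousArg('numeric', logFields)` keeps only `bytes` and `memory`, mirroring the `COALESCE(bytes,` case above.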

Conditional cases
FROM kibana_sample_data_logs | EVAL CASE(bytes // The first parameter must be a complete Boolean expression, so I don't have to suggest a comma. I suggest a comma only when the right operand also exists.
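The comma rule for CASE conditions might be sketched like this. The `Expr` model, the operator list, and the function name are assumptions made for this example; treating a bare boolean column as a complete condition is also an assumption, since the description above only covers the case where the right operand exists.

```typescript
// Hypothetical, minimal expression model; not the real ES|QL AST.
type Expr =
  | { kind: 'column'; type: 'boolean' | 'numeric' | 'keyword' }
  | { kind: 'binary'; operator: string; right?: Expr };

const COMPARISON_OPERATORS = ['==', '!=', '<', '<=', '>', '>='];

// A comma is offered only once the condition is a complete boolean expression.
function shouldSuggestCommaAfterCondition(expr: Expr): boolean {
  if (expr.kind === 'column') {
    // Assumption: a bare boolean column already counts as a complete condition.
    return expr.type === 'boolean';
  }
  const isComparison = COMPARISON_OPERATORS.indexOf(expr.operator) !== -1;
  return isComparison && expr.right !== undefined;
}

// CASE(bytes         -> numeric column, not a condition yet
const bareColumn: Expr = { kind: 'column', type: 'numeric' };
// CASE(bytes >       -> operator present, right operand missing
const halfComparison: Expr = { kind: 'binary', operator: '>' };
// CASE(bytes > bytes -> complete comparison
const fullComparison: Expr = {
  kind: 'binary',
  operator: '>',
  right: { kind: 'column', type: 'numeric' },
};
```

Only `fullComparison` makes `shouldSuggestCommaAfterCondition` return true; the other two keep the suggester on operators and operands.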

Multiple signatures
FROM kibana_sample_data_logs | STATS COUNT_DISTINCT(agent) BY BUCKET(bytes, // The field is numeric and only a constant is expected as the second parameter, so I don't suggest anything

FROM kibana_sample_data_logs | STATS COUNT_DISTINCT(agent) BY BUCKET(timestamp, // Here I suggest 1 month, 1 year, ... because the second parameter could be a date interval or a numeric constant (no suggestion for the latter). If I select 1 month, I don't suggest anything further, because that signature allows at most 2 parameters.

FROM kibana_sample_data_logs | STATS COUNT_DISTINCT(agent) BY BUCKET(timestamp, 3, // Here the second parameter is a numeric constant, so the rules change. I unlock parameters 3 and 4 with the possible (constant) values: ?start, ?end, or a date selected from the date picker
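The multi-signature behaviour can be thought of as narrowing the set of signatures that still match what has been typed, then reading off what the next position accepts. The sketch below is a simplified model: the `ArgKind` labels and the `bucketLike` table are hypothetical and only mirror the three cases described above, not the full set of real signatures.

```typescript
// Hypothetical argument categories for illustration only.
type ArgKind = 'date_field' | 'numeric_field' | 'time_span' | 'numeric_const' | 'date_const';
type ArgSignature = ArgKind[];

// Illustrative table modeled on the three cases described above.
const bucketLike: ArgSignature[] = [
  ['date_field', 'time_span'],
  ['date_field', 'numeric_const', 'date_const', 'date_const'],
  ['numeric_field', 'numeric_const'],
];

// Keep the signatures that agree with the arguments typed so far and still
// have room for one more, then collect what they accept at the next position.
function nextArgKinds(signatures: ArgSignature[], given: ArgKind[]): ArgKind[] {
  const compatible = signatures.filter(
    (sig) => given.length < sig.length && given.every((arg, i) => sig[i] === arg)
  );
  const kinds = compatible.map((sig) => sig[given.length]);
  return kinds.filter((kind, i) => kinds.indexOf(kind) === i); // de-duplicate
}
```

For instance, `nextArgKinds(bucketLike, ['date_field'])` yields both `time_span` and `numeric_const`, `nextArgKinds(bucketLike, ['date_field', 'time_span'])` yields nothing, and `nextArgKinds(bucketLike, ['date_field', 'numeric_const'])` yields `date_const`, which corresponds to unlocking parameters 3 and 4.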

@bartoval bartoval requested a review from a team as a code owner October 29, 2025 09:08
@bartoval bartoval self-assigned this Oct 29, 2025
@bartoval bartoval added Feature:ES|QL ES|QL related features in Kibana Team:ESQL ES|QL related features in Kibana t// labels Oct 29, 2025
@elasticmachine

Pinging @elastic/kibana-esql (Team:ESQL)

@bartoval bartoval added backport:skip This PR does not require backporting release_note:enhancement v9.3.0 labels Oct 29, 2025
@bartoval
Contributor Author

Another purpose of this PR is to test manually and see whether something has been forgotten or is not entirely correct.

@stratoula stratoula left a comment

Looks great and I could not find any bug, only better experience ❤️

I just have some questions that I would like you to answer first

};

// Re-export operator groups for use in tests (avoid hardcoding operator names)
export {

@stratoula stratoula Oct 29, 2025

Why are you exporting these here again? We already have them in ../definitions/all_operators. Why don't you use the existing exports for the tests?

Contributor Author

Thanks,
At the beginning it was hardcoded and I had all the references to this structure. Then I changed it, but I forgot to delete this and update the references.

],
mockCallbacks
);
await evalExpectSuggestions('from a | eval a=round(doubleField, ', [], mockCallbacks);

Contributor

Nice


test('suggests operators after initial column based on type', async () => {
// case( field ) suggests all appropriate operators for that field type
// Note: CASE is expression-heavy, comma is not automatically suggested after fields

Contributor

It also has this bug (but it happens in main so it is not because of this PR)

image

We should open an issue to track it

expect(suggestions).toContainEqual(expectedCompletionItem);
});

test('should NOT appear inside BUCKET function', async () => {

Contributor

It took me some time to understand. This is a suggestion; change it if you don't like it, but we need to adjust it.

Suggested change
- test('should NOT appear inside BUCKET function', async () => {
+ test('BUCKET constant arguments should not trigger function suggestions', async () => {

expect(duplicates).toEqual([]);
});

test('BUCKET with numeric field should NOT show col0 or date histogram at second param', async () => {

Contributor

Amazing

});

- it('conditional with text field: suggests comparison/pattern/IN/null operators in STATS', async () => {
+ it.skip('conditional with text field: suggests pattern/IN/null operators (not comparison) in STATS', async () => {

Contributor

Why do you skip this poor test?

Contributor Author

Because he failed, I'm touchy and I punished him. :D

Now I have forgiven him

@sddonne sddonne left a comment

Looks amazing! these are really nice improvements ❤️

Contributor

FROM kibana_sample_data_logs | STATS COUNT_DISTINCT(agent) BY BUCKET(timestamp, 3, // Here the second parameter is a numeric constant, so the rules change. I unlock parameters 3 and 4 with the possible (constant) values: ?start, ?end, or a date selected from the date picker

Not sure if it's simple enough to achieve, but it would be nice if BUCKET suggested date functions for the last 2 args, to be able to write queries like this:

FROM ebt-kibana-browser | STATS BY BUCKET(context.cloudTrialEndDate, 30, "2012-10-15T11:27:38.230Z", NOW())

Contributor Author

do you mean when you press Add date histogram?

Contributor Author

ah ok you are referring to NOW()

@stratoula stratoula left a comment

Thanx Val!

@bartoval bartoval force-pushed the improve_suggestions branch from 5a89150 to cb6092c Compare October 29, 2025 17:29

elasticmachine commented Oct 29, 2025

💔 Build Failed

Failed CI Steps

Test Failures

  • [job] [logs] FTR Configs #71 / discover/tabs discover - ES|QL controls should add an ES|QL multi - value control
  • [job] [logs] FTR Configs #71 / discover/tabs discover - ES|QL controls should add an ES|QL multi - value control
  • [job] [logs] Scout: [ platform / discover_enhanced ] plugin / stateful - Discover app - saved searches - should unselect saved search when navigating to a 'new'

Metrics [docs]

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

| id | before | after | diff |
| --- | --- | --- | --- |
| esql | 578.3KB | 578.8KB | +480.0B |

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

| id | before | after | diff |
| --- | --- | --- | --- |
| kbnUiSharedDeps-srcJs | 4.0MB | 4.0MB | +2.9KB |

cc @bartoval

@bartoval bartoval force-pushed the improve_suggestions branch from f9359e4 to 9d9cebb Compare October 29, 2025 22:17
@bartoval bartoval merged commit 7582d50 into elastic:main Oct 30, 2025
12 checks passed
sbelastic pushed a commit to sbelastic/kibana that referenced this pull request Oct 30, 2025
ana-davydova pushed a commit to ana-davydova/kibana that referenced this pull request Nov 3, 2025
albertoblaz pushed a commit to albertoblaz/kibana that referenced this pull request Nov 4, 2025
Labels

backport:skip This PR does not require backporting Feature:ES|QL ES|QL related features in Kibana release_note:enhancement Team:ESQL ES|QL related features in Kibana t// v9.3.0

4 participants