[ES|QL] Allow lookup join on mixed numeric fields #128263
Merged
fang-xing-esql merged 9 commits into elastic:main on May 25, 2025
Conversation
Collaborator
Hi @fang-xing-esql, I've created a changelog YAML for you.
fang-xing-esql
commented
May 21, 2025
| | lookup join message_types_lookup on message | ||
| | rename type as message | ||
| | lookup join message_types_lookup on message | ||
| from languag*, -languages_mixed_numerics |
Member
Author
The changes to existing tests are:
- Remove trailing spaces
- Exclude the new index from an existing test to keep its results unchanged.
Collaborator
Pinging @elastic/es-analytical-engine (Team:Analytics)
costin
approved these changes
May 22, 2025
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/logical/join/Join.java
Member
Author
Thanks for reviewing!
auto-merge was automatically disabled
May 23, 2025 19:39
Pull Request is not mergeable
Collaborator
💔 Backport failed
You can use sqren/backport to manually backport by running |
Member
Author
💚 All backports created successfully
Questions? Please refer to the Backport tool documentation
elasticsearchmachine
pushed a commit
that referenced
this pull request
Jun 3, 2025
* allow lookup join on mixed numeric fields
(cherry picked from commit dfe1357)
# Conflicts:
# x-pack/plugin/esql/qa/testFixtures/src/main/java/org/elasticsearch/xpack/esql/CsvTestsDataLoader.java
# x-pack/plugin/esql/qa/testFixtures/src/main/resources/lookup-join.csv-spec
# x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/action/EsqlCapabilities.java
Previously, lookup joins on numeric fields required that the left-hand side (LHS) and right-hand side (RHS) had exactly the same `ES|QL` data type: either `integer`, `long`, or `double`. For example, the following joins were permitted:

- `integer` join `integer`: in ES|QL, the ES types `byte`, `short`, and `integer` are all considered `integer`, so joins between any two of them were allowed.
- `long` join `long`: joins between `long`s were allowed.
- `double` join `double`: the ES types `half_float`, `scaled_float`, `float`, and `double` are all considered `double` in `ES|QL`, so joins between any two of them were allowed.

This PR removes some of those strict type-matching requirements to support joins on mixed numeric types. Additional test cases (edge cases) have been added in CsvTests to validate this behavior. With this change, the LHS and RHS no longer need to share the exact same `ES|QL` data type, which aligns join behavior more closely with the behavior of the `==` operator.

Examples of now-permitted joins include:

- Joins between whole numbers (`byte`, `short`, `integer`, `long`) are allowed: `integer` and `long` can join against each other.
- Joins between whole numbers (`byte`, `short`, `integer`, `long`) and rational numbers (`half_float`, `scaled_float`, `float`, `double`) are allowed: `integer`/`long` and `double` can join against each other.

Observations from validating joins on mixed numeric fields:

- Joins between `byte`, `short`, `integer`, and `long` work as expected, including the edge cases. No unexpected results have been observed, and the behavior is consistent with `==`, so joins between two whole numbers should be safe to allow.
- Joins between a whole number and a rational number can produce surprising results because of the conversion from `integer`/`long` to a float/double, specifically when dealing with large `integer`/`long` values.

One of the added tests shows this behavior, and many of the tests that join a whole number with a rational number behave similarly.

This is the current behavior of `==`: 9223372036854775806 and 9223372036854775807 are two different `long` values, but their corresponding `double` value is the same.

When `long` and `double` are joined together, duplicates can be returned. These duplicates can be explained by the behavior of `==`, because a whole number and a rational number do not always match one-to-one. It is still a bit surprising to see duplicates in the join results. Lookup join is a left join, so we don't expect the same results after swapping the join sequence. However, we need to be careful with inner joins (in the future), where we do expect the same results after swapping the join sequence.
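The `long`/`double` collision described above can be reproduced outside ES|QL. The following is a minimal Java sketch, not code from this PR: the class `MixedNumericJoinSketch` and its methods are hypothetical, and it assumes the comparison widens the `long` key to `double`, which is how plain Java `==` behaves. It is only meant to show why two distinct `long` keys can match the same `double` key.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not the ES|QL lookup join implementation.
final class MixedNumericJoinSketch {

    // Compares a long key with a double key by widening the long to double
    // (an assumption that mirrors Java's own == on mixed long/double operands).
    static boolean equalsAsDouble(long left, double right) {
        return (double) left == right;
    }

    // Naive left join with long keys on the LHS and double keys on the RHS.
    // Emits one row per match, or a single "null" row when nothing matches.
    static List<String> leftJoin(long[] lhs, double[] rhs) {
        List<String> rows = new ArrayList<>();
        for (long l : lhs) {
            boolean matched = false;
            for (double r : rhs) {
                if (equalsAsDouble(l, r)) {
                    rows.add(l + " -> " + r);
                    matched = true;
                }
            }
            if (!matched) {
                rows.add(l + " -> null");
            }
        }
        return rows;
    }

    // The same naive left join with the sides swapped: double keys on the LHS.
    static List<String> leftJoinSwapped(double[] lhs, long[] rhs) {
        List<String> rows = new ArrayList<>();
        for (double l : lhs) {
            boolean matched = false;
            for (long r : rhs) {
                if (equalsAsDouble(r, l)) {
                    rows.add(l + " -> " + r);
                    matched = true;
                }
            }
            if (!matched) {
                rows.add(l + " -> null");
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        long a = 9223372036854775806L;
        long b = 9223372036854775807L;
        double d = 9.223372036854775807E18;

        // Two different long values, one shared double value.
        System.out.println(a != b);
        System.out.println((double) a == (double) b);

        // Both long rows match the single double key.
        leftJoin(new long[] {a, b}, new double[] {d}).forEach(System.out::println);

        // A single double row matches both long keys, so it appears twice.
        leftJoinSwapped(new double[] {d}, new long[] {a, b}).forEach(System.out::println);
    }
}
```

In the swapped case a single LHS row is returned twice, once per matching `long` key, which is the kind of duplicate the description refers to.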