Add multi_match function #121525#125062
Add multi_match function #121525#125062svilen-mihaylov-elastic merged 96 commits intoelastic:mainfrom
Conversation
|
TODO: add tests |
carlosdelest
left a comment
There was a problem hiding this comment.
This is looking good! 💯
Things we need to check:
- QueryBuilder serialization / deserialization
- Testing:
- CSV tests, and include scoring test in
scoring.csv - Add this function to the
ScoringITtests so scoring is checked - The usual suspects:
VerifierTests,LocalPhysicalPlanOptimizerTests,MultiMatchTests(new),LogicalPlanOptimizerTests)
- CSV tests, and include scoring test in
...sql/src/main/java/org/elasticsearch/xpack/esql/expression/function/EsqlFunctionRegistry.java
Outdated
Show resolved
Hide resolved
...c/main/java/org/elasticsearch/xpack/esql/expression/function/fulltext/FullTextWritables.java
Outdated
Show resolved
Hide resolved
...esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/fulltext/MultiMatch.java
Outdated
Show resolved
Hide resolved
...esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/fulltext/MultiMatch.java
Outdated
Show resolved
Hide resolved
...esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/fulltext/MultiMatch.java
Show resolved
Hide resolved
...esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/fulltext/MultiMatch.java
Outdated
Show resolved
Hide resolved
…expression/function/fulltext/MultiMatch.java Co-authored-by: Carlos Delgado <6339205+carlosdelest@users.noreply.github.com>
carlosdelest
left a comment
There was a problem hiding this comment.
A question on the handling of options, and some final testing that I'd like us to have. The rest LGTM 🙌
...ql/src/test/java/org/elasticsearch/xpack/esql/optimizer/LocalPhysicalPlanOptimizerTests.java
Show resolved
Hide resolved
| } | ||
|
|
||
| public void testMultiMatchOptionsPushDown() { | ||
| String query = """ |
There was a problem hiding this comment.
I still don't see validation that the query retrieves options for zero_terms_query or boost - they should be easy to add and we get complete coverage that all options work as expected
x-pack/plugin/esql/qa/testFixtures/src/main/resources/multi-match-function.csv-spec
Outdated
Show resolved
Hide resolved
x-pack/plugin/esql/qa/testFixtures/src/main/resources/multi-match-function.csv-spec
Outdated
Show resolved
Hide resolved
| required_capability: metadata_score | ||
|
|
||
| from books metadata _score | ||
| | where multi_match("Hobbit", description, title, {"type": "most_fields"}) |
There was a problem hiding this comment.
Nice work on combining the scores via most_fields and best_fields 👏
x-pack/plugin/esql/qa/testFixtures/src/main/resources/multi-match-function.csv-spec
Show resolved
Hide resolved
...ugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/fulltext/Match.java
Outdated
Show resolved
Hide resolved
...esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/fulltext/MultiMatch.java
Show resolved
Hide resolved
| } | ||
|
|
||
| private MultiMatch(Source source, Expression query, List<Expression> fields, Expression options, QueryBuilder queryBuilder) { | ||
| super(source, query, initChildren(query, fields, options), queryBuilder); |
There was a problem hiding this comment.
I wonder if this would be better handled by just having a List fieldsAndOptions, instead of separating the two. This seems to be making us check for options in a variety of places and do some gymnastics for the constructor.
We could keep a single List and then check for the last element in the options() field to return that or not.
There was a problem hiding this comment.
The reason why I wanted to keep them separate was because of the way we use the @interfaces like @MapParam from the constructor arguments to generate the docs.
If all we had was a list of children, that might mean changing the way the docs are generated for functions?
I am not sure that would be less awkward - neither way seems ideal.
There was a problem hiding this comment.
The docs generation comment is spot on. Let's keep it this way and iterate if needed.
…tch-function.csv-spec Co-authored-by: Carlos Delgado <6339205+carlosdelest@users.noreply.github.com>
…optimizer/LocalPhysicalPlanOptimizerTests.java Co-authored-by: Carlos Delgado <6339205+carlosdelest@users.noreply.github.com>
carlosdelest
left a comment
There was a problem hiding this comment.
Thank you for iterating on this, @svilen-mihaylov-elastic !
I'd like to have the additional tests for phrase and phrase-prefix included, but this LGTM. Great work! 💯
Is there anything you need on top of this? 937963b |
| @@ -1,6 +1,5 @@ | |||
| % This is generated by ESQL's AbstractFunctionTestCase. Do no edit it. See ../README.md for how to regenerate it. | |||
|
|
|||
| ### DATE TRUNC | |||
There was a problem hiding this comment.
this looks like an unintended change - let's revert it
| assertEquals(DataType.DOUBLE, ee.dataType()); | ||
| } | ||
|
|
||
| public void testFunctionNamedParamsAsFunctionArgument1() { |
There was a problem hiding this comment.
as a side note I don't see a difference between this test and the one we add to AnalyzerTests.
I guess we just follow an existing pattern - but as a follow up, we should discuss if both are needed.
Implement multi_match function for ESQL. Its currently available on snapshot builds pending refinement of the syntax.
|
I notice that the examples/multi_match.md file contains an example that does not match the example in the csv-spec file. Presumably the csv-spec was changed later, without re-generating the examples. The consequence is that someone else, later, will regenerate examples and get a surprise multi-match change to their PR. |
tracked in #121525