Semantic_text match_all with Highlighter#128702
Semantic_text match_all with Highlighter#128702Samiul-TheSoccerFan merged 17 commits intoelastic:mainfrom
Conversation
kderusso
left a comment
There was a problem hiding this comment.
Nice start, I've added some high level comments on overall organization. We can do a deeper review when it's closer to ready.
.../java/org/elasticsearch/xpack/inference/queries/SemanticMatchAllQueryRewriteInterceptor.java
Outdated
Show resolved
Hide resolved
.../java/org/elasticsearch/xpack/inference/queries/SemanticMatchAllQueryRewriteInterceptor.java
Outdated
Show resolved
Hide resolved
...src/main/java/org/elasticsearch/xpack/inference/queries/SemanticQueryRewriteInterceptor.java
Outdated
Show resolved
Hide resolved
Mikep86
left a comment
There was a problem hiding this comment.
There's a much simpler way to fix this by updating the semantic highlighter. Take a look at how you can change QueryVisitor in extractDenseVectorQueries and extractSparseVectorQueries: https://github.com/elastic/elasticsearch/blob/main/x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/highlight/SemanticTextHighlighter.java#L250
|
@elasticmachine update branch |
|
Pinging @elastic/search-eng (Team:SearchOrg) |
|
Pinging @elastic/search-relevance (Team:Search - Relevance) |
|
Hi @Samiul-TheSoccerFan, I've created a changelog YAML for you. |
kderusso
left a comment
There was a problem hiding this comment.
Thanks for these changes, looks great @Samiul-TheSoccerFan !
docs/changelog/128702.yaml
Outdated
| @@ -0,0 +1,5 @@ | |||
| pr: 128702 | |||
| summary: Highlighting support added to semantic_text with `match_all` | |||
There was a problem hiding this comment.
Nitpick: Since this is a bug, perhaps we call the changelog summary something like `Fix issue where highlighting was not returned with match_all queries"
| queries.add(fieldType.createExactKnnQuery(VectorData.fromFloats(knnQuery.getTargetCopy()), null)); | ||
| } else if (query instanceof KnnByteVectorQuery knnQuery) { | ||
| queries.add(fieldType.createExactKnnQuery(VectorData.fromBytes(knnQuery.getTargetCopy()), null)); | ||
| } else if (query instanceof MatchAllDocsQuery) { |
There was a problem hiding this comment.
😍 Nice optimization from the first solution!
|
@elasticmachine update branch |
Mikep86
left a comment
There was a problem hiding this comment.
Approach looks good! Caught a copy/paste error in the tests that we should fix though.
...nce/src/yamlRestTest/resources/rest-api-spec/test/inference/90_semantic_text_highlighter.yml
Outdated
Show resolved
Hide resolved
...src/yamlRestTest/resources/rest-api-spec/test/inference/90_semantic_text_highlighter_bwc.yml
Outdated
Show resolved
Hide resolved
...nce/src/yamlRestTest/resources/rest-api-spec/test/inference/90_semantic_text_highlighter.yml
Outdated
Show resolved
Hide resolved
...src/yamlRestTest/resources/rest-api-spec/test/inference/90_semantic_text_highlighter_bwc.yml
Outdated
Show resolved
Hide resolved
Mikep86
left a comment
There was a problem hiding this comment.
LGTM! Good clean solution and tests :)
|
@elasticmachine update branch |
|
@elasticmachine update branch |
💔 Backport failed
You can use sqren/backport to manually backport by running |
💚 All backports created successfully
Questions ?Please refer to the Backport tool documentation |
* initial implementation for match_All * reformat * [CI] Auto commit changes from spotless * Excluding matchAllintercepter * Adding matchAllDocs support for vector fields * [CI] Auto commit changes from spotless * Remove previous implementation * Adding yaml tests for match_all * fixed yaml tests * Update docs/changelog/128702.yaml * Update changelog * changelog - update summary * Fix wrong inference names for the yaml tests --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co> Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> (cherry picked from commit d1b5532) # Conflicts: # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferenceFeatures.java
* initial implementation for match_All * reformat * [CI] Auto commit changes from spotless * Excluding matchAllintercepter * Adding matchAllDocs support for vector fields * [CI] Auto commit changes from spotless * Remove previous implementation * Adding yaml tests for match_all * fixed yaml tests * Update docs/changelog/128702.yaml * Update changelog * changelog - update summary * Fix wrong inference names for the yaml tests --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co> Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> (cherry picked from commit d1b5532) # Conflicts: # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferenceFeatures.java
* initial implementation for match_All * reformat * [CI] Auto commit changes from spotless * Excluding matchAllintercepter * Adding matchAllDocs support for vector fields * [CI] Auto commit changes from spotless * Remove previous implementation * Adding yaml tests for match_all * fixed yaml tests * Update docs/changelog/128702.yaml * Update changelog * changelog - update summary * Fix wrong inference names for the yaml tests --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co> Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> (cherry picked from commit d1b5532) # Conflicts: # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferenceFeatures.java # x-pack/plugin/inference/src/yamlRestTest/resources/rest-api-spec/test/inference/90_semantic_text_highlighter.yml # x-pack/plugin/inference/src/yamlRestTest/resources/rest-api-spec/test/inference/90_semantic_text_highlighter_bwc.yml
💚 All backports created successfully
Questions ?Please refer to the Backport tool documentation |
* initial implementation for match_All * reformat * [CI] Auto commit changes from spotless * Excluding matchAllintercepter * Adding matchAllDocs support for vector fields * [CI] Auto commit changes from spotless * Remove previous implementation * Adding yaml tests for match_all * fixed yaml tests * Update docs/changelog/128702.yaml * Update changelog * changelog - update summary * Fix wrong inference names for the yaml tests --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co> Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> (cherry picked from commit d1b5532) # Conflicts: # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferenceFeatures.java # x-pack/plugin/inference/src/yamlRestTest/resources/rest-api-spec/test/inference/90_semantic_text_highlighter.yml # x-pack/plugin/inference/src/yamlRestTest/resources/rest-api-spec/test/inference/90_semantic_text_highlighter_bwc.yml
* initial implementation for match_All * reformat * [CI] Auto commit changes from spotless * Excluding matchAllintercepter * Adding matchAllDocs support for vector fields * [CI] Auto commit changes from spotless * Remove previous implementation * Adding yaml tests for match_all * fixed yaml tests * Update docs/changelog/128702.yaml * Update changelog * changelog - update summary * Fix wrong inference names for the yaml tests --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co> Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> (cherry picked from commit d1b5532) # Conflicts: # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferenceFeatures.java # x-pack/plugin/inference/src/yamlRestTest/resources/rest-api-spec/test/inference/90_semantic_text_highlighter.yml # x-pack/plugin/inference/src/yamlRestTest/resources/rest-api-spec/test/inference/90_semantic_text_highlighter_bwc.yml
* initial implementation for match_All * reformat * [CI] Auto commit changes from spotless * Excluding matchAllintercepter * Adding matchAllDocs support for vector fields * [CI] Auto commit changes from spotless * Remove previous implementation * Adding yaml tests for match_all * fixed yaml tests * Update docs/changelog/128702.yaml * Update changelog * changelog - update summary * Fix wrong inference names for the yaml tests --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co> Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> (cherry picked from commit d1b5532) # Conflicts: # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferenceFeatures.java # x-pack/plugin/inference/src/yamlRestTest/resources/rest-api-spec/test/inference/90_semantic_text_highlighter.yml # x-pack/plugin/inference/src/yamlRestTest/resources/rest-api-spec/test/inference/90_semantic_text_highlighter_bwc.yml
This PR addresses the missing
highlightingfor semantic_text fields when amatch_allquery is used. We guide users to use semantic highlighting to view their chunks, but highlighting does not currently return any results withmatch_all. Highlighting is usually unnecessary for non-inference fields withmatch_all, but it's essential forsemantic_textfields to return the generated chunks. This change ensures those highlighting results are returned as expected.Test cases:
Only text fields
only inference fields
Both infer and non-inference fields
Both dense vector fields
Some additional cases