Auto prefiltering for queries on dense semantic_text fields#138989
Conversation
`knn` queries allow specifying `filters` that will be applied before the knn search. This `pre-filtering` allows the `knn` to return `k` results. If such filters are to applied only after the `knn` executes, then the `knn` returns the `k` matching results but the filters can filter out some of them thus potentially returning fewer than `k` results. `semantic_text` fields can be queried with: DSL - `match` queries - `semantic` queries - `knn` queries ES|QL - `match` queries - `knn` queries For DSL, `knn` queries allow users to specify direct prefilters. However, `match` and `semantic` queries provide no way to do so. Same goes for ES|QL `match`. Noting that ES|QL `KNN` already implements auto pre-filtering where conjunctions are pushed down to the `knn` query as prefilters. This commit implements semantic_text auto pre-filtering for `semantic_text` queries in DSL (`match` and `semantic` queries) and ES|QL (`MATCH`). We achieve this by adding an `AutoPrefilteringScope` object to the `SearchExecutionContext`. When we convert a `bool` query to a lucene query, we push its `must`, `filter`, and `must_not` clauses to the `AutoPrefilteringScope`. At that stage queries have already been rewritten. Semantic queries using `text_embedding` inference endpoints are rewritten to knn vector queries that are auto-prefiltering enabled. Then, when an auto-prefiltering enabled knn vector query is converted to its lucene equivalent, we fetch the prefilters from the `SearchExecutionContext` and we apply them to the knn vector query - which supports pre-filtering already. ES|QL queries that contain `MATCH` automatically benefit from this implementation because they are rewritten in `bool` queries. Limitations DSL - nested queries are excluded from pre-filtering (elastic#138184) ES|QL - filters that are not translatable to lucene queries will be applied as post-filters Relates elastic#132068
|
Pinging @elastic/search-relevance (Team:Search - Relevance) |
|
Hi @dimitris-athanasiou, I've created a changelog YAML for you. |
server/src/main/java/org/elasticsearch/index/query/support/AutoPrefilteringScope.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/query/support/AutoPrefilteringScope.java
Show resolved
Hide resolved
...esql/qa/server/src/main/java/org/elasticsearch/xpack/esql/qa/rest/SemanticMatchTestCase.java
Show resolved
Hide resolved
Mikep86
left a comment
There was a problem hiding this comment.
Partial review, nice work! I didn't get to the tests, but I reviewed all the production logic. I identified some potential edge cases that I think we could iterate on.
server/src/main/java/org/elasticsearch/index/query/BoolQueryBuilder.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/search/vectors/KnnVectorQueryBuilder.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/search/vectors/KnnVectorQueryBuilder.java
Outdated
Show resolved
Hide resolved
| rescoreVectorBuilder, | ||
| vectorSimilarity | ||
| ).boost(boost).queryName(queryName).addFilterQueries(filterQueries); | ||
| ).boost(boost).queryName(queryName).addFilterQueries(filterQueries).setAutoPrefilteringEnabled(isAutoPrefilteringEnabled); |
There was a problem hiding this comment.
There's a bunch of places in the query interception logic where we need to make a copy of the knn query, except slightly tweaked. It's very easy to overlook the need to call setAutoPrefilteringEnabled when making such copies. Maybe it's time for a little static helper method that takes an origin knn query and applies boost, queryName, and autoPrefilteringEnabled values to a target knn query?
There was a problem hiding this comment.
To be fair the constructor situation in KnnVectorQueryBuilder has gone wild and I agree it is very fragile. A static helper would be nice but we'd still need to remember to call it. I wonder if the right solution here is a refactoring of the constructors. How about we leave this is follow up work? I can raise an issue for tidying this up.
There was a problem hiding this comment.
Agreed a more thorough refactoring is needed here, which we can do in a follow-up. IMO we should refactor KnnVectorQueryBuilder to use a builder pattern that can take an existing KnnVectorQueryBuilder to initialize.
|
@Mikep86 I have pushed commits to the PR where I address your feedback:
|
Mikep86
left a comment
There was a problem hiding this comment.
Fantastic work 🙌 ! All the production code looks good. I pointed out a potential edge case in the min-should-match handing, but I don't have a good solution for it (it's also a very narrow edge case). Other than that, it's just a few small adjustments to tests.
server/src/main/java/org/elasticsearch/index/query/support/AutoPrefilteringUtils.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/index/query/BoolQueryBuilderTests.java
Outdated
Show resolved
Hide resolved
...er/src/test/java/org/elasticsearch/search/vectors/AbstractKnnVectorQueryBuilderTestCase.java
Outdated
Show resolved
Hide resolved
.../yamlRestTest/resources/rest-api-spec/test/inference/100_semantic_text_auto_prefiltering.yml
Show resolved
Hide resolved
.../yamlRestTest/resources/rest-api-spec/test/inference/100_semantic_text_auto_prefiltering.yml
Show resolved
Hide resolved
.../yamlRestTest/resources/rest-api-spec/test/inference/100_semantic_text_auto_prefiltering.yml
Show resolved
Hide resolved
| // We need to adjust the minimum should match to account for the pruned clauses. | ||
| // We considered the following approaches: | ||
| // 1. strict approach: set to min(remaining_should_clauses, original_msm) | ||
| // 2. lenient approach: if msm is set and at least one should clause is pruned, prune all should clauses. | ||
| // 3. middle ground approach: set to max(0, original_msm - remaining_should_clauses) | ||
| // Let us imagine a query with 5 should clauses. 2 get pruned. msm is 3. 1 remaining clause matches. | ||
| // Approach 1 would make the entire bool query to not match as we would retain msm of 3 but only 1 clause would match. | ||
| // We do not know whether the pruned clauses would match or not. Thus, this approach seems too restrictive. | ||
| // Approach 2 would mean we prune all should clauses and the query would match, | ||
| // even if none of the remaining should clauses match. | ||
| // Approach 3 would mean we adjust the msm to 3 - 2 = 1. This would mean that the query would match if at least one | ||
| // of the remaining clauses matches. | ||
| // We opt for the lenient approach. It is as if we assume the pruned clauses matched. Seems to be the best compromise. |
There was a problem hiding this comment.
Thank you for the thorough description ❤️
carlosdelest
left a comment
There was a problem hiding this comment.
LGTM, amazing work!
It would be awesome to have a test in SemanticMatchTestCase that checks that ES|QL applies prefiltering - but not needed for this PR
| return Optional.empty(); | ||
| } | ||
|
|
||
| if (query instanceof BoolQueryBuilder boolQuery) { |
| } | ||
|
|
||
| public void testBWCVersionSerialization_GivenAutoPrefiltering() throws IOException { | ||
| for (int i = 0; i < 100; i++) { |
There was a problem hiding this comment.
Why execute this multiple times? Is this a loop for testing the test?
There was a problem hiding this comment.
During my testing I found that I needed that to surface problems faster. It runs pretty fast so I left it in.
There was a problem hiding this comment.
You should use -Dtests.iters= instead 😉 . Let's remove this as you'll get plenty of executions on CI anyway.
There was a problem hiding this comment.
That is true. But if you take a look at AbstractBWCSerializationTestCase you'll see we also do multiple runs there by default. What it helps with is that if someone makes a change that breaks BWC, they might run the tests once, they pass and they think it's all good. Whereas running a bunch of times significantly increases the probability to surface a failure and gives immediate feedback to the dev to fix the issue before getting in CI.
There was a problem hiding this comment.
I was not aware of that, thanks!
Maybe then use NUMBER_OF_TEST_RUNS instead to keep with the pattern? 🤷
There was a problem hiding this comment.
Done in 1a8cdc2
AbstractQueryTestCase has its own such constant, NUMBER_OF_TESTQUERIES
@carlosdelest I have added such a test! It's there! |
@dimitris-athanasiou It indeed is! Isn't that awesome? 😅 🤦 |
Adds documentation for automatic pre-filtering that was introduced in elastic#138989.
Adds documentation for automatic pre-filtering that was introduced in #138989. Co-authored-by: Liam Thompson <leemthompo@gmail.com>
knnqueries allow specifyingfiltersthat will be applied before the knn search. Thispre-filteringallows theknnto returnkresults. If such filters are to applied only after theknnexecutes, then theknnreturns thekmatching results but the filters can filter out some of them thus potentially returning fewer thankresults.semantic_textfields can be queried with:DSL
matchqueriessemanticqueriesknnqueriesES|QL
matchqueriesknnqueriesFor DSL,
knnqueries allow users to specify direct prefilters. However,matchandsemanticqueries provide no way to do so. Same goes for ES|QLMATCH. Noting that ES|QLKNNalready implements auto pre-filtering where conjunctions are pushed down to theknnquery as prefilters.This commit implements semantic_text auto pre-filtering for
semantic_textqueries in DSL (matchandsemanticqueries) and ES|QL (MATCH).We achieve this by adding an
AutoPrefilteringScopeobject to theSearchExecutionContext. When we convert aboolquery to a lucene query, we push itsmust,filter, andmust_notclauses to theAutoPrefilteringScope. At that stage queries have already been rewritten. Semantic queries usingtext_embeddinginference endpoints are rewritten to knn vector queries that are auto-prefiltering enabled. Then, when an auto-prefiltering enabled knn vector query is converted to its lucene equivalent, we fetch the prefilters from theSearchExecutionContextand we apply them to the knn vector query - which supports pre-filtering already.ES|QL queries that contain
MATCHautomatically benefit from this implementation because they are rewritten inboolqueries.Limitations
DSL
ES|QL
Relates #132068