Implement comprehensive top N parameter handling for text similarity reranker#142039
Implement comprehensive top N parameter handling for text similarity reranker#142039Mikep86 merged 29 commits intoelastic:mainfrom
Conversation
…ing to hide the top N setting.
|
Hi @Mikep86, I've created a changelog YAML for you. |
| return new RankedDocsResults(results.stream().sorted(Comparator.reverseOrder()).toList()); | ||
|
|
||
| // RankedDoc's compareTo implementation already sorts by score descending, so we don't need to reverse the sort order | ||
| var sortedResultsStream = results.stream().sorted(); |
There was a problem hiding this comment.
So it turns out that our test reranker service has been sorting docs in reverse relevance this whole time 🫠 . We didn't catch it as an issue because ES sorts the results again (correctly) here. This all happens to work, as long as reranker reranks every doc sent to it. However, if the reranker truncates results (like if a top_n parameter is applied 😉 ), then the incorrect sort order becomes an issue.
| // This method relies on callers filtering out feature docs with null feature data | ||
| assert Arrays.stream(featureDocs).noneMatch(featureDoc -> featureDoc.featureData == null); |
There was a problem hiding this comment.
We used to filter out feature docs with null featureData before generating the inference request, however there are two issues with that:
- In all code paths, the caller already does this, so it's a redundant operation.
- We don't consistently use the filtered feature doc array in
computeScores. So, if a caller were to actually provide afeatureDocsarray with one or more docs with nullfeatureData, we would still end up throwing an NPE somewhere incomputeScores.
This assertion seems like a good balance between being explicit that callers must provide docs with non-null featureData and avoiding redundant operations.
|
@elasticmachine update branch |
kderusso
left a comment
There was a problem hiding this comment.
I have some concerns about backporting and immediately performing a followup but will defer to your team on that. Otherwise the changes look good to me, thanks for doing the cleanup!
| chunkScorerConfig != null | ||
| ? new ChunkScorerConfig(chunkScorerConfig.size, inferenceText, chunkScorerConfig.chunkingSettings()) | ||
| : null | ||
| failuresAllowed |
| configuredTopN = topNProvider.getTopN(); | ||
| } | ||
| } | ||
| if (configuredTopN != null && configuredTopN < rankWindowSize) { |
There was a problem hiding this comment.
I like this suggestion. Talked about this a little with @Mikep86 offline, but I do have some concerns about backporting this change, and then immediately removing it for the better fix proposed here. That strategy will result in merge pain going forward for all future backports that touch this code.
I wonder if there's a way we could backport the change in a way that wouldn't cause a 5xx error in serverless but also not require the merge hell. I realize that manually enumerating top n supporting services isn't ideal but for the backport branches, now that we know the fix could we literally catch the OOB exception and return a friendlier message, then just go with this proposed fix going forward?
| List<Integer> rankedDocToFeatureDoc = new ArrayList<>(); | ||
| for (int i = 0; i < featureDocs.length; i++) { | ||
| RankFeatureDoc featureDoc = featureDocs[i]; | ||
| for (int j = 0; j < featureDoc.featureData.size(); j++) { |
There was a problem hiding this comment.
Would it make sense to check anyway for safety?
kderusso
left a comment
There was a problem hiding this comment.
I have some concerns about backporting and immediately performing a followup but will defer to your team on that. Otherwise the changes look good to me, thanks for doing the cleanup!
…ep86/elasticsearch into text-similarity-reranker_aioob-error
💔 Backport failed
You can use sqren/backport to manually backport by running |
…reranker (elastic#142039) Currently, the text similarity reranker has, at best, spotty support for rerank services that can have a top_n (or similar) parameter set. This parameter configures the rerank service to return, at most, N reranked documents. When N is less than rank_window_size, it can lead to ArrayIndexOutOfBoundsExceptions in the current implementation. This is why we check that N is greater than or equal to rank_window_size, however the problem is that that the current check misses a bunch of rerank services that support top_n. This PR addresses the problem by implementing comprehensive support for the top_n parameter: - Adds the TopNProvider interface, a unified way to report the value of the top_n parameter. - Implements unified and more robust logic for extracting scores from ranked docs. This logic also implements a friendlier failure mode when the number of ranked docs returned by the rerank service does not match what is expected, which can happen when a top_n parameter value is applied, but not properly reported. TextSimilarityRankFeaturePhaseRankCoordinatorContext and its tests have also been simplified. With the unified score extraction logic, we no longer need to resolve chunk scoring configuration while computing scores, so that logic (and its related tests) have been removed. (cherry picked from commit 628c78d) # Conflicts: # x-pack/plugin/inference/qa/test-service-plugin/src/main/java/org/elasticsearch/xpack/inference/mock/TestRerankingServiceExtension.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferenceFeatures.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/rank/textsimilarity/TextSimilarityRankFeaturePhaseRankCoordinatorContext.java # x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/rank/textsimilarity/TextSimilarityRankFeaturePhaseRankCoordinatorContextTests.java # x-pack/plugin/inference/src/yamlRestTest/resources/rest-api-spec/test/inference/70_text_similarity_rank_retriever.yml
💚 All backports created successfully
Questions ?Please refer to the Backport tool documentation |
…reranker (elastic#142039) Currently, the text similarity reranker has, at best, spotty support for rerank services that can have a top_n (or similar) parameter set. This parameter configures the rerank service to return, at most, N reranked documents. When N is less than rank_window_size, it can lead to ArrayIndexOutOfBoundsExceptions in the current implementation. This is why we check that N is greater than or equal to rank_window_size, however the problem is that that the current check misses a bunch of rerank services that support top_n. This PR addresses the problem by implementing comprehensive support for the top_n parameter: - Adds the TopNProvider interface, a unified way to report the value of the top_n parameter. - Implements unified and more robust logic for extracting scores from ranked docs. This logic also implements a friendlier failure mode when the number of ranked docs returned by the rerank service does not match what is expected, which can happen when a top_n parameter value is applied, but not properly reported. TextSimilarityRankFeaturePhaseRankCoordinatorContext and its tests have also been simplified. With the unified score extraction logic, we no longer need to resolve chunk scoring configuration while computing scores, so that logic (and its related tests) have been removed. (cherry picked from commit 628c78d) # Conflicts: # x-pack/plugin/inference/qa/test-service-plugin/src/main/java/org/elasticsearch/xpack/inference/mock/TestRerankingServiceExtension.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferenceFeatures.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/rank/textsimilarity/TextSimilarityRankFeaturePhaseRankCoordinatorContext.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/openshiftai/rerank/OpenShiftAiRerankTaskSettings.java # x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/rank/textsimilarity/TextSimilarityRankFeaturePhaseRankCoordinatorContextTests.java # x-pack/plugin/inference/src/yamlRestTest/resources/rest-api-spec/test/inference/70_text_similarity_rank_retriever.yml
…arity reranker (#142039) (#142314) * Implement comprehensive top N parameter handling for text similarity reranker (#142039) Currently, the text similarity reranker has, at best, spotty support for rerank services that can have a top_n (or similar) parameter set. This parameter configures the rerank service to return, at most, N reranked documents. When N is less than rank_window_size, it can lead to ArrayIndexOutOfBoundsExceptions in the current implementation. This is why we check that N is greater than or equal to rank_window_size, however the problem is that that the current check misses a bunch of rerank services that support top_n. This PR addresses the problem by implementing comprehensive support for the top_n parameter: - Adds the TopNProvider interface, a unified way to report the value of the top_n parameter. - Implements unified and more robust logic for extracting scores from ranked docs. This logic also implements a friendlier failure mode when the number of ranked docs returned by the rerank service does not match what is expected, which can happen when a top_n parameter value is applied, but not properly reported. TextSimilarityRankFeaturePhaseRankCoordinatorContext and its tests have also been simplified. With the unified score extraction logic, we no longer need to resolve chunk scoring configuration while computing scores, so that logic (and its related tests) have been removed. (cherry picked from commit 628c78d) # Conflicts: # x-pack/plugin/inference/qa/test-service-plugin/src/main/java/org/elasticsearch/xpack/inference/mock/TestRerankingServiceExtension.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferenceFeatures.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/rank/textsimilarity/TextSimilarityRankFeaturePhaseRankCoordinatorContext.java # x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/rank/textsimilarity/TextSimilarityRankFeaturePhaseRankCoordinatorContextTests.java # x-pack/plugin/inference/src/yamlRestTest/resources/rest-api-spec/test/inference/70_text_similarity_rank_retriever.yml * [CI] Auto commit changes from spotless --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
…arity reranker (#142039) (#142317) * Implement comprehensive top N parameter handling for text similarity reranker (#142039) Currently, the text similarity reranker has, at best, spotty support for rerank services that can have a top_n (or similar) parameter set. This parameter configures the rerank service to return, at most, N reranked documents. When N is less than rank_window_size, it can lead to ArrayIndexOutOfBoundsExceptions in the current implementation. This is why we check that N is greater than or equal to rank_window_size, however the problem is that that the current check misses a bunch of rerank services that support top_n. This PR addresses the problem by implementing comprehensive support for the top_n parameter: - Adds the TopNProvider interface, a unified way to report the value of the top_n parameter. - Implements unified and more robust logic for extracting scores from ranked docs. This logic also implements a friendlier failure mode when the number of ranked docs returned by the rerank service does not match what is expected, which can happen when a top_n parameter value is applied, but not properly reported. TextSimilarityRankFeaturePhaseRankCoordinatorContext and its tests have also been simplified. With the unified score extraction logic, we no longer need to resolve chunk scoring configuration while computing scores, so that logic (and its related tests) have been removed. (cherry picked from commit 628c78d) # Conflicts: # x-pack/plugin/inference/qa/test-service-plugin/src/main/java/org/elasticsearch/xpack/inference/mock/TestRerankingServiceExtension.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferenceFeatures.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/rank/textsimilarity/TextSimilarityRankFeaturePhaseRankCoordinatorContext.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/openshiftai/rerank/OpenShiftAiRerankTaskSettings.java # x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/rank/textsimilarity/TextSimilarityRankFeaturePhaseRankCoordinatorContextTests.java # x-pack/plugin/inference/src/yamlRestTest/resources/rest-api-spec/test/inference/70_text_similarity_rank_retriever.yml * [CI] Auto commit changes from spotless --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
…reranker (elastic#142039) Currently, the text similarity reranker has, at best, spotty support for rerank services that can have a top_n (or similar) parameter set. This parameter configures the rerank service to return, at most, N reranked documents. When N is less than rank_window_size, it can lead to ArrayIndexOutOfBoundsExceptions in the current implementation. This is why we check that N is greater than or equal to rank_window_size, however the problem is that that the current check misses a bunch of rerank services that support top_n. This PR addresses the problem by implementing comprehensive support for the top_n parameter: - Adds the TopNProvider interface, a unified way to report the value of the top_n parameter. - Implements unified and more robust logic for extracting scores from ranked docs. This logic also implements a friendlier failure mode when the number of ranked docs returned by the rerank service does not match what is expected, which can happen when a top_n parameter value is applied, but not properly reported. TextSimilarityRankFeaturePhaseRankCoordinatorContext and its tests have also been simplified. With the unified score extraction logic, we no longer need to resolve chunk scoring configuration while computing scores, so that logic (and its related tests) have been removed.
Currently, the text similarity reranker has, at best, spotty support for rerank services that can have a
top_n(or similar) parameter set. This parameter configures the rerank service to return, at most,Nreranked documents. WhenNis less thanrank_window_size, it can lead toArrayIndexOutOfBoundsExceptions in the current implementation. This is why we check thatNis greater than or equal torank_window_size, however the problem is that that the current check misses a bunch of rerank services that supporttop_n.This PR addresses the problem by implementing comprehensive support for the
top_nparameter:TopNProviderinterface, a unified way to report the value of thetop_nparameter.top_nparameter value is applied, but not properly reported.TextSimilarityRankFeaturePhaseRankCoordinatorContextand its tests have also been simplified. With the unified score extraction logic, we no longer need to resolve chunk scoring configuration while computing scores, so that logic (and its related tests) have been removed.