Weights in simplified RRF retriever cleanup#135040
Conversation
|
Pinging @elastic/search-relevance (Team:Search - Relevance) |
| "Basic per-field boosting using the simplified format": | ||
| - requires: | ||
| cluster_features: ["rrf_retriever.simplified_weighted_support"] | ||
| reason: "Simplified weighted fields syntax support" | ||
|
|
||
| - do: | ||
| search: | ||
| index: test-index | ||
| body: | ||
| retriever: | ||
| rrf: | ||
| fields: [ "text_1", "text_2^2" ] | ||
| query: "foo" | ||
|
|
||
| # With weighted fields, verify basic functionality | ||
| - gte: { hits.total.value: 1 } | ||
| - length: { hits.hits: 1 } | ||
| # Verify that text_2^2 affects ranking (basic smoke test) |
There was a problem hiding this comment.
This test is redundant, as we already have a test that applies per-field boosts to lexical matches and checks for changes in result order. Additionally, the assertions performed in this test do not meaningfully check that the boost actually was applied.
| "Semantic field weighting": | ||
| - requires: | ||
| cluster_features: ["rrf_retriever.simplified_weighted_support"] | ||
| reason: "Simplified weighted fields syntax support" | ||
|
|
||
| - do: | ||
| search: | ||
| index: test-index | ||
| body: | ||
| retriever: | ||
| rrf: | ||
| fields: ["dense_inference^2", "sparse_inference^1.5"] | ||
| query: "elasticsearch" | ||
|
|
||
| - match: { hits.total.value: 3 } | ||
| - length: { hits.hits: 3 } |
There was a problem hiding this comment.
Same as https://github.com/elastic/elasticsearch/pull/135040/files#r2361285845, but for boosts on semantic matches.
| "Zero weight handling": | ||
| - requires: | ||
| cluster_features: ["rrf_retriever.simplified_weighted_support"] | ||
| reason: "Simplified weighted fields syntax support" | ||
|
|
||
| - do: | ||
| search: | ||
| index: test-index | ||
| body: | ||
| retriever: | ||
| rrf: | ||
| fields: ["text_1^0", "text_2^1"] | ||
| query: "foo" | ||
|
|
||
| - gte: { hits.total.value: 1 } |
There was a problem hiding this comment.
We already have coverage for this in the rewrite tests
There was a problem hiding this comment.
RRF scores are generally not meaningful, hence why the existing tests did not check them. Moreover, the way they were being checked did not add any real value.
|
@elasticmachine update branch |
kderusso
left a comment
There was a problem hiding this comment.
Test looks good to me. The one piece of feedback that I have for the unit tests, is that it would be really helpful to add comments to the tests explaining why the scores are expected. Just to make it a little readable and parseable. Maybe that's something @mridula-s109 could help with in a followup if we agree it would be useful?
Good point - adding comments to explain the expected scores would definitely make the tests more readable. I’m happy to take that up in a follow-up PR if everyone agrees. |
Do you mean the per-field boosts? We don't check document scores in the unit tests. |
Sorry, yes, e.g. quickly explain why we would expect a boost of 3.75 on a combined field |
Cleans up tests added in #132680. The rewrite tests in
LinearRetrieverBuilderTestsandRRFRetrieverBuilderTestsare now aligned, testing the same scenarios in the same order. This should make these tests easier to maintain. The YAML tests have been adjusted to remove redundant tests and score checks that did not add value.