Simplified Linear Retriever by Mikep86 · Pull Request #129200 · elastic/elasticsearch

Mikep86 · 2025-06-10T13:42:49Z

Adds a simplified syntax for the linear retriever:

GET my-index/_search
{
  "retriever": {
    "linear": {
      "fields": ["field_1", "field_2^2"],
      "query": "my awesome query",
      "normalizer": "minmax"
    }
  }
}

The fields parameter is optional. If it is not provided, we query the fields defined by the index.query.default_field index setting (which is * by default).
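As a rough illustration of the fields syntax above, the following Python sketch splits entries such as "field_2^2" into a field name and a per-field weight, and falls back to a default when fields is absent. It is not the PR's Java parsing code: the function name parse_field_weights, the DEFAULT_FIELDS constant, and the choice of 1.0 as the weight for entries without a boost are assumptions made for this example.

```python
from typing import Dict, List, Optional

# Assumed stand-in for the index.query.default_field setting ("*" by default).
DEFAULT_FIELDS = ["*"]


def parse_field_weights(fields: Optional[List[str]]) -> Dict[str, float]:
    """Map each 'name' or 'name^boost' entry to a weight.

    Entries without a '^boost' suffix get an assumed weight of 1.0.
    A missing or empty list falls back to DEFAULT_FIELDS.
    """
    weights: Dict[str, float] = {}
    for spec in (fields or DEFAULT_FIELDS):
        # rpartition returns ("", "", spec) when there is no "^" in spec.
        name, sep, boost = spec.rpartition("^")
        if not sep:
            name, boost = spec, "1"
        weights[name] = float(boost)
    return weights


weights = parse_field_weights(["field_1", "field_2^2"])
default_weights = parse_field_weights(None)
```

With the request from the description, field_2 ends up with twice the weight of field_1; with no fields, the sketch queries the single default pattern.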

This syntax automatically handles querying a mix of lexical fields (i.e. fields that support lexical search via match) and semantic_text fields. The fields are divided into lexical and semantic groups to create a 50/50 weight distribution between the two in the final score. This is achieved by creating a retriever tree that looks like:

linear
   multi_match on lexical fields
   linear
     match on semantic_text field A
     match on semantic_text field B
     match on semantic_text field C

The end result is a score that ranges from 0 to 2, with up to 1 coming from the lexical matches and up to 1 coming from the semantic matches.
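To make the 0 to 2 range concrete, here is a small Python sketch of min-max normalization applied to each group, followed by an unweighted sum. It is a simplification and not the retriever's actual code: it treats the semantic group as a single score per document (in the PR that group is itself a nested linear retriever), and the handling of the case where all scores in a group are equal is an assumption of this example.

```python
from typing import Dict


def minmax(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] with (s - min) / (max - min)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # Assumption for this sketch: identical scores all normalize to 1.0.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def combine(lexical: Dict[str, float], semantic: Dict[str, float]) -> Dict[str, float]:
    """Sum the normalized lexical and semantic scores per document.

    Each group contributes at most 1.0, so the total lies in [0, 2].
    A document missing from one group contributes 0.0 for that group.
    """
    lex, sem = minmax(lexical), minmax(semantic)
    return {doc: lex.get(doc, 0.0) + sem.get(doc, 0.0) for doc in lex.keys() | sem.keys()}


combined = combine(
    {"a": 10.0, "b": 5.0, "c": 0.0},  # hypothetical raw lexical scores
    {"a": 0.2, "b": 0.8},             # hypothetical raw semantic scores
)
```

In this made-up example, document b ranks first because it scores reasonably in both groups, even though a has the highest lexical score.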

Common logic for generating the retriever tree is in SimplifiedInnerRetrieverUtils, which will also be used by the simplified rrf retriever (see #128633 for a preview of that).
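The grouping described above can be sketched in a few lines of Python. This is only a schematic mirror of the tree shown in the description, not the Java logic in SimplifiedInnerRetrieverUtils and not valid Elasticsearch query DSL: the function name build_retriever_tree, the semantic_fields argument (standing in for a mapping lookup), and the dictionary layout are all assumptions, and per-field boosts are left out for brevity.

```python
from typing import Any, Dict, List, Set


def build_retriever_tree(fields: List[str], query: str, semantic_fields: Set[str]) -> Dict[str, Any]:
    """Split fields into lexical and semantic groups and nest them.

    The outer 'linear' node has at most two children, one per group,
    which is what gives each group an equal share of the final score.
    """
    lexical = [f for f in fields if f not in semantic_fields]
    semantic = [f for f in fields if f in semantic_fields]

    children: List[Dict[str, Any]] = []
    if lexical:
        # One multi_match over all lexical fields.
        children.append({"multi_match": {"query": query, "fields": lexical}})
    if semantic:
        # One match per semantic_text field, grouped under an inner linear node.
        children.append({"linear": [{"match": {f: query}} for f in semantic]})
    return {"linear": children}


tree = build_retriever_tree(
    ["title", "body", "sem_a", "sem_b"],  # hypothetical field names
    "my awesome query",
    semantic_fields={"sem_a", "sem_b"},
)
```

Keeping the grouping in one helper is what lets a second retriever type reuse it, which matches the stated plan to share the logic with the simplified rrf retriever.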

elasticsearchmachine · 2025-06-10T13:43:19Z

Hi @Mikep86, I've created a changelog YAML for you.

elasticsearchmachine · 2025-06-10T13:43:19Z

Pinging @elastic/es-search-relevance (Team:Search Relevance)

elasticsearchmachine · 2025-06-10T13:43:19Z

Pinging @elastic/search-eng (Team:SearchOrg)

elasticsearchmachine · 2025-06-10T13:43:20Z

Pinging @elastic/search-relevance (Team:Search - Relevance)

server/src/main/java/org/elasticsearch/search/retriever/CompoundRetrieverBuilder.java

...lugin/rank-rrf/src/main/java/org/elasticsearch/xpack/rank/linear/LinearRetrieverBuilder.java

pmpailis · 2025-06-10T15:16:18Z

...lugin/rank-rrf/src/main/java/org/elasticsearch/xpack/rank/linear/LinearRetrieverBuilder.java

+        RetrieverBuilder rewritten = this;
+
+        ResolvedIndices resolvedIndices = ctx.getResolvedIndices();
+        if (resolvedIndices != null && query != null) {


pmpailis: We can still add a global prefilter on the top-level retriever, right? Should we also account for it when generating the expansions, or is this something that will not be supported?

Mikep86: That should work as is, since rewriting with pre-filters happens after this rewrite logic. I can add a test to confirm.

Mikep86: See b891166 and fc57b61.

A couple of things to follow up on here:

There is a whole class of bugs in the retriever framework right now caused by incomplete copies during rewrite. We should adopt a copy constructor approach (similar to how queries handle copy generation) to address all of these issues. This is outside the scope of this PR though, so I did the quick fix for now.

I would have loved to add a unit test that has more access to the retriever structure to verify that the filters are being propagated as pre-filters, but that is difficult to do right now with the current rewrite logic. We rewrite all the way to a RankDocsRetrieverBuilder, with no stopping cue for when all the pre-search rewrite activities (such as pre-filter propagation) are complete. It would help a lot with testability if we could refactor the rewrite process to add such a stopping cue. In the meantime, I added a YAML test to cover this, but that doesn't give us visibility into whether the filters are applied as actual pre-filters.
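The "incomplete copy during rewrite" pattern mentioned in this thread can be illustrated with a toy Python model. None of this is Elasticsearch code: the Retriever class, its fields, and both rewrite functions are hypothetical, and dataclasses.replace merely plays the role that a copy constructor would play in Java.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Retriever:
    """Hypothetical, minimal stand-in for a retriever builder."""
    kind: str
    children: Tuple["Retriever", ...] = ()
    pre_filters: Tuple[str, ...] = ()
    min_score: float = 0.0


def rewrite_lossy(r: Retriever, new_children: Tuple[Retriever, ...]) -> Retriever:
    # Bug pattern: a fresh object is built field by field, and any field
    # the author forgot (here pre_filters and min_score) is silently reset.
    return Retriever(kind=r.kind, children=new_children)


def rewrite_copying(r: Retriever, new_children: Tuple[Retriever, ...]) -> Retriever:
    # Copy-based approach: start from a full copy and override only what changed,
    # so fields added to the class later are carried over automatically.
    return replace(r, children=new_children)


original = Retriever("linear", pre_filters=("status:published",), min_score=0.5)
expanded = (Retriever("multi_match"), Retriever("linear"))

lossy = rewrite_lossy(original, expanded)
copied = rewrite_copying(original, expanded)
```

Here lossy has lost both the pre-filter and the minimum score, while copied keeps them, which is the class of bug a copy-based rewrite would rule out by construction.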

...rrf/src/main/java/org/elasticsearch/xpack/rank/simplified/SimplifiedInnerRetrieverUtils.java

pmpailis · 2025-06-10T15:20:14Z

Just finished a first pass, seems really nice :) Will we also include doc changes & additional examples in retriever_examples to showcase this?

kderusso

LGTM, some minor feedback

...lugin/rank-rrf/src/main/java/org/elasticsearch/xpack/rank/linear/LinearRetrieverBuilder.java

...-rrf/src/yamlRestTest/resources/rest-api-spec/test/linear/20_linear_retriever_simplified.yml

Mikep86 · 2025-06-10T17:43:23Z

@pmpailis

"Will we also include doc changes & additional examples in retriever_examples to showcase this?"

Yes, all that will come in a follow-up docs-focused PR :)

Samiul-TheSoccerFan

nice work 👏

jimczi

I like the approach, @Mikep86, it’s less invasive than I expected! 😉
I do have some concerns about the growing number of parameters in the linear retriever that aren't compatible with each other, but that's an inherent trade-off with the decision to add this functionality there, so I’m comfortable with it.

...rrf/src/main/java/org/elasticsearch/xpack/rank/simplified/SimplifiedInnerRetrieverUtils.java

jimczi · 2025-06-11T08:02:06Z

...-rrf/src/yamlRestTest/resources/rest-api-spec/test/linear/20_linear_retriever_simplified.yml

+  - match: { hits.hits.1._id: "2" }
+  - lte: { hits.hits.1._score: 2.0 }
+  - match: { hits.hits.2._id: "1" }
+  - lte: { hits.hits.2._score: 2.0 }


jimczi: Can we also add a test with a filter? We need to make sure that the filter is propagated correctly.

Mikep86: Yep, working on a unit test to verify pre-filter propagation now.

Mikep86: See #129200 (comment)
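For readers who want to try the filter case discussed here, the Python sketch below builds a request body that combines the simplified syntax with a top-level filter. The helper name, the status field, and the placement of filter next to fields and query are assumptions made for illustration; the YAML test added in this PR is the authoritative reference for the exact shape.

```python
import json
from typing import Any, Dict, List, Optional


def simplified_linear_request(
    query: str,
    fields: List[str],
    filters: Optional[Dict[str, Any]] = None,
    normalizer: str = "minmax",
) -> Dict[str, Any]:
    """Build a _search body using the simplified linear retriever syntax."""
    linear: Dict[str, Any] = {"fields": fields, "query": query, "normalizer": normalizer}
    if filters is not None:
        # Assumed placement: a filter on the top-level retriever, which the
        # rewrite is expected to propagate to the generated inner retrievers.
        linear["filter"] = filters
    return {"retriever": {"linear": linear}}


body = simplified_linear_request(
    "my awesome query",
    ["field_1", "field_2^2"],
    filters={"term": {"status": "published"}},  # hypothetical field and value
)
print(json.dumps(body, indent=2))
```

A response to such a request can only show that filtered-out documents are absent; as noted in the thread, it cannot show whether the filter ran as a true pre-filter.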

...rf/src/test/java/org/elasticsearch/xpack/rank/linear/LinearRetrieverBuilderParsingTests.java

pmpailis

Awesome work @Mikep86 ❤️ Only minor comment is the prefilter tests addition, but other than that it looks really nice.

jimczi

LGTM

Samiul-TheSoccerFan · 2025-06-13T16:10:45Z

docs/changelog/129200.yaml

@@ -0,0 +1,5 @@
+pr: 129200
+summary: Simplified Linear Retriever


Samiul-TheSoccerFan: Does this need to be updated? Probably something like "Add simplified syntax and hybrid support to linear retriever".

Mikep86: I think this description is still succinct and accurate, so it is good as is.

elasticsearchmachine · 2025-06-16T17:27:06Z

💔 Backport failed

Status	Branch	Result
❌	8.19	Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 129200

Mikep86 · 2025-06-17T15:00:09Z

💚 All backports created successfully

Status	Branch	Result
✅	8.19

Questions? Please refer to the Backport tool documentation.

(cherry picked from commit fc77640)

# Conflicts:
#   x-pack/plugin/inference/build.gradle
#   x-pack/plugin/rank-rrf/src/main/java/org/elasticsearch/xpack/rank/RankRRFFeatures.java
#   x-pack/plugin/rank-rrf/src/main/java/org/elasticsearch/xpack/rank/rrf/RRFRankBuilder.java

Mikep86 added 2 commits June 10, 2025 08:58

Simplified linear retriever

6da6bb2

Simplified retriever format feature

896f9ba

Mikep86 requested review from jimczi, kderusso and pmpailis June 10, 2025 13:42

Mikep86 added the >enhancement, auto-backport, :SearchOrg/Relevance, :Search Relevance/Search, v8.19.0, and v9.1.0 labels Jun 10, 2025

elasticsearchmachine added the Team:SearchOrg, Team:Search Relevance, and Team:Search - Relevance labels Jun 10, 2025

Update docs/changelog/129200.yaml

4e88234

Fix changelog

cee5765

Mikep86 mentioned this pull request Jun 10, 2025

Simplified linear and RRF retrievers #128633

Closed

pmpailis reviewed Jun 10, 2025

kderusso approved these changes Jun 10, 2025

Samiul-TheSoccerFan approved these changes Jun 10, 2025

Centralized logic for creating a RetrieverSource from a RetrieverBuilder

3f7870a

jimczi reviewed Jun 11, 2025

pmpailis approved these changes Jun 11, 2025

Copy pre-filters during linear retriever rewrite

b891166

Mikep86 added 12 commits June 11, 2025 14:01

Added filter propagation test

fc57b61

Remove TODO

ad55a8f

Rename fields

05080f9

Added sparse vector query test

51eeaaf

Added index alias test

2c8085d

Renamed SimplifiedInnerRetrieverUtils to MultiFieldsInnerRetrieverUtils

773c33a

Merge branch 'main' into simplified-linear-retriever

6ff461d

Renamed and moved cluster feature

1c36d33

Removed references to simplified query format from error messages

bb2807c

Merge branch 'main' into simplified-linear-retriever

030968a

Added javadocs for MultiFieldsInnerRetrieverUtils

5717076

Added comment to LinearRetrieverBuilderParsingTests

caf0742

Mikep86 requested a review from jimczi June 12, 2025 20:56

jimczi approved these changes Jun 12, 2025

Samiul-TheSoccerFan reviewed Jun 13, 2025

Mikep86 merged commit fc77640 into elastic:main Jun 16, 2025
18 checks passed

elasticsearchmachine added the backport pending label Jun 16, 2025

Mikep86 mentioned this pull request Jun 17, 2025

[8.19] Simplified Linear Retriever (#129200) #129563

Merged

Mikep86 mentioned this pull request Jun 18, 2025

Simplified RRF Retriever #129659

Merged

Mikep86 removed the backport pending label Jun 23, 2025

Mikep86 merged 18 commits into elastic:main from Mikep86:simplified-linear-retriever
