Skip to content

[ES|QL] pushing down eval expression when it requires data access #136610

Merged
pmpailis merged 33 commits intoelastic:mainfrom
pmpailis:fix_133462_fixing_score_function_error_no_shards
Oct 21, 2025
Merged

[ES|QL] pushing down eval expression when it requires data access #136610
pmpailis merged 33 commits intoelastic:mainfrom
pmpailis:fix_133462_fixing_score_function_error_no_shards

Conversation

@pmpailis
Copy link
Contributor

@pmpailis pmpailis commented Oct 15, 2025

We want to ensure that the SCORE function will be executed on data nodes even when dealing with eval FUNCTION. To that end, in this PR we update the PushDownAndCombineLimits rule so that LIMIT is not pushed further from an EVAL expression if:

  • any of the expressions is SCORE
  • rule is applied on the coordinator node

By doing so, the eval is pushed to the data node, where the check above does not hold, so that the limit is pushed and reaches the underlying LuceneOperator.

Closes #133462.

@pmpailis pmpailis changed the title [ES|QL] pushing down score function Oct 16, 2025
@pmpailis pmpailis added >non-issue :Search Relevance/ES|QL Search functionality in ES|QL >bug and removed >non-issue labels Oct 17, 2025
@elasticsearchmachine
Copy link
Collaborator

Hi @pmpailis, I've created a changelog YAML for you.

…thub.com:pmpailis/elasticsearch into fix_133462_fixing_score_function_error_no_shards
@pmpailis pmpailis marked this pull request as ready for review October 17, 2025 10:32
@elasticsearchmachine elasticsearchmachine added the Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch label Oct 17, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

required_capability: pushing_down_eval_with_score

FROM books
| EVAL title_match = SCORE(MATCH(title, "rings"))
Copy link
Contributor Author

@pmpailis pmpailis Oct 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also works with the fork command, e.g.

FROM index 
| FORK 
        (EVAL score1 = SCORE(MATCH(TITLE, "LOTR"))) 
        (EVAL score2 = SCORE(MATCH(TITLE, "ring")))

Should I add an explicit test-case for fork as well?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure that's necessary, as other functions are not explicitly tested with FORK. @ioanatia do you think this is necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Carlos! Will proceed with merging and we can add additional tests in a follow up if needed!

Copy link
Member

@carlosdelest carlosdelest left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. A couple of comments on testing but looks 💯


FROM books
| EVAL title_match = SCORE(MATCH(title, "rings"))
| LIMIT 2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this test is reproducible if we don't have a SORT 🤔 , as the order is undefined and we may get some score back...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeap, fair point. Tried this with 1000 iterations and got no failures but will change the query term to aardvark to make sure!

assertThat(query.limit(), is(l(42)));
assertThat(query.query(), is(unscore(existsQuery("last_name"))));
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be interesting to add a test case for using the EVAL result in the Top N, something like:

            FROM test
            | EVAL s = SCORE(MATCH(first_name, "foo"))
            | WHERE last_name IS NOT NULL
            | SORT s
            | LIMIT 42
            | SORT last_name
            | LIMIT 2
* This is used to prevent pushing down limits past operations that need to evaluate expressions using document data.
*/
private boolean evalAliasNeedsData(Alias alias) {
ArrayDeque<Expression> exprStack = new ArrayDeque<>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit - you're using a Dequeue for traversing the tree. I think it would be more idiomatic if you do something like:

            Holder<Boolean> hasScore = new Holder<>(false);
            p.forEachExpression(expr -> {
                if (expr instanceof Score) {
                    hasScore.set(true);
                }
            });
            return hasScore.get();
Copy link
Contributor Author

@pmpailis pmpailis Oct 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeap, makes sense! Will update as suggested !

@pmpailis pmpailis merged commit 188e6cd into elastic:main Oct 21, 2025
34 checks passed
chrisparrinello pushed a commit to chrisparrinello/elasticsearch that referenced this pull request Oct 24, 2025
fzowl pushed a commit to voyage-ai/elasticsearch that referenced this pull request Nov 3, 2025
@pmpailis pmpailis deleted the fix_133462_fixing_score_function_error_no_shards branch January 16, 2026 13:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>bug :Search Relevance/ES|QL Search functionality in ES|QL Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch v9.3.0

3 participants