ES|QL: Pass fixed size instead of maxPageSize to LuceneTopNOperator scorer #135767
Merged
ioanatia merged 5 commits into elastic:main on Oct 2, 2025
Conversation
Collaborator
Pinging @elastic/es-search-relevance (Team:Search Relevance)
nik9000 reviewed Oct 2, 2025
```diff
  }
  var leafCollector = perShardCollector.getLeafCollector(scorer.leafReaderContext());
- scorer.scoreNextRange(leafCollector, scorer.leafReaderContext().reader().getLiveDocs(), maxPageSize);
+ scorer.scoreNextRange(leafCollector, scorer.leafReaderContext().reader().getLiveDocs(), NUM_DOCS_INTERVAL);
```
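For context, a minimal sketch of how such a constant might be declared; the name matches the diff and the 4096 value comes from the PR description, but the exact placement inside the operator is an assumption:

```java
// Assumed sketch: a fixed per-call scoring interval, decoupled from
// maxPageSize. 4096 is the batch size quoted in the PR description.
private static final int NUM_DOCS_INTERVAL = 4096;
```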
Member
Do we need to lock this to the max of the range?
It looks like CancellableBulkScorer makes this bigger and bigger with time. But I think this is good and we can get it in and iterate.
Contributor
Author
We do something similar here; at least we try to avoid overflows.
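For illustration, a hedged sketch of that kind of overflow guard, alongside the growing-window behavior nik9000 mentions; the method names and values here are assumptions, not the actual Elasticsearch code:

```java
// Illustrative only: compute the end of the next scoring window without
// letting `current + interval` wrap around Integer.MAX_VALUE.
static int scoreUpTo(int current, int interval, int maxDoc) {
    // Widen to long before adding so the sum cannot overflow int.
    return (int) Math.min((long) current + interval, (long) maxDoc);
}

// Growing-window idea (as described for CancellableBulkScorer): start
// small and double the interval each round, capped at some maximum.
static int nextInterval(int interval, int maxInterval) {
    return Math.min(interval * 2, maxInterval);
}
```

Note that `interval * 2` itself cannot overflow as long as the cap stays below `Integer.MAX_VALUE / 2`, which is the usual reason such caps exist.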
kderusso approved these changes Oct 2, 2025
Member
kderusso left a comment
LGTM, but consider labeling with >bug and creating a release note entry as this would be impacting serverless performance?
Collaborator
Hi @ioanatia, I've created a changelog YAML for you.
nik9000 approved these changes Oct 2, 2025
ioanatia added a commit to ioanatia/elasticsearch that referenced this pull request on Oct 2, 2025
Collaborator
💚 Backport successful
elasticsearchmachine pushed a commit that referenced this pull request on Oct 3, 2025
ioanatia added a commit to ioanatia/elasticsearch that referenced this pull request on Oct 8, 2025
ioanatia added a commit that referenced this pull request on Oct 10, 2025
Initially we passed `scorer.leafReaderContext().reader().maxDoc()` to the bulk scorer, which showed great improvements. However, this also keeps the current thread busy until it has scored all the docs from the current reader.

There are ways we can improve this, but they would require more changes. For now I wanted to push a quick fix to mitigate some of the regressions we see, with `maxPageSize` becoming smaller after #108412. We should still see a significant performance boost just by scoring docs in fixed batches of 4096 docs instead of `maxPageSize`.
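Putting the two approaches side by side (a sketch only; `liveDocs` abbreviates the `getLiveDocs()` call from the diff):

```java
// Before (experimental): score everything in the reader in one call.
// Great throughput, but the thread stays busy until all maxDoc docs
// in the current reader are scored.
scorer.scoreNextRange(leafCollector, liveDocs, scorer.leafReaderContext().reader().maxDoc());

// After (this PR): score a fixed 4096-doc batch per call, so the
// operator yields back to its caller between batches.
scorer.scoreNextRange(leafCollector, liveDocs, NUM_DOCS_INTERVAL);
```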