Speed up sorts on secondary sort fields by romseygeek · Pull Request #137533 · elastic/elasticsearch

romseygeek · 2025-11-03T16:34:25Z

This adds a competitive iterator implementation that will take advantage of doc
value skippers in the case that:

* the index is sorted by a low cardinality field like hostname, and then by a high cardinality
  field like timestamp
* skippers are enabled on both of these fields
* the query is sorted by the high cardinality field.

To be able to plug this new implementation into the Lucene sort architecture, we need
to fork NumericComparator and some associated classes. LongValuesComparatorSource
now returns the forked version with the new competitive iterator builder.
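
To illustrate the idea behind the three conditions above, here is a minimal sketch in plain Java with no Lucene dependency. It assumes a toy segment in which doc `i` has `primary[i]` and `secondary[i]`, and docs are ordered by (primary, secondary). The class name `SecondarySortSketch`, its methods, and the array model are all hypothetical; the iterator in this PR is described as using doc value skippers on the primary field to find block boundaries, which this sketch replaces with a linear scan.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (hypothetical): docs are ordered by (primary, secondary). */
final class SecondarySortSketch {

    /** Returns doc IDs whose secondary value lies in [lo, hi], one primary block at a time. */
    static List<Integer> competitiveDocs(int[] primary, long[] secondary, long lo, long hi) {
        List<Integer> hits = new ArrayList<>();
        int doc = 0;
        while (doc < primary.length) {
            // Find the end of the current primary block. This sketch scans linearly;
            // a skipper-based implementation would avoid touching every doc here.
            int blockEnd = doc;
            while (blockEnd < primary.length && primary[blockEnd] == primary[doc]) {
                blockEnd++;
            }
            // Within one block the secondary values are sorted, so binary search applies.
            int from = lowerBound(secondary, doc, blockEnd, lo);
            int to = upperBound(secondary, from, blockEnd, hi);
            for (int d = from; d < to; d++) {
                hits.add(d);
            }
            doc = blockEnd; // jump straight to the next primary block
        }
        return hits;
    }

    /** First index in [from, to) with a[index] >= key, or to if there is none. */
    static int lowerBound(long[] a, int from, int to, long key) {
        int lo = from, hi = to;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (a[mid] < key) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    /** First index in [from, to) with a[index] > key, or to if there is none. */
    static int upperBound(long[] a, int from, int to, long key) {
        int lo = from, hi = to;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (a[mid] <= key) lo = mid + 1; else hi = mid;
        }
        return lo;
    }
}
```

For `primary = {0, 0, 0, 1, 1, 1, 2, 2}` and `secondary = {10, 20, 30, 5, 20, 40, 20, 25}`, a competitive range of [20, 30] yields docs 1, 2, 4, 6 and 7.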

… with the recorded bottom. When constructing a new CompetitiveDISIBuilder, check whether the global min/max points or the global min/max doc values skipper are competitive with the bottom. If not, update competitiveIterator with an empty iterator, because no documents in the current segment will have a value that is competitive with the currently recorded bottom. Doing this at CompetitiveDISIBuilder construction is cheap and allows pruning immediately, instead of waiting until doUpdateCompetitiveIterator(...) is invoked.
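
The construction-time check described in that commit message can be sketched as follows. This is an illustration under stated assumptions, not the Lucene or Elasticsearch code: `BottomPruningSketch`, `segmentMayCompete` and `initialIterator` are hypothetical names, the segment's global min and max are passed in as plain longs, and ties with the bottom are conservatively treated as competitive.

```java
import java.util.PrimitiveIterator;
import java.util.stream.IntStream;

/** Hypothetical sketch of pruning a whole segment against the recorded bottom. */
final class BottomPruningSketch {

    /**
     * Conservative check: can any value in [segmentMin, segmentMax] still enter the queue?
     * For an ascending sort (reverse == false) only values <= bottom can compete;
     * for a descending sort only values >= bottom can.
     */
    static boolean segmentMayCompete(long segmentMin, long segmentMax, long bottom, boolean reverse) {
        return reverse ? segmentMax >= bottom : segmentMin <= bottom;
    }

    /** Picks the starting iterator: empty when the segment cannot compete, all docs otherwise. */
    static PrimitiveIterator.OfInt initialIterator(int maxDoc, long segmentMin, long segmentMax,
                                                   long bottom, boolean reverse) {
        if (!segmentMayCompete(segmentMin, segmentMax, bottom, reverse)) {
            return IntStream.empty().iterator(); // prune the segment up front
        }
        return IntStream.range(0, maxDoc).iterator();
    }
}
```

The point of doing the comparison here is that it costs two comparisons per segment, whereas waiting for the first iterator update means collecting at least some documents that can never be competitive.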

elasticsearchmachine · 2025-11-03T16:34:50Z

Pinging @elastic/es-search (Team:Search)

elasticsearchmachine · 2025-11-03T16:34:50Z

Hi @romseygeek, I've created a changelog YAML for you.

romseygeek · 2025-11-06T11:00:29Z

Buildkite benchmark this with elastic/logs please

romseygeek · 2025-11-06T11:07:22Z

Buildkite benchmark this with elastic-logs please

elasticmachine · 2025-11-06T11:09:23Z

💚 Build Succeeded

Buildkite Build
Commit: 2dc921e
Baseline: ae77575 (env ID bf8d8c02-dcd8-4921-86f0-c4cd34373423)
Contender: 2dc921e (env ID 58424f1c-bda9-42df-bd42-118623021817)
Benchmark results

This build ran two elastic-logs benchmarks to evaluate performance impact of this PR.

History

💔 Build #41 failed 2dc921e

cc @romseygeek

martijnvg

Nice work 👍. I left a few comments, but otherwise LGTM.

martijnvg · 2025-11-06T13:06:23Z

...r/src/main/java/org/elasticsearch/index/fielddata/fieldcomparator/SecondarySortIterator.java

+ * range, using DocValueSkippers on the primary sort field to advance rapidly
+ * to the next block of values.
+ */
+class SecondarySortIterator extends DocIdSetIterator {


Maybe add a unit test for this iterator? Maybe we can do a test duel against DocValuesRangeIterator wrapped in an iterator?
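
A duel test of the kind suggested here could look roughly like the sketch below. It is plain Java with no Lucene dependency, so it does not use DocValuesRangeIterator; a brute-force scan stands in as the reference, and a simplified block-jumping search stands in for the iterator under test. `IteratorDuelSketch` and all of its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

/** Hypothetical duel: a fast candidate must agree with a brute-force reference. */
final class IteratorDuelSketch {

    /** Reference answer: check every doc. docs[i] = {primary, secondary}. */
    static List<Integer> bruteForce(long[][] docs, long lo, long hi) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < docs.length; i++) {
            if (docs[i][1] >= lo && docs[i][1] <= hi) hits.add(i);
        }
        return hits;
    }

    /** Candidate: walk primary blocks, binary search the secondary values inside each block. */
    static List<Integer> blockJumping(long[][] docs, long lo, long hi) {
        List<Integer> hits = new ArrayList<>();
        int doc = 0;
        while (doc < docs.length) {
            int end = doc;
            while (end < docs.length && docs[end][0] == docs[doc][0]) end++;
            int from = firstIndex(docs, doc, end, lo, false);
            int to = firstIndex(docs, from, end, hi, true);
            for (int d = from; d < to; d++) hits.add(d);
            doc = end;
        }
        return hits;
    }

    /** First index in [from, to) whose secondary is >= key, or > key when strict is true. */
    static int firstIndex(long[][] docs, int from, int to, long key, boolean strict) {
        int lo = from, hi = to;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            long v = docs[mid][1];
            boolean before = strict ? v <= key : v < key;
            if (before) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    /** Runs the duel on random data sorted by (primary, secondary); false on first mismatch. */
    static boolean duel(long seed, int rounds) {
        Random random = new Random(seed);
        for (int round = 0; round < rounds; round++) {
            long[][] docs = new long[random.nextInt(200)][];
            for (int i = 0; i < docs.length; i++) {
                docs[i] = new long[] { random.nextInt(5), random.nextInt(1000) };
            }
            Arrays.sort(docs, Comparator.<long[]>comparingLong(d -> d[0]).thenComparingLong(d -> d[1]));
            long lo = random.nextInt(1000);
            long hi = lo + random.nextInt(200);
            if (!bruteForce(docs, lo, hi).equals(blockJumping(docs, lo, hi))) return false;
        }
        return true;
    }
}
```

Using a fixed seed keeps a failing round reproducible, which is the main practical benefit of structuring the test this way.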

martijnvg · 2025-11-06T13:09:32Z

.../main/java/org/elasticsearch/index/fielddata/fieldcomparator/LongValuesComparatorSource.java

+                        }
+                        DocValuesSkipper skipper = context.reader().getDocValuesSkipper(field);
+                        DocValuesSkipper primaryFieldSkipper = context.reader().getDocValuesSkipper(sortFields[0].getField());
+                        if (primaryFieldSkipper == null || skipper.docCount() != maxDoc || primaryFieldSkipper.docCount() != maxDoc) {


If primaryFieldSkipper is null, the secondary sort field can be treated as the primary sort and we can go faster (like in SortedNumericDocValuesRangeQuery#getDocIdSetIteratorOrNullForPrimarySort). But I don't think this happens now, because super.buildCompetitiveDISIBuilder(context) will not detect this?

The same applies if the primary sort field has just one value.

If this is true, let's address it in a follow-up?
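
To make the reviewer's point concrete: when the sort field is effectively the primary sort for a segment, all of its values are in order, so the competitive range is one contiguous doc ID interval that two binary searches can locate. The sketch below is plain Java and is not the code of SortedNumericDocValuesRangeQuery; `PrimarySortRangeSketch`, `effectivelyPrimary` and `docRange` are hypothetical names, and the condition in `effectivelyPrimary` simply restates the two cases mentioned in the comment.

```java
/** Hypothetical sketch of the "treat it as primary sort" fast path. */
final class PrimarySortRangeSketch {

    /**
     * Restates the reviewer's two cases as an assumption: no skipper on the primary
     * field, or a primary field that holds a single value across the segment.
     */
    static boolean effectivelyPrimary(boolean primarySkipperPresent, long primaryMin, long primaryMax) {
        return !primarySkipperPresent || primaryMin == primaryMax;
    }

    /**
     * For values sorted ascending across the whole segment, returns
     * {firstDoc, lastDocExclusive} of the docs whose value lies in [lo, hi].
     */
    static int[] docRange(long[] sortedValues, long lo, long hi) {
        int from = firstIndex(sortedValues, lo, false);
        int to = firstIndex(sortedValues, hi, true);
        return new int[] { from, Math.max(from, to) };
    }

    /** First index whose value is >= key, or > key when strict is true. */
    private static int firstIndex(long[] a, long key, boolean strict) {
        int lo = 0, hi = a.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            boolean before = strict ? a[mid] <= key : a[mid] < key;
            if (before) lo = mid + 1; else hi = mid;
        }
        return lo;
    }
}
```

For values `{1, 3, 3, 5, 8, 13}` and the range [3, 8], `docRange` returns `{1, 5}`, that is docs 1 through 4.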

martijnvg · 2025-11-06T13:11:56Z

server/src/main/java/org/elasticsearch/lucene/comparators/XNumericComparator.java

+            competitiveDISIBuilder = buildCompetitiveDISIBuilder(context);
+        }
+
+        protected CompetitiveDISIBuilder buildCompetitiveDISIBuilder(LeafReaderContext context) throws IOException {


This is the bit we want to contribute upstream?

Yes, exactly

romseygeek and others added 13 commits October 29, 2025 09:45

WIP: pluggable competitiveDISIBuilder and secondary sort-based iterator

99d1db6

wip

45882c8

Force use of competitive comparators

f72677c

tests

4e704f1

Add competitive sort tests to MapperTestCase

ecea998

iter

842d14c

Merge branch 'sort/base-fieldmapper-sort-tests' into bug/sort-still-slow

3e1c2d2

Add competitive iterator check for logsdb-style timestamp field

b04481e

Merge remote-tracking branch 'origin/main' into bug/sort-still-slow

63b9329

Add indexSort method to IndexFieldData

f9f8191

Merge remote-tracking branch 'romseygeek/bug/sort-still-slow' into bu…

08cea96

…g/sort-still-slow

Merge remote-tracking branch 'origin/main' into bug/sort-still-slow

1eb03cd

romseygeek requested review from martijnvg and mayya-sharipova November 3, 2025 16:34

romseygeek self-assigned this Nov 3, 2025

romseygeek added the >enhancement, :Search/Search, and v9.3.0 labels Nov 3, 2025

elasticsearchmachine added the Team:Search label Nov 3, 2025

Update docs/changelog/137533.yaml

8603d58

elasticsearchmachine and others added 7 commits November 3, 2025 16:41

[CI] Auto commit changes from spotless

0595fff

spotless

3a933b4

Merge remote-tracking branch 'romseygeek/bug/sort-still-slow' into bu…

1548a2f

…g/sort-still-slow

tests

79bf9b2

Merge remote-tracking branch 'origin/main' into bug/sort-still-slow

32201c0

[CI] Auto commit changes from spotless

c3b270c

need to add iterators back in for pruning - follow up

dd1f994

romseygeek and others added 9 commits November 4, 2025 15:13

Merge remote-tracking branch 'romseygeek/bug/sort-still-slow' into bu…

0b79c60

…g/sort-still-slow

dont' advance past maxdoc

4fa6f8e

compilation

1249286

added logic from lucene pr

0e0e392

fencepost

cb9a7f5

Merge remote-tracking branch 'romseygeek/bug/sort-still-slow' into bu…

e32f535

…g/sort-still-slow

only run accelerator on dense fields

d0a2218

Merge remote-tracking branch 'origin/main' into bug/sort-still-slow

e603627

Revert subclassing nonsense

2dc921e

martijnvg approved these changes Nov 6, 2025

romseygeek added 6 commits November 6, 2025 15:49

Add unit test for SSI

25dbeb4

Merge branch 'main' into bug/sort-still-slow

fe98352

Merge branch 'main' into bug/sort-still-slow

8c495a5

Merge remote-tracking branch 'origin/main' into bug/sort-still-slow

7485e03

conflicts

3b6d308

Merge remote-tracking branch 'romseygeek/bug/sort-still-slow' into bu…

8460a14

…g/sort-still-slow

romseygeek enabled auto-merge (squash) November 9, 2025 15:38

romseygeek disabled auto-merge November 9, 2025 15:39

romseygeek merged commit 2236cbf into elastic:main Nov 9, 2025
33 of 34 checks passed

romseygeek merged 36 commits into elastic:main from romseygeek:bug/sort-still-slow
