Fixing sorted indices for GPU built indices #138138
mayya-sharipova merged 2 commits into elastic:main
mayya-sharipova commented on Nov 16, 2025
- For flush, vectors are now reordered according to sortMap before building the GPU index, ensuring that HNSW graph node ordinals match the sorted document order.
- Merge, on the other hand, doesn't require explicit sortMap handling, since Lucene's MergedVecto utilities apply docMaps internally.
- Enhanced tests with both approximate and exact KNN searches to validate sorting correctness.
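
To illustrate the flush-side reordering described above, here is a minimal sketch, not the code from this PR. It assumes a dense field (every document has a vector) and uses a plain `int[] newToOld` array as a stand-in for the sort map. The class name `SortedFlushSketch` and the method `reorder` are hypothetical. The real flush path also has to handle documents without vectors.

```java
import java.util.Arrays;

// Hypothetical sketch: arrange buffered vectors in sorted document order
// before handing them to an index builder, so ordinal i is the i-th sorted doc.
public class SortedFlushSketch {

    // newToOld[newOrd] gives the position the vector had in the unsorted buffer.
    // Assumption: dense field, so ordinals and doc IDs coincide.
    static float[][] reorder(float[][] vectors, int[] newToOld) {
        if (vectors.length != newToOld.length) {
            throw new IllegalArgumentException("sort map size must match vector count");
        }
        float[][] sorted = new float[vectors.length][];
        for (int newOrd = 0; newOrd < newToOld.length; newOrd++) {
            sorted[newOrd] = vectors[newToOld[newOrd]];
        }
        return sorted;
    }

    public static void main(String[] args) {
        float[][] buffered = { { 1f, 0f }, { 0f, 1f }, { 1f, 1f } };
        int[] newToOld = { 2, 0, 1 }; // sorted doc 0 was buffered doc 2, and so on
        System.out.println(Arrays.deepToString(reorder(buffered, newToOld)));
    }
}
```

Building the graph over the reordered array means node ordinals already follow the sorted order, so no remapping is needed at search time.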
Pinging @elastic/es-search-relevance (Team:Search Relevance)
Hi @mayya-sharipova, I've created a changelog YAML for you.
ldematte
left a comment
Change looks good to me; I'm not a Lucene expert so I cannot say if it's the right way to do it so I'll trust you/Chris on this.
You probably want to merge in changes from #138155 and add the test-gpu flag to be sure tests pass/are OK.
ChrisHegarty
left a comment
The changes look good to me. I guess I'm surprised not to see some changes to ES92GpuHnswVectorsFormatTests! Are there no sorting tests there?
These tests use tests inside BaseKnnVectorsFormatTestCase, such as
@ChrisHegarty Do you suggest we add tests to ES92GpuHnswVectorsFormatTests? I can do that.
First, I think that this PR is good to be merged as-is.
Right. It surprises me that sorting was completely unimplemented, and that no scenarios in
Thanks Chris, I will merge this PR and look into adding more sorted index tests to BaseKnnVectorsFormatTestCase.
💚 Backport successful