Add on-disk rescoring to disk BBQ#135778
Conversation
9af249c to
bf48533
Compare
bf48533 to
cf67fc8
Compare
|
Pinging @elastic/es-search-relevance (Team:Search Relevance) |
|
Hi @thecoop, I've created a changelog YAML for you. |
|
The option on serverless should just not do anything. We need to check this is the case already due to |
| @FunctionalInterface | ||
| public interface GetFormatReader { | ||
| FlatVectorsReader getReader(String formatName, boolean useDirectIO) throws IOException; | ||
| } |
There was a problem hiding this comment.
Do we want to make this a more general interface? Maybe put it in its own class outside of IVF reader?
| */ | ||
| public abstract class IVFVectorsReader extends KnnVectorsReader { | ||
|
|
||
| private record FlatVectorsReaderKey(String formatName, boolean useDirectIO) { |
There was a problem hiding this comment.
Maybe put it in its own class? Wouldn't we want this same thing for other formats eventually?
There was a problem hiding this comment.
I'm thinking maybe a shared superclass for the per-field specifications, but I want to do that refactor as part of #135931, after this is merged in
benwtrent
left a comment
There was a problem hiding this comment.
I really love how (relatively) simple this change is now.
I have some concerns on ensuring bwc tests. but overall I like the change.
server/src/main/java/org/elasticsearch/index/codec/vectors/diskbbq/IVFVectorsWriter.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/mapper/vectors/DenseVectorFieldMapper.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/codec/vectors/diskbbq/IVFVectorsWriter.java
Show resolved
Hide resolved
Yup, this is the case. The option on serverless can be set, but it doesn't do anything. |
benwtrent
left a comment
There was a problem hiding this comment.
Minor comments, overall, this is very nice :)
| } | ||
|
|
||
| // for testing | ||
| KnnVectorsWriter version0FieldsWriter(SegmentWriteState state) throws IOException { |
server/src/main/java/org/elasticsearch/index/codec/vectors/diskbbq/IVFVectorsWriter.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/codec/vectors/diskbbq/IVFVectorsWriter.java
Outdated
Show resolved
Hide resolved
| } | ||
| } | ||
|
|
||
| public void testDirectIOBackwardsCompatibleRead() throws IOException { |
There was a problem hiding this comment.
I think this is OK, but I would prefer a new ES920V0DiskBBQVectorsFormatTests.java that runs ALL our test coverage for the byte change. But, eh, this is OK I think.
Adds an
on_disk_rescoreindex option to disk BBQ. This enables on-disk rescoring for the original raw vectors, so random pages are not copied into memory unnecessarily