Enable bfloat16 and on-disk rescoring for dense vectors #138492
thecoop merged 30 commits into elastic:main
Pinging @elastic/es-search-relevance (Team:Search Relevance)
Hi @thecoop, I've created a changelog YAML for you.
Hi @thecoop, I've updated the changelog YAML for you. Note that since this PR is labelled
ee094b9 to bf465d5
shainaraskas left a comment
some suggestions for signaling availability
shainaraskas left a comment
added some comments to the changelog entry while I was answering your question :)
2464888 to f5da14b
jimczi left a comment
Let's split this in two please: one PR to enable bfloat16 and another one for rescoring dense vectors.
...nce/src/test/java/org/elasticsearch/xpack/inference/mapper/SemanticTextFieldMapperTests.java
jimczi left a comment
I understand that we have a single feature flag here so we cannot split. I am approving to unblock, but let's at least have a separate changelog that we can link with the appropriate issue.
This enables the bfloat16 element_type and on-disk rescoring options for dense_vector indexes
New options have been added to the dense_vector field type.

The first is support for storing vectors in bfloat16 format. This is a floating-point format that utilises two bytes per value rather than four, halving the storage space required compared to element_type: float. This can be specified with element_type: bfloat16 when creating the index, for all dense_vector indexing types.

Float values are automatically rounded to 2 bytes when writing to disk, so this format can be used with original source vectors at 2- or 4-byte precision. BFloat16 values are zero-expanded back to 4-byte floats when read into memory. Using bfloat16 will cause a loss of precision compared to the original vector values, as well as a small performance hit due to converting between bfloat16 and float when reading and writing vectors; however, this may be counterbalanced by a corresponding decrease in I/O, depending on your workload.

The second option is to utilise on-disk rescoring. When rescoring vectors during kNN searches, the raw vectors are read into memory. When the vector data is larger than the amount of available RAM, this might cause the OS to evict some in-memory pages that then need to be paged back in immediately afterwards. This can cause a significant slowdown in search speed. With on-disk rescoring, the rescoring step reads the raw vector data directly from disk instead of reading it into memory first. This may significantly increase search performance in such low-memory situations.
This is specified as an index option for HNSW vector index types.
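A minimal sketch of what such a mapping might look like, built as a Python dict that could be sent as the body of a create-index request. The option key on_disk_rescore, the index type int8_hnsw, the field name my_vector and the dimension count are all assumptions made for illustration; check the linked doc and spec changes for the exact names.

```python
import json


def dense_vector_mapping(field, dims, element_type="float",
                         index_type="int8_hnsw", on_disk_rescore=None):
    """Build a create-index body with a single dense_vector field.

    Hypothetical helper: the "on_disk_rescore" key and the default
    index type are assumptions, not confirmed by the text above.
    """
    index_options = {"type": index_type}
    if on_disk_rescore is not None:
        # Assumed option name for on-disk rescoring.
        index_options["on_disk_rescore"] = on_disk_rescore
    return {
        "mappings": {
            "properties": {
                field: {
                    "type": "dense_vector",
                    "dims": dims,
                    "element_type": element_type,
                    "index_options": index_options,
                }
            }
        }
    }


# Example: bfloat16 storage combined with on-disk rescoring.
body = dense_vector_mapping("my_vector", 384,
                            element_type="bfloat16",
                            on_disk_rescore=True)
print(json.dumps(body, indent=2))
```

The two options are independent in this sketch: element_type controls how raw vectors are stored, while the index option controls how they are read back during rescoring.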
bfloat16 support is GA; on-disk rescoring is preview functionality.
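The rounding on write and zero-expansion on read described above can be illustrated with a small Python sketch. It assumes round-to-nearest-even when dropping the low 16 bits of a 4-byte float; the rounding mode Elasticsearch actually uses is not stated here, and NaN handling is omitted. The function names are hypothetical.

```python
import struct


def float_to_bfloat16_bits(value: float) -> int:
    """Round a 4-byte float to its 16-bit bfloat16 bit pattern.

    Assumes round-to-nearest-even. NaN inputs are not handled.
    """
    bits = struct.unpack(">I", struct.pack(">f", value))[0]
    # Add a bias so truncation rounds to nearest, with ties going to
    # the value whose lowest kept bit is 0.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bfloat16_bits_to_float(bits16: int) -> float:
    """Zero-expand a bfloat16 bit pattern back to a 4-byte float."""
    return struct.unpack(">f", struct.pack(">I", (bits16 & 0xFFFF) << 16))[0]


def round_trip(value: float) -> float:
    """Value as it would be seen after storing as bfloat16 and reading back."""
    return bfloat16_bits_to_float(float_to_bfloat16_bits(value))


if __name__ == "__main__":
    for v in (1.0, -2.5, 0.1, 3.14159):
        print(v, "->", round_trip(v))
    dims = 384  # hypothetical vector size
    print("bytes per vector:", dims * 4, "as float,", dims * 2, "as bfloat16")
```

Values with short binary expansions such as 1.0 and -2.5 survive the round trip exactly, while a value like 0.1 comes back slightly changed, which is the precision loss mentioned above.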
Doc changes: elastic/docs-content#3847
Spec changes: elastic/elasticsearch-specification#5728