Merged
30 commits
58a03e5
Update docs on dense_vector for new types
thecoop Nov 7, 2025
5ee9ea1
Remove feature flag
thecoop Nov 24, 2025
248c9f6
Doc updates
thecoop Nov 24, 2025
492b5aa
Update docs/changelog/138492.yaml
thecoop Nov 24, 2025
74987de
Update docs/changelog/138492.yaml
thecoop Nov 24, 2025
a44267b
Docs
thecoop Nov 24, 2025
7270d90
Disallow bfloat16 with semantic_text at the moment
thecoop Nov 24, 2025
a6bffff
Update semantic text tests
thecoop Nov 24, 2025
47d666e
Add an index version
thecoop Nov 25, 2025
387bb35
Merge branch 'main' into enable-generic-vector-formats
thecoop Nov 25, 2025
bf465d5
Remove feature flag refrences
thecoop Nov 25, 2025
056bd28
Another one
thecoop Nov 25, 2025
171d1d3
Update docs/changelog/138492.yaml
thecoop Nov 25, 2025
678463e
Fix changelog
thecoop Nov 25, 2025
bd6c612
Merge branch 'main' into enable-generic-vector-formats
thecoop Nov 26, 2025
ad48d03
Merge branch 'main' into enable-generic-vector-formats
thecoop Nov 26, 2025
3ecf6ac
Merge branch 'main' into enable-generic-vector-formats
thecoop Nov 27, 2025
5dded6d
Merge branch 'main' into enable-generic-vector-formats
thecoop Nov 27, 2025
b8497d1
Update changelog
thecoop Nov 28, 2025
b395f63
Merge branch 'main' into enable-generic-vector-formats
thecoop Nov 28, 2025
73b01ad
Get the right feature flags in
thecoop Nov 28, 2025
6d9455f
Merge branch 'main' into enable-generic-vector-formats
thecoop Dec 1, 2025
40a3403
Doc updates
thecoop Dec 1, 2025
dfe27ef
Don't use bfloat16 here
thecoop Dec 1, 2025
178df59
Merge branch 'main' into enable-generic-vector-formats
thecoop Dec 1, 2025
677898f
Merge branch 'main' into enable-generic-vector-formats
thecoop Dec 1, 2025
e89db84
Rounding, not truncating
thecoop Dec 4, 2025
f5da14b
Merge branch 'main' into enable-generic-vector-formats
thecoop Dec 4, 2025
ac62b97
Merge branch 'main' into enable-generic-vector-formats
thecoop Dec 5, 2025
ccf8193
Merge branch 'main' into enable-generic-vector-formats
thecoop Dec 5, 2025
32 changes: 32 additions & 0 deletions docs/changelog/138492.yaml
@@ -0,0 +1,32 @@
pr: 138492
summary: Enable bfloat16 and on-disk rescoring for dense vectors
area: Vector Search
type: feature
issues: []
highlight:
title: New dense_vector options for storing bfloat16 vectors and utilising on-disk
rescoring
body: |-
New options have been added to the `dense_vector` field type.

The first is support for storing vectors in bfloat16 format.
This is a floating-point format that utilises two bytes per value rather than four, halving the storage space
required compared to `element_type: float`. This can be specified with `element_type: bfloat16`
when creating the index, for all `dense_vector` indexing types.

Float values are automatically rounded to two bytes when writing to disk, so this format can be used
with original source vectors at two- or four-byte precision. BFloat16 values are zero-expanded back to four-byte floats
when read into memory. Using `bfloat16` will cause a loss of precision compared to
the original vector values, as well as a small performance hit due to converting between `bfloat16` and `float`
when reading and writing vectors; however this may be counterbalanced by a corresponding decrease in I/O,
depending on your workload.

The second option is to enable on-disk rescoring. When rescoring vectors during kNN searches, the raw vectors
are read into memory. When the vector data is larger than the amount of available RAM, this might cause the OS
to evict some in-memory pages that then need to be paged back in immediately afterwards. This can cause
a significant slowdown in search speed. Enabling on-disk rescoring causes rescoring to use raw vector data
on-disk during rescoring, and to not read it into memory first. This can significantly increase search performance
in such low-memory situations.

Enable on-disk rescoring using the `on_disk_rescore: true` index option.
notable: true
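The changelog body above says float values are rounded (not truncated) to two bytes on write and zero-expanded back to four bytes on read. A minimal Java sketch of that idea follows. It is not the Elasticsearch implementation: the class and method names are hypothetical, and round-to-nearest with ties to even is an assumed rounding mode that the changelog does not specify.

```java
// Hypothetical illustration only; names and the tie-breaking rule are assumptions.
final class Bfloat16Sketch {

    /** Rounds a 4-byte float to the nearest bfloat16 value and returns its 16 bits. */
    static short toBfloat16Bits(float value) {
        int bits = Float.floatToIntBits(value);
        if (Float.isNaN(value)) {
            // Keep NaN a NaN by forcing a mantissa bit that survives the narrowing.
            return (short) ((bits >>> 16) | 0x0040);
        }
        // Adding 0x7FFF plus the lowest kept bit rounds to nearest, ties to even.
        int roundingBias = 0x7FFF + ((bits >>> 16) & 1);
        return (short) ((bits + roundingBias) >>> 16);
    }

    /** Zero-expands 16 bfloat16 bits back into a 4-byte float. */
    static float toFloat(short bf16Bits) {
        return Float.intBitsToFloat((bf16Bits & 0xFFFF) << 16);
    }

    public static void main(String[] args) {
        float original = 0.1f;
        float restored = toFloat(toBfloat16Bits(original));
        // The restored value generally differs slightly from the original.
        System.out.println(original + " -> " + restored);
    }
}
```

Because bfloat16 keeps the full 8-bit exponent of a float and only shortens the mantissa, the conversion in both directions is a shift on the raw bits, which is why the value range matches that of 4-byte floats while precision drops.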
24 changes: 15 additions & 9 deletions docs/reference/elasticsearch/mapping-reference/dense-vector.md
@@ -156,7 +156,7 @@ This setting is compatible with synthetic `_source`, where the entire `_source`

 ### Rehydration and precision

-When vector values are rehydrated (e.g., for reindex, recovery, or explicit `_source` requests), they are restored from their internal format. Internally, vectors are stored at float precision, so if they were originally indexed as higher-precision types (e.g., `double` or `long`), the rehydrated values will have reduced precision. This lossy representation is intended to save space while preserving search quality.
+When vector values are rehydrated (e.g., for reindex, recovery, or explicit `_source` requests), they are restored from their internal format. By default, vectors are stored at float precision, so if they were originally indexed as higher-precision types (e.g., `double` or `long`), the rehydrated values will have reduced precision. This lossy representation is intended to save space while preserving search quality. Additionally, using an `element_type` of `bfloat16` will cause a further loss in precision in restored vectors.

 ### Storing original vectors in `_source`

@@ -283,12 +283,15 @@ The following mapping parameters are accepted:
 $$$dense-vector-element-type$$$

 `element_type`
-: (Optional, string) The data type used to encode vectors. The supported data types are `float` (default), `byte`, and `bit`.
+: (Optional, string) The data type used to encode vectors.

 ::::{dropdown} Valid values for element_type
 `float`
 : indexes a 4-byte floating-point value per dimension. This is the default value.

+`bfloat16` {applies_to}`stack: ga 9.3`
+: indexes a 2-byte floating-point value per dimension. This uses the bfloat16 encoding, _not_ IEEE-754 float16, to maintain the same value range as 4-byte floats. Using `bfloat16` is likely to cause a loss of precision in the stored values compared to `float`.
+
 `byte`
 : indexes a 1-byte integer value per dimension.

@@ -353,16 +356,16 @@ $$$dense-vector-index-options$$$
 * `int8_hnsw` - The default index type for some float vectors:
 * {applies_to}`stack: ga 9.1` Default for float vectors with less than 384 dimensions.
 * {applies_to}`stack: ga 9.0` Default for float all vectors.
-This utilizes the [HNSW algorithm](https://arxiv.org/abs/1603.09320) in addition to automatically scalar quantization for scalable approximate kNN search with `element_type` of `float`. This can reduce the memory footprint by 4x at the cost of some accuracy. See [Automatically quantize vectors for kNN search](#dense-vector-quantization).
-* `int4_hnsw` - This utilizes the [HNSW algorithm](https://arxiv.org/abs/1603.09320) in addition to automatically scalar quantization for scalable approximate kNN search with `element_type` of `float`. This can reduce the memory footprint by 8x at the cost of some accuracy. See [Automatically quantize vectors for kNN search](#dense-vector-quantization).
-* `bbq_hnsw` - This utilizes the [HNSW algorithm](https://arxiv.org/abs/1603.09320) in addition to automatically binary quantization for scalable approximate kNN search with `element_type` of `float`. This can reduce the memory footprint by 32x at the cost of accuracy. See [Automatically quantize vectors for kNN search](#dense-vector-quantization).
+This utilizes the [HNSW algorithm](https://arxiv.org/abs/1603.09320) in addition to automatically scalar quantization for scalable approximate kNN search with `element_type` of `float` or `bfloat16`. This can reduce the memory footprint by 4x at the cost of some accuracy. See [Automatically quantize vectors for kNN search](#dense-vector-quantization).
+* `int4_hnsw` - This utilizes the [HNSW algorithm](https://arxiv.org/abs/1603.09320) in addition to automatically scalar quantization for scalable approximate kNN search with `element_type` of `float` or `bfloat16`. This can reduce the memory footprint by 8x at the cost of some accuracy. See [Automatically quantize vectors for kNN search](#dense-vector-quantization).
+* `bbq_hnsw` - This utilizes the [HNSW algorithm](https://arxiv.org/abs/1603.09320) in addition to automatically binary quantization for scalable approximate kNN search with `element_type` of `float` or `bfloat16`. This can reduce the memory footprint by 32x at the cost of accuracy. See [Automatically quantize vectors for kNN search](#dense-vector-quantization).

 {applies_to}`stack: ga 9.1` `bbq_hnsw` is the default index type for float vectors with greater than or equal to 384 dimensions.
 * `flat` - This utilizes a brute-force search algorithm for exact kNN search. This supports all `element_type` values.
-* `int8_flat` - This utilizes a brute-force search algorithm in addition to automatic scalar quantization. Only supports `element_type` of `float`.
-* `int4_flat` - This utilizes a brute-force search algorithm in addition to automatic half-byte scalar quantization. Only supports `element_type` of `float`.
-* `bbq_flat` - This utilizes a brute-force search algorithm in addition to automatic binary quantization. Only supports `element_type` of `float`.
-* {applies_to}`stack: ga 9.2` `bbq_disk` - This utilizes a variant of [k-means clustering algorithm](https://en.wikipedia.org/wiki/K-means_clustering) in addition to automatic binary quantization to partition vectors and search subspaces rather than an entire graph structure as in with HNSW. Only supports `element_type` of `float`. This combines the benefits of BBQ quantization with partitioning to further reduces the required memory overhead when compared with HNSW and can effectively be run at the smallest possible RAM and heap sizes when HNSW would otherwise cause swapping and grind to a halt. DiskBBQ largely scales linearly with the total RAM. And search performance is enhanced at scale as a subset of the total vector space is loaded. This requires an [Enterprise subscription](https://www.elastic.co/subscriptions).
+* `int8_flat` - This utilizes a brute-force search algorithm in addition to automatic scalar quantization. Only supports `element_type` of `float` or `bfloat16`.
+* `int4_flat` - This utilizes a brute-force search algorithm in addition to automatic half-byte scalar quantization. Only supports `element_type` of `float` or `bfloat16`.
+* `bbq_flat` - This utilizes a brute-force search algorithm in addition to automatic binary quantization. Only supports `element_type` of `float` or `bfloat16`.
+* {applies_to}`stack: ga 9.2` `bbq_disk` - This utilizes a variant of [k-means clustering algorithm](https://en.wikipedia.org/wiki/K-means_clustering) in addition to automatic binary quantization to partition vectors and search subspaces rather than an entire graph structure as in with HNSW. Only supports `element_type` of `float` or `bfloat16`. This combines the benefits of BBQ quantization with partitioning to further reduces the required memory overhead when compared with HNSW and can effectively be run at the smallest possible RAM and heap sizes when HNSW would otherwise cause swapping and grind to a halt. DiskBBQ largely scales linearly with the total RAM. And search performance is enhanced at scale as a subset of the total vector space is loaded. This requires an [Enterprise subscription](https://www.elastic.co/subscriptions).

 `m`
 : (Optional, integer) The number of neighbors each node will be connected to in the HNSW graph. Defaults to `16`. Only applicable to `hnsw`, `int8_hnsw`, `int4_hnsw` and `bbq_hnsw` index types.
@@ -390,6 +393,9 @@ $$$dense-vector-index-options$$$
 : In case a knn query specifies a `rescore_vector` parameter, the query `rescore_vector` parameter will be used instead.
 : See [oversampling and rescoring quantized vectors](docs-content://solutions/search/vector/knn.md#dense-vector-knn-search-rescoring) for details.
 :::::
+
+`on_disk_rescore` {applies_to}`stack: preview 9.3` {applies_to}`serverless: unavailable`
+: (Optional, boolean) Only applicable to quantized HNSW and `bbq_disk` index types. When `true`, vector rescoring will read the raw vector data directly from disk, and will not copy it in memory. This can improve performance when vector data is larger than the amount of available RAM. This setting only applies to newly-indexed vectors; after changing this setting, the vectors must be reindexed or force-merged to apply the new setting to the whole index. Defaults to `false`.
 ::::
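The per-dimension sizes documented for `element_type` in this file translate directly into raw storage per vector. The sketch below does that arithmetic in Java for the three byte-sized types listed above. The class and method names are hypothetical, and the figures cover only the raw vector values: they ignore quantized copies, HNSW graph data, and any per-file overhead.

```java
// Hypothetical helper for back-of-the-envelope sizing; not part of Elasticsearch.
final class VectorStorageSketch {

    /** Raw bytes needed to store one vector of the given element_type and dimension count. */
    static long rawBytesPerVector(String elementType, int dims) {
        if ("float".equals(elementType)) {
            return 4L * dims;   // 4-byte floating-point value per dimension
        }
        if ("bfloat16".equals(elementType)) {
            return 2L * dims;   // 2-byte floating-point value per dimension
        }
        if ("byte".equals(elementType)) {
            return dims;        // 1-byte integer value per dimension
        }
        throw new IllegalArgumentException("unsupported element_type: " + elementType);
    }

    public static void main(String[] args) {
        int dims = 1024;            // example dimension count (assumption)
        long vectors = 1_000_000L;  // example corpus size (assumption)
        for (String type : new String[] { "float", "bfloat16", "byte" }) {
            long totalBytes = rawBytesPerVector(type, dims) * vectors;
            System.out.println(type + ": " + totalBytes + " bytes of raw vector data");
        }
    }
}
```

For any dimension count, `bfloat16` comes out at exactly half the raw size of `float`, which is the halving described in the changelog entry.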


@@ -97,8 +97,7 @@ public class CcsCommonYamlTestSuiteIT extends ESClientYamlSuiteTestCase {
 .setting("xpack.license.self_generated.type", "trial")
 .feature(FeatureFlag.TIME_SERIES_MODE)
 .feature(FeatureFlag.SYNTHETIC_VECTORS)
-.feature(FeatureFlag.DOC_VALUES_SKIPPER)
-.feature(FeatureFlag.GENERIC_VECTOR_FORMAT);
+.feature(FeatureFlag.DOC_VALUES_SKIPPER);

 private static ElasticsearchCluster remoteCluster = ElasticsearchCluster.local()
 .name(REMOTE_CLUSTER_NAME)
@@ -96,7 +96,6 @@ public class RcsCcsCommonYamlTestSuiteIT extends ESClientYamlSuiteTestCase {
 .setting("xpack.security.remote_cluster_client.ssl.enabled", "false")
 .feature(FeatureFlag.TIME_SERIES_MODE)
 .feature(FeatureFlag.SYNTHETIC_VECTORS)
-.feature(FeatureFlag.GENERIC_VECTOR_FORMAT)
 .feature(FeatureFlag.DOC_VALUES_SKIPPER)
 .user("test_admin", "x-pack-test-password");

@@ -38,7 +38,6 @@ public class SmokeTestMultiNodeClientYamlTestSuiteIT extends ESClientYamlSuiteTe
 .feature(FeatureFlag.DOC_VALUES_SKIPPER)
 .feature(FeatureFlag.SYNTHETIC_VECTORS)
 .feature(FeatureFlag.RANDOM_SAMPLING)
-.feature(FeatureFlag.GENERIC_VECTOR_FORMAT)
 .build();

 public SmokeTestMultiNodeClientYamlTestSuiteIT(@Name("yaml") ClientYamlTestCandidate testCandidate) {
@@ -38,7 +38,6 @@ public class ClientYamlTestSuiteIT extends ESClientYamlSuiteTestCase {
 .feature(FeatureFlag.DOC_VALUES_SKIPPER)
 .feature(FeatureFlag.SYNTHETIC_VECTORS)
 .feature(FeatureFlag.RANDOM_SAMPLING)
-.feature(FeatureFlag.GENERIC_VECTOR_FORMAT)
 .build();

 public ClientYamlTestSuiteIT(@Name("yaml") ClientYamlTestCandidate testCandidate) {
@@ -20,7 +20,6 @@
 import org.apache.lucene.tests.util.LuceneTestCase;
 import org.elasticsearch.common.settings.Settings;
 import org.elasticsearch.core.Strings;
-import org.elasticsearch.index.codec.vectors.es93.ES93GenericFlatVectorsFormat;
 import org.elasticsearch.plugins.Plugin;
 import org.elasticsearch.search.vectors.KnnSearchBuilder;
 import org.elasticsearch.search.vectors.VectorData;
@@ -60,8 +59,6 @@ public static void checkSupported() {
 } catch (IOException e) {
 SUPPORTED = false;
 }
-
-assumeTrue("Generic format supporting direct IO not enabled", ES93GenericFlatVectorsFormat.GENERIC_VECTOR_FORMAT.isEnabled());
 }

 static DirectIODirectory open(Path path) throws IOException {
@@ -199,6 +199,7 @@ private static Version parseUnchecked(String version) {
 public static final IndexVersion SECURITY_MIGRATIONS_METADATA_FLATTENED_UPDATE = def(9_048_0_00, Version.LUCENE_10_3_2);
 public static final IndexVersion STANDARD_INDEXES_USE_SKIPPERS = def(9_049_0_00, Version.LUCENE_10_3_2);
 public static final IndexVersion NESTED_PATH_LIMIT = def(9_050_0_00, Version.LUCENE_10_3_2);
+public static final IndexVersion GENERIC_DENSE_VECTOR_FORMAT = def(9_051_0_00, Version.LUCENE_10_3_2);

 /*
 * STOP! READ THIS FIRST! No, really,
@@ -15,7 +15,6 @@
 import org.apache.lucene.codecs.hnsw.FlatVectorsWriter;
 import org.apache.lucene.index.SegmentReadState;
 import org.apache.lucene.index.SegmentWriteState;
-import org.elasticsearch.common.util.FeatureFlag;
 import org.elasticsearch.index.codec.vectors.AbstractFlatVectorsFormat;
 import org.elasticsearch.index.codec.vectors.DirectIOCapableFlatVectorsFormat;
 import org.elasticsearch.index.mapper.vectors.DenseVectorFieldMapper;
@@ -25,8 +24,6 @@

 public class ES93GenericFlatVectorsFormat extends AbstractFlatVectorsFormat {

-public static final FeatureFlag GENERIC_VECTOR_FORMAT = new FeatureFlag("generic_vector_format");
-
 static final String NAME = "ES93GenericFlatVectorsFormat";
 static final String VECTOR_FORMAT_INFO_EXTENSION = "vfi";
 static final String META_CODEC_NAME = "ES93GenericFlatVectorsFormatMeta";
@@ -11,9 +11,7 @@

 import org.elasticsearch.features.FeatureSpecification;
 import org.elasticsearch.features.NodeFeature;
-import org.elasticsearch.index.codec.vectors.es93.ES93GenericFlatVectorsFormat;

-import java.util.HashSet;
 import java.util.Set;

 import static org.elasticsearch.index.mapper.vectors.DenseVectorFieldMapper.RESCORE_VECTOR_QUANTIZED_VECTOR_MAPPING;
@@ -26,11 +24,11 @@
 */
 public class MapperFeatures implements FeatureSpecification {

-public static final NodeFeature CONSTANT_KEYWORD_SYNTHETIC_SOURCE_WRITE_FIX = new NodeFeature(
+static final NodeFeature CONSTANT_KEYWORD_SYNTHETIC_SOURCE_WRITE_FIX = new NodeFeature(
 "mapper.constant_keyword.synthetic_source_write_fix"
 );

-public static final NodeFeature COUNTED_KEYWORD_SYNTHETIC_SOURCE_NATIVE_SUPPORT = new NodeFeature(
+static final NodeFeature COUNTED_KEYWORD_SYNTHETIC_SOURCE_NATIVE_SUPPORT = new NodeFeature(
 "mapper.counted_keyword.synthetic_source_native_support"
 );

@@ -61,15 +59,15 @@ public class MapperFeatures implements FeatureSpecification {
 );
 static final NodeFeature EXCLUDE_VECTORS_DOCVALUE_BUGFIX = new NodeFeature("mapper.exclude_vectors_docvalue_bugfix");
 static final NodeFeature BASE64_DENSE_VECTORS = new NodeFeature("mapper.base64_dense_vectors");
-public static final NodeFeature GENERIC_VECTOR_FORMAT = new NodeFeature("mapper.vectors.generic_vector_format");
-public static final NodeFeature FIX_DENSE_VECTOR_WRONG_FIELDS = new NodeFeature("mapper.fix_dense_vector_wrong_fields");
+static final NodeFeature GENERIC_VECTOR_FORMAT = new NodeFeature("mapper.vectors.generic_vector_format");
+static final NodeFeature FIX_DENSE_VECTOR_WRONG_FIELDS = new NodeFeature("mapper.fix_dense_vector_wrong_fields");
 static final NodeFeature BBQ_DISK_STATS_SUPPORT = new NodeFeature("mapper.bbq_disk_stats_support");
 static final NodeFeature SKIPPERS_ON_UNINDEXED_FIELDS = new NodeFeature("mapper.skippers_on_unindexed_fields");
 static final NodeFeature STORED_FIELDS_SPEC_MERGE_BUG = new NodeFeature("mapper.stored_fields_spec_merge_bug");

 @Override
 public Set<NodeFeature> getTestFeatures() {
-var features = Set.of(
+return Set.of(
 RangeFieldMapper.DATE_RANGE_INDEXING_FIX,
 IgnoredSourceFieldMapper.DONT_EXPAND_DOTS_IN_IGNORED_SOURCE,
 SourceFieldMapper.REMOVE_SYNTHETIC_SOURCE_ONLY_VALIDATION,
@@ -112,12 +110,8 @@ public Set<NodeFeature> getTestFeatures() {
 FIX_DENSE_VECTOR_WRONG_FIELDS,
 BBQ_DISK_STATS_SUPPORT,
 SKIPPERS_ON_UNINDEXED_FIELDS,
-STORED_FIELDS_SPEC_MERGE_BUG
+STORED_FIELDS_SPEC_MERGE_BUG,
+GENERIC_VECTOR_FORMAT
 );
-if (ES93GenericFlatVectorsFormat.GENERIC_VECTOR_FORMAT.isEnabled()) {
-features = new HashSet<>(features);
-features.add(GENERIC_VECTOR_FORMAT);
-}
-return features;
 }
 }
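The `getTestFeatures()` change above replaces a conditional copy into a `HashSet` with a single `Set.of(...)` call. The old shape was needed because sets returned by `Set.of` are unmodifiable, so adding the flag-gated feature required a mutable copy first. A self-contained sketch of both shapes, using plain strings in place of `NodeFeature` (the class, method, and string names here are hypothetical):

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the before/after shape of getTestFeatures().
final class FeatureSetSketch {

    /** Old shape: the gated feature is added only when the flag is enabled. */
    static Set<String> withFeatureFlag(boolean flagEnabled) {
        Set<String> features = Set.of("feature_a", "feature_b");
        if (flagEnabled) {
            // Set.of(...) is unmodifiable, so copy before adding.
            features = new HashSet<>(features);
            features.add("generic_vector_format");
        }
        return features;
    }

    /** New shape: the feature is always present, so one immutable set suffices. */
    static Set<String> withoutFeatureFlag() {
        return Set.of("feature_a", "feature_b", "generic_vector_format");
    }

    public static void main(String[] args) {
        // Both shapes yield equal sets once the flag is on.
        System.out.println(withFeatureFlag(true).equals(withoutFeatureFlag()));
    }
}
```

With the feature flag removed, the branch and the `HashSet` import both become unnecessary, which matches the import removal earlier in the same file.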