Introduce an adaptive HNSW Patience collector #138685
Conversation
Buildkite benchmark this with so-vector please
Benchmark configurations (result tables attached as images):

- baseline (early_termination=false)
- baseline (early_termination=true with Lucene's defaults)
- baseline (early_termination=true with p=max(7,k*0.1), s=0.995, see #130564 (comment))
- candidate

the candidate is much faster and more lightweight (much less
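The static configuration referenced above (p=max(7,k*0.1), s=0.995) can be sketched as follows. The class and method names are illustrative, not Lucene's or Elasticsearch's actual code; only the two constants come from the benchmark description:

```java
// Hypothetical sketch of the static patience settings benchmarked above.
// Names are illustrative; the constants are from the PR discussion.
public class StaticPatienceSettings {
    // "s" in the runs above: stop once the candidate queue stops improving
    // for `patience` consecutive steps at >= this saturation level
    static final double SATURATION_THRESHOLD = 0.995;

    // p = max(7, k * 0.1): patience grows with k, so larger candidate queues
    // get proportionally more non-improving hops before terminating early
    static int patience(int k) {
        return Math.max(7, (int) (k * 0.1));
    }

    public static void main(String[] args) {
        System.out.println(patience(50));  // prints 7 (floor applies)
        System.out.println(patience(100)); // prints 10
    }
}
```

The point the thread makes about this style of setting: because `k` is itself manipulated by the query API (via `num_candidates`), any fixed formula over `k` is only indirectly controllable by the user.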
@tteofili I wonder if the adaptive collection is impacted by quantization loss. Consider: as we lose information, the distances might become harder to distinguish. An adaptive collector makes much more sense to me than a static value, and the performance impact here (with just a 1% recall change) is very interesting indeed! Looks like a worthwhile investigation.
adding some more experiments for less-quantized / unquantized HNSW (result tables attached as images):

hnsw:
- baseline
- baseline (early_termination=true with Lucene's defaults)
- baseline (early_termination=true with p=max(7,k*0.1), s=0.995)
- candidate

int8_hnsw: same four configurations

int4_hnsw: same four configurations
It's frankly pretty amazing how recall stays almost exactly the same with more than 2x fewer vectors visited.
I'm also running some experiments to see how this behaves across filtering selectivity.
The filter results are in line with the non-filtered experiments above (at most a 2% recall drop with ~2x fewer vectors visited).
Hi @tteofili, I've created a changelog YAML for you.
Pinging @elastic/es-search-relevance (Team:Search Relevance)
I want to run a benchmark, and once I am done with that, I will report back. The numbers are very encouraging. It seems like a no-brainer to me :)
For larger-scoring vectors (e.g. max inner product), I am seeing a recall drop of 5% fairly consistently across the board; the story is the same at various quantization levels (comparison results attached as images).
Here are all the runs in a row: 4M vectors over 16 segments. I am not sure about the graph density or anything; let me know if I can provide additional info.

EDIT: I also ran this same dataset with a force-merge to test an extreme case. Good news is that recall in multi-segment is still higher than single-segment, even with the more aggressive early-termination checks. However, it does seem to indicate to me that maybe we shouldn't do early termination when there are very few segments, or a single segment.

Single-segment recall (obviously, ignore qps and latency; focus on visited vs. recall):
One other thing I am SLIGHTLY concerned about is making sure our "visited/recall" curve actually improves with this change. My concern is that we are just visiting less, and in visiting less, we get the same relative recall drop. Ultimately, that wouldn't help things any more than what we are doing now.
server/src/main/java/org/elasticsearch/search/vectors/AdaptiveHnswQueueSaturationCollector.java
I reran with the current adaptive approach utilizing |
benwtrent left a comment:
My tests show that recall change is 0-1% over all quantization levels.
Additionally, for certain data distributions, the num visited improves significantly. I am all for this.
This introduces an extension of Lucene's `HnswQueueSaturationCollector` that avoids any static parameters for patience and saturation threshold. `HnswQueueSaturationCollector`'s patience parameter depends on the `k` param, which is also manipulated by our query API (because of `num_candidates`), making such a static param less controllable. Instead of a static queue saturation and patience setting, this collector accumulates a smoothed discovery rate and an adaptive saturation threshold based on the discovery rate's mean and stdDev.
This is likely to work better with different doc to doc and query to vector distributions.
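As a rough illustration of that idea — a sketch under my own assumptions about the smoothing and the termination rule, not the PR's actual `AdaptiveHnswQueueSaturationCollector` — one can keep an exponentially smoothed discovery rate alongside a Welford-style running mean and stdDev, and report saturation once the smoothed rate falls notably below the historical mean:

```java
// Illustrative sketch of an adaptive saturation check. The smoothing factor,
// warm-up length, and "mean - stdDev" rule are assumptions for illustration,
// not Elasticsearch's implementation.
public class AdaptiveSaturationSketch {
    private static final double ALPHA = 0.2; // smoothing factor for the rate
    private static final int WARM_UP = 10;   // steps before trusting the stats

    private double smoothedRate = Double.NaN;
    private long count = 0;
    private double mean = 0.0;
    private double m2 = 0.0; // Welford accumulator for variance

    /** Feed the per-step discovery rate (fraction of visits that improved the queue). */
    void onStep(double discoveryRate) {
        smoothedRate = Double.isNaN(smoothedRate)
                ? discoveryRate
                : ALPHA * discoveryRate + (1 - ALPHA) * smoothedRate;
        count++;
        double delta = discoveryRate - mean;
        mean += delta / count;
        m2 += delta * (discoveryRate - mean); // running sum of squared deviations
    }

    /** Saturated once the smoothed rate drops below mean - stdDev: an adaptive threshold. */
    boolean saturated() {
        if (count < WARM_UP) return false;
        double stdDev = Math.sqrt(m2 / count);
        return smoothedRate < mean - stdDev;
    }

    public static void main(String[] args) {
        AdaptiveSaturationSketch c = new AdaptiveSaturationSketch();
        for (int i = 0; i < 20; i++) c.onStep(1.0); // steady discovery: no exit
        c.onStep(0.5); c.onStep(0.5); c.onStep(0.5); // discovery slows down
        System.out.println(c.saturated()); // prints true once rate falls below mean - stdDev
    }
}
```

Because both the mean and the deviation are learned from the query's own traversal, the exit point adapts to the data and query distribution rather than to a fixed `s`/`p` pair, which matches the claim that this should work better across different doc-to-doc and query-to-vector distributions.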