Improved CAGRA build parameter heuristics by achirkin · Pull Request #1448 · rapidsai/cuvs

achirkin · 2025-10-22T14:01:35Z

Changes to the build parameter heuristics:

- Move the code from the HNSW namespace to the CAGRA namespace to avoid depending on the HNSW target.
- Add one more variant of the heuristics: allow generating a smaller graph to better match the performance of the HNSW-generated graph.
- Implement an automatic switch between NN-Descent and IVF-PQ as the graph-build algorithm depending on the dataset size: NN-Descent tends to perform better on smaller-scale datasets.

The PR also includes C and Java bindings.
Resolves #1265
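The size-based switch described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `build_algo`, `choose_build_algo`, and `kAssumedRowThreshold` are hypothetical names, and the cutoff value is an assumption, since the description does not state the dataset size at which the switch happens.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the graph-build choice; not the cuVS type.
enum class build_algo { nn_descent, ivf_pq };

// Assumed cutoff, for illustration only: the PR description does not state
// the dataset size at which the switch happens.
constexpr std::size_t kAssumedRowThreshold = 1000000;

// Pick NN-Descent for smaller datasets and IVF-PQ for larger ones.
inline build_algo choose_build_algo(std::size_t n_rows,
                                    std::size_t threshold = kAssumedRowThreshold)
{
  assert(threshold > 0);
  return n_rows < threshold ? build_algo::nn_descent : build_algo::ivf_pq;
}
```

Keeping the threshold as a parameter makes the sketch easy to re-tune; the actual heuristic may well depend on more than the row count.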


c/include/cuvs/neighbors/cagra.h

KyleFromNVIDIA

Approved trivial CMake changes

tfeher · 2025-10-28T09:14:52Z

cpp/src/neighbors/cagra.cpp

+                                                   cuvs::distance::DistanceType metric)
+{
+  cagra::index_params params;
+  params.graph_degree              = 2 + M * 2 / 3;


The hard-M variant sets graph_degree = 2 * M, so it is surprising to see that the soft variant can lead to similar search performance with graph_degree < M. The benchmarks for 768 and 1536 dimensions looked good. Was it also tested on lower-dimensional datasets?

achirkin · 2025-10-28

Yes, I've tested it on the DEEP-100M and GloVe datasets. The hard-M variant actually shows much higher recall and lower throughput for the same search 'ef' parameter (the QPS-recall curve is close to HNSW, but all points on it are 'shifted' towards higher recall and lower throughput).
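To make the difference between the two variants concrete, here is a small sketch of the two graph_degree formulas mentioned in this thread. The function names are hypothetical, and integer arithmetic is an assumption, because the diff excerpt does not show the type of M. Only graph_degree is covered; the other parameters set by the heuristics are not visible in the excerpt.

```cpp
#include <cassert>

// Soft variant, as in the diff line above: graph_degree = 2 + M * 2 / 3.
// Integer arithmetic is an assumption; the snippet does not show M's type.
inline int soft_graph_degree(int M)
{
  assert(M > 0);
  return 2 + M * 2 / 3;
}

// Hard variant, as described in the review comment: graph_degree = 2 * M.
inline int hard_graph_degree(int M)
{
  assert(M > 0);
  return 2 * M;
}
```

Under these assumptions, M = 16 gives a graph degree of 12 for the soft variant and 32 for the hard one, and M = 32 gives 23 and 64. The soft value falls below M for any M greater than 6, which is the point the reviewer raises.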

tfeher

Thanks Artem for the PR, it looks good to me!


c/src/neighbors/cagra.cpp

mythrocks

Looks good to my eyes. Minor nit regarding javadoc for the new function.

java/cuvs-java/src/main/java/com/nvidia/cuvs/spi/CuVSProvider.java

java/cuvs-java/src/main/java/com/nvidia/cuvs/CagraIndexParams.java

java/cuvs-java/src/main/java22/com/nvidia/cuvs/spi/JDKProvider.java

c/include/cuvs/neighbors/cagra.h

achirkin · 2025-11-03T14:45:06Z

/merge

Add the new heuristics as static factory functions of cagra::index_pa…

d93331c

…rams

achirkin self-assigned this Oct 22, 2025

achirkin requested a review from a team as a code owner October 22, 2025 14:01

achirkin added this to Vector Search, ML, & Data Mining Release Board Oct 22, 2025

achirkin added the improvement (Improves an existing functionality) and non-breaking (Introduces a non-breaking change) labels Oct 22, 2025

github-project-automation bot moved this to Todo in Vector Search, ML, & Data Mining Release Board Oct 22, 2025

achirkin added 2 commits October 22, 2025 18:24

Mark new implementations inline to avoid duplicate symbols

e758a0c

Add the new functions to the C API

e873d3a

achirkin moved this from Todo to In Progress in Vector Search, ML, & Data Mining Release Board Oct 22, 2025

cjnolet reviewed Oct 22, 2025

c/include/cuvs/neighbors/cagra.h (outdated, resolved)

achirkin and others added 3 commits October 23, 2025 11:44

Merge branch 'main' into fea-cagra-hnsw-heuristics

b8f9781

Merge branch 'main' into fea-cagra-hnsw-heuristics

f15c01a

Fix types in C wrapper code

cb06d5a

achirkin requested a review from a team as a code owner October 23, 2025 14:20

CagraIndexParams from C/Cpp heuristics

582db6f

achirkin force-pushed the fea-cagra-hnsw-heuristics branch from ef02185 to 582db6f Compare October 23, 2025 14:24

rapidsai deleted a comment from copy-pr-bot bot Oct 23, 2025

Put the new non-template functions in a separate compilation unit

d0ddb1e

achirkin requested a review from a team as a code owner October 23, 2025 15:19

KyleFromNVIDIA approved these changes Oct 23, 2025


mayya-sharipova mentioned this pull request Oct 23, 2025

Adjust GPU graph building params elastic/elasticsearch#137074

Merged

tfeher reviewed Oct 28, 2025


tfeher approved these changes Oct 28, 2025


achirkin and others added 5 commits October 29, 2025 15:10

Merge branch 'main' into fea-cagra-hnsw-heuristics

8f0605b

Fix headers

be3f901

Merge branch 'main' into fea-cagra-hnsw-heuristics

5fa89d5

Merge branch 'main' into fea-cagra-hnsw-heuristics

ed73a65

Merge remote-tracking branch 'rapidsai/main' into fea-cagra-hnsw-heur…

927d206

…istics

achirkin mentioned this pull request Oct 31, 2025

Add Augmented Core Extraction Algorithm #1404

Merged

achirkin added 2 commits October 31, 2025 09:17

Fix bad code merge

15786b7

Make one function out of two

da8aa38

robertmaynard requested changes Oct 31, 2025

c/src/neighbors/cagra.cpp (outdated, resolved)

Put C++ helpers in C API implementation in the unnamed namespace

3671205

achirkin requested a review from robertmaynard October 31, 2025 13:32

mythrocks approved these changes Oct 31, 2025

java/cuvs-java/src/main/java/com/nvidia/cuvs/spi/CuVSProvider.java (resolved)

java/cuvs-java/src/main/java/com/nvidia/cuvs/CagraIndexParams.java (resolved)

java/cuvs-java/src/main/java22/com/nvidia/cuvs/spi/JDKProvider.java (outdated, resolved)

robertmaynard approved these changes Oct 31, 2025


benfred reviewed Oct 31, 2025

c/include/cuvs/neighbors/cagra.h (outdated, resolved)

achirkin added 2 commits November 3, 2025 11:45

Undo accidental comment reflow

8b4299e

Add documentation to the java wrapper

9b33fb7

achirkin force-pushed the fea-cagra-hnsw-heuristics branch from e601477 to eeb41ac Compare November 3, 2025 10:56

Prefix the enum value in the C api in lieu of C++ namespaces

6904106

achirkin force-pushed the fea-cagra-hnsw-heuristics branch from eeb41ac to 6904106 Compare November 3, 2025 10:56

Merge branch 'main' into fea-cagra-hnsw-heuristics

21dcc9f

rapids-bot bot merged commit d8fdd7d into rapidsai:main Nov 3, 2025
162 of 164 checks passed

github-project-automation bot moved this from In Progress to Done in Vector Search, ML, & Data Mining Release Board Nov 3, 2025

rapids-bot[bot] merged 20 commits into rapidsai:main from achirkin:fea-cagra-hnsw-heuristics

8 participants