Improved CAGRA build parameter heuristics #1448
rapids-bot[bot] merged 20 commits into rapidsai:main
KyleFromNVIDIA
left a comment
Approved trivial CMake changes
cpp/src/neighbors/cagra.cpp
    cuvs::distance::DistanceType metric)
    {
      cagra::index_params params;
      params.graph_degree = 2 + M * 2 / 3;
The hard-M variant sets graph_degree = 2 * M, so it is surprising that the soft variant can reach similar search performance with graph_degree < M. The benchmarks for 768 and 1536 dimensions looked good. Was it also tested on lower-dimensional datasets?
Yes, I've tested it on DEEP-100M and glove datasets. The hard-M variant actually shows much higher recall and lower throughput for the same search 'ef' parameter (the QPS-recall curve is close to HNSW, but all points on it are 'shifted' towards higher recall and lower throughput).
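To make the two formulas in this thread concrete, here is a minimal standalone sketch of the arithmetic. The function names `soft_graph_degree` and `hard_graph_degree` are hypothetical and are not part of the cuVS API; treating `M` as a positive `int` is also an assumption. Only the two expressions come from the diff and the review comment.

```cpp
#include <cassert>

// Hypothetical helpers for illustration only; not cuVS functions.
// M is assumed to be a positive int (the HNSW-style neighbor count).

// Soft variant, as written in the diff: 2 + M * 2 / 3.
// With int operands, M * 2 / 3 uses integer division, so the result is truncated.
constexpr int soft_graph_degree(int M) { return 2 + M * 2 / 3; }

// Hard-M variant, as described in the review comment: 2 * M.
constexpr int hard_graph_degree(int M) { return 2 * M; }

// Spot checks for a few common M values.
inline void check_examples()
{
  assert(soft_graph_degree(16) == 12);   // 2 + 32 / 3  = 2 + 10
  assert(soft_graph_degree(32) == 23);   // 2 + 64 / 3  = 2 + 21
  assert(soft_graph_degree(64) == 44);   // 2 + 128 / 3 = 2 + 42
  assert(hard_graph_degree(32) == 64);
}
```

Under these assumptions the soft formula yields a degree below M whenever M > 6, which matches the reviewer's observation that graph_degree < M for typical settings.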
tfeher
left a comment
Thanks Artem for the PR, it looks good to me!
mythrocks
left a comment
Looks good to my eyes. Minor nit regarding javadoc for the new function.
java/cuvs-java/src/main/java22/com/nvidia/cuvs/spi/JDKProvider.java
/merge
Changes to the build parameter heuristics:
The PR also includes C and Java bindings.
Resolves #1265