Description
According to the ChromaDB Cookbook documentation on batching, the underlying version of SQLite enforces a limit on the maximum number of documents that can be inserted at once. Since this limit varies between installations, recent versions of the chromadb Python package expose it on the client as chroma_client.get_max_batch_size().
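For reference, a minimal sketch of how the limit can be read from a client (the in-memory client here is just for illustration; a PersistentClient or HttpClient exposes the same method):

```python
import chromadb

# In-memory client used only for illustration.
client = chromadb.Client()

# Maximum number of records a single insert may contain. The value depends
# on the underlying SQLite build, so it differs between machines.
max_batch = client.get_max_batch_size()
print(f"max batch size reported by chromadb: {max_batch}")
```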
The code for ChromaVectorStore does not take this dynamic limit into account and instead hardcodes a value of 41665 for the chunking logic:
Line 97 in 4a4bbae:

```python
MAX_CHUNK_SIZE = 41665  # One less than the max chunk size for ChromaDB
```
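One possible direction would be to derive the chunk size from the client at runtime and keep the constant only as a fallback. The sketch below is illustrative, not the actual ChromaVectorStore code; the helper and parameter names are assumptions:

```python
from typing import Any, Generator, List

FALLBACK_MAX_CHUNK_SIZE = 41665  # used only if the client cannot report its limit


def _iter_batches(items: List[Any], batch_size: int) -> Generator[List[Any], None, None]:
    """Yield successive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start : start + batch_size]


def add_in_batches(chroma_client, collection, ids, embeddings, metadatas, documents):
    # Prefer the limit reported by the running chromadb instance over a
    # hardcoded constant, falling back if the method is unavailable.
    try:
        max_batch = chroma_client.get_max_batch_size()
    except AttributeError:
        max_batch = FALLBACK_MAX_CHUNK_SIZE

    for batch_ids, batch_emb, batch_meta, batch_docs in zip(
        _iter_batches(ids, max_batch),
        _iter_batches(embeddings, max_batch),
        _iter_batches(metadatas, max_batch),
        _iter_batches(documents, max_batch),
    ):
        collection.add(
            ids=batch_ids,
            embeddings=batch_emb,
            metadatas=batch_meta,
            documents=batch_docs,
        )
```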
This led to the following exception on my machine while trying to insert many documents into the ChromaVectorStore:
chromadb.errors.InternalError: ValueError: Batch size of 6401 is greater than max batch size of 5461
To verify, I checked the output of chroma_client.get_max_batch_size() on my machine, and it came back as 5461.
Installed versions for future reference:
- sqlite --version :: 3.50.4 2025-07-30 19:33:53 4d8adfb30e03f9cf27f800a2c1ba3c48fb4ca1b08b0f5ed59a4d5ecbf45ealt1 (64-bit)
- python --version :: Python 3.13.7
- chromadb python package version :: 1.2.0