
ChromaDB MAX_CHUNK_SIZE does not take into account platform-dependent limits #20142

@lennyerik

Description

According to the ChromaDB Cookbook documentation on batching, the underlying version of SQLite enforces a limit on the maximum number of documents that can be inserted in a single batch. Since this limit is platform-dependent, it is exposed on the client as chroma_client.get_max_batch_size() in newer versions of the chromadb Python package.
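A minimal sketch of querying that limit (the EphemeralClient here is just for illustration; any chromadb client exposes it the same way):

```python
import chromadb

client = chromadb.EphemeralClient()
# Platform-dependent limit derived from the bundled SQLite build.
print(client.get_max_batch_size())  # 5461 on my machine
```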

The code for ChromaVectorStore does not take this dynamic limit into account; instead, it hardcodes a chunk size of 41665 in its chunking logic.
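One possible fix would be to clamp the hardcoded constant to whatever the connected client reports. A sketch only (the helper names below are illustrative, not the actual ChromaVectorStore internals):

```python
# Sketch: clamp the static chunk size to the client's dynamic limit.
MAX_CHUNK_SIZE = 41665  # current hardcoded value

def effective_chunk_size(chroma_client) -> int:
    try:
        return min(MAX_CHUNK_SIZE, chroma_client.get_max_batch_size())
    except AttributeError:
        # Older chromadb versions without get_max_batch_size()
        return MAX_CHUNK_SIZE

def chunked(items, size):
    for i in range(0, len(items), size):
        yield items[i : i + size]
```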

The hardcoded value led to the following exception on my machine while trying to insert a large number of documents into the ChromaVectorStore:

chromadb.errors.InternalError: ValueError: Batch size of 6401 is greater than max batch size of 5461

To verify, I checked chromadb_client.get_max_batch_size() locally, and it returned 5461.
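For completeness, a minimal repro sketch (node contents and embedding dimensions are arbitrary; it assumes the llama-index Chroma integration is installed):

```python
import chromadb
from llama_index.core.schema import TextNode
from llama_index.vector_stores.chroma import ChromaVectorStore

client = chromadb.EphemeralClient()
collection = client.get_or_create_collection("repro")
vector_store = ChromaVectorStore(chroma_collection=collection)

# 6401 nodes exceeds the 5461 limit reported by get_max_batch_size() here,
# but stays below the hardcoded 41665, so everything is sent as one batch.
nodes = [TextNode(text=f"doc {i}", embedding=[0.1, 0.2, 0.3]) for i in range(6401)]
vector_store.add(nodes)  # raises chromadb.errors.InternalError on my machine
```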

Installed versions for future reference:

sqlite --version :: 3.50.4 2025-07-30 19:33:53 4d8adfb30e03f9cf27f800a2c1ba3c48fb4ca1b08b0f5ed59a4d5ecbf45ealt1 (64-bit)
python --version :: Python 3.13.7
chromadb python package version :: 1.2.0
