Conversation
sairin1202
commented
Jan 9, 2026
- change default embed size from 25 to 1
- openai api has the max embed size 25
Pull request overview
This pull request changes the default `embed_batch_size` parameter from 25 to 1 across the codebase. The PR description motivates this with the claim that "openai api has the max embed size 25"; however, this appears to be incorrect based on OpenAI's actual API limits.
Key changes:
- Reduced the default `embed_batch_size` from 25 to 1 in the `OpenAISDKClient` constructor
- Updated the corresponding default value in the `LLMConfig` settings class
Reviewed changes
Copilot reviewed 2 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/memu/llm/openai_sdk.py | Changed default `embed_batch_size` parameter from 25 to 1 in `OpenAISDKClient.__init__` |
| src/memu/app/settings.py | Changed default `embed_batch_size` field from 25 to 1 in the `LLMConfig` class |
`src/memu/llm/openai_sdk.py`:

```diff
     chat_model: str,
     embed_model: str,
-    embed_batch_size: int = 25,
+    embed_batch_size: int = 1,
```
The change to reduce the default embed_batch_size from 25 to 1 appears problematic. According to the OpenAI API documentation, the embeddings endpoint supports batching up to 2048 inputs per request. The PR description states "openai api has the max embed size 25" which appears to be incorrect. Reducing the batch size to 1 would significantly increase the number of API calls required for embedding multiple texts, leading to increased latency and potentially higher costs. Unless there's a specific issue with batching that needs to be addressed, consider keeping the default at 25 or even increasing it for better performance.
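To make the cost concrete: the number of embedding requests scales inversely with the batch size. A minimal sketch (the `batched` helper below is hypothetical, not part of this codebase) showing how many API calls each candidate default would produce for 100 input texts:

```python
def batched(texts: list[str], batch_size: int) -> list[list[str]]:
    """Split texts into chunks of at most batch_size items.

    Each chunk would correspond to one embeddings API request.
    """
    return [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]

texts = [f"doc {i}" for i in range(100)]

print(len(batched(texts, 1)))     # 100 requests with the new default
print(len(batched(texts, 25)))    # 4 requests with the old default
print(len(batched(texts, 2048)))  # 1 request at the documented endpoint limit
```

Under these assumptions, dropping the default from 25 to 1 multiplies the request count by 25x for the same workload, which is the latency and rate-limit concern raised above.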
Suggested change:

```diff
-    embed_batch_size: int = 1,
+    embed_batch_size: int = 25,
```
`src/memu/app/settings.py`:

```diff
     )
     embed_batch_size: int = Field(
-        default=25,
+        default=1,
```
(Same review comment as above, applied here to the `LLMConfig` default in `src/memu/app/settings.py`.)
Suggested change:

```diff
-        default=1,
+        default=25,
```