feat: add workflow implementation and postgres store (#122)
Conversation
Pull request overview
This PR introduces a workflow-based architecture and PostgreSQL storage support for the memory service. It restructures the codebase to use a pipeline system for orchestrating memorize and retrieve operations, consolidates embedding functionality into the LLM client, and adds PostgreSQL/pgvector storage backends with Alembic migrations.
Key changes:
- Implements workflow/pipeline abstraction for memorize and retrieve operations
- Consolidates LLM and embedding clients (removes separate embedding config)
- Adds PostgreSQL storage with pgvector support and Alembic migrations
- Introduces scope-based memory isolation and multi-profile LLM support
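The memorize/retrieve orchestration described above can be sketched roughly as follows. Note that `Step`, `Pipeline`, and the lambda step bodies are illustrative stand-ins, not the PR's actual `src/memu/workflow` API:

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Step:
    """A named unit of work that reads and extends a shared context dict."""
    name: str
    fn: Callable[[dict], dict]

    def run(self, ctx: dict) -> dict:
        ctx.update(self.fn(ctx))
        return ctx

@dataclass
class Pipeline:
    """Runs steps in order, threading one context dict through all of them."""
    steps: list[Step] = field(default_factory=list)

    def run(self, ctx: dict) -> dict:
        for step in self.steps:
            ctx = step.run(ctx)
        return ctx

# e.g. a retrieve pipeline: embed the query, then search for matching items
pipeline = Pipeline([
    Step("embed", lambda ctx: {"vector": [0.1, 0.2]}),
    Step("search", lambda ctx: {"hits": ["item-1", "item-2"]}),
])
result = pipeline.run({"query": "favorite color"})
```

A runner backend (local or otherwise) would wrap `Pipeline.run`, which keeps step logic independent of where the pipeline executes.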
Reviewed changes
Copilot reviewed 27 out of 32 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| src/memu/workflow/step.py | Defines workflow step abstraction and execution logic |
| src/memu/workflow/runner.py | Implements workflow runner interface and local execution backend |
| src/memu/workflow/pipeline.py | Provides pipeline management with versioning and validation |
| src/memu/workflow/__init__.py | Exports workflow public API |
| src/memu/models.py | Adds dynamic model building utilities for user/memory model composition |
| src/memu/llm/openai_sdk.py | Adds embed method to OpenAI SDK client |
| src/memu/llm/http_client.py | Consolidates embedding support into HTTP LLM client |
| src/memu/database/postgres/schema.py | Defines SQLModel schemas with pgvector support |
| src/memu/database/postgres/postgres.py | Implements PostgreSQL storage backend with vector search |
| src/memu/database/postgres/migrations/versions/0001_create_memory_tables.py | Creates initial database tables with pgvector extension |
| src/memu/database/postgres/migrations/env.py | Configures Alembic migration environment |
| src/memu/database/inmemory/repo.py | Refactors in-memory store to support model customization |
| src/memu/database/factory.py | Factory for initializing storage backends based on config |
| src/memu/app/settings.py | Consolidates embedding into LLM config, adds storage/scope config |
| src/memu/app/service.py | Restructures service to use workflows and scope-based isolation |
| src/memu/app/retrieve.py | Extracts retrieve logic into mixin with workflow-based implementation |
| src/memu/app/memorize.py | Extracts memorize logic into mixin with workflow-based implementation |
| pyproject.toml | Adds PostgreSQL and SQLAlchemy dependencies |
src/memu/database/inmemory/models.py
Outdated
```python
base_model_cls = build_memory_model(user_model, BaseModel)
resource_model = build_memory_model(user_model, Resource)
memory_item_model = build_memory_model(user_model, MemoryItem)
memory_category_model = build_memory_model(user_model, MemoryCategory)
category_item_model = build_memory_model(user_model, CategoryItem)
return base_model_cls, resource_model, memory_item_model, memory_category_model, category_item_model
```
The variable name 'base_model_cls' is ambiguous as it suggests a base class when it's actually a merged model that combines user_model with BaseModel. Consider renaming to 'merged_base_model' or 'combined_base_model' to clarify its purpose.
Suggested change:

```python
merged_base_model = build_memory_model(user_model, BaseModel)
resource_model = build_memory_model(user_model, Resource)
memory_item_model = build_memory_model(user_model, MemoryItem)
memory_category_model = build_memory_model(user_model, MemoryCategory)
category_item_model = build_memory_model(user_model, CategoryItem)
return merged_base_model, resource_model, memory_item_model, memory_category_model, category_item_model
```
```python
*,
dsn: str,
scope_key: str,
ddl_mode: str = "create",  # kept for API compatibility
```
The `ddl_mode` parameter is marked as "kept for API compatibility" but isn't actually used in the implementation. Consider either removing it or documenting why it's preserved if future use is planned.
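If the parameter is kept, one option is to warn when a non-default value is passed so callers learn their intent is ignored. This is a minimal sketch with hypothetical fields, not the PR's actual `PostgresStore`:

```python
import warnings

class PostgresStore:
    def __init__(self, *, dsn: str, scope_key: str, ddl_mode: str = "create"):
        # ddl_mode is accepted for backward compatibility but currently unused;
        # warn so callers notice instead of having their intent silently dropped.
        if ddl_mode != "create":
            warnings.warn(
                "ddl_mode is ignored and kept only for API compatibility",
                DeprecationWarning,
                stacklevel=2,
            )
        self.dsn = dsn
        self.scope_key = scope_key
```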
```python
def _embedding_type(context) -> sa.types.TypeEngine:
    use_vector = context.config.attributes.get("use_vector", True)
```
The default value for `use_vector` is `True`, but the `PostgresStore` initializer derives it from `vector_provider == "pgvector"`. This could cause inconsistency if the default doesn't match the actual configuration. Consider aligning the default with the actual logic or making it required.
Suggested change:

```python
use_vector = bool(context.config.attributes.get("use_vector"))
```
```python
if isinstance(data, dict) and "default" not in data:
    data = dict(data)
    data["default"] = LLMConfig()
```
The `model_validator` mutates the input dict on lines 169-170. This can cause unexpected side effects if the same dict is reused. Consider always creating a new dict: `data = dict(data) if isinstance(data, dict) else {}` followed by `data.setdefault("default", LLMConfig())`.
Suggested change:

```python
if isinstance(data, dict):
    # Always work on a copy to avoid mutating the caller's dict.
    new_data = dict(data)
    new_data.setdefault("default", LLMConfig())
    return new_data
```
```python
) -> list[dict[str, Any]]:
    """Use LLM to rank memory items from relevant categories"""
    if not category_ids:
        print("[LLM Rank Items] No category_ids provided")
```
Using print() for logging instead of the logger instance. This should use logger.debug() or logger.warning() for consistency with the rest of the codebase.
Suggested change:

```python
logger.warning("[LLM Rank Items] No category_ids provided")
```
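A minimal sketch of the logger-based variant; the logger name and the `rank_items` signature are assumptions for illustration, not the PR's actual code:

```python
import logging

logger = logging.getLogger("memu.retrieve")

def rank_items(category_ids: list[str]) -> list[str]:
    if not category_ids:
        # logger.warning respects the service's logging configuration and
        # levels, unlike print(), which always writes to stdout.
        logger.warning("[LLM Rank Items] No category_ids provided")
        return []
    return sorted(category_ids)
```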
Co-authored-by: sairin1202 <952141617@qq.com>
No description provided.