
MAY4VFX/agent-memory-mcp



🧠 Agent Memory MCP

Persistent memory layer for Telegram AI agents

Connect your channels and chats, and your agent will never forget.


🤖 Telegram Bot · 📦 PyPI Package · 🐾 OpenClaw Skill · 🔌 API · 🔍 Search Architecture


❓ What is this?

Agent Memory MCP gives any AI agent access to your Telegram history without stuffing everything into the context window.

Here's how it works:

  1. You connect your Telegram account through @AgentMemoryBot
  2. You pick which sources to remember: channels, groups, entire folders, or specific topics
  3. The system indexes everything: downloads message history (configurable depth, from 1 month to years), extracts entities, builds a knowledge graph
  4. You plug in your AI agent via the MCP protocol (pip install agent-memory-mcp) or the REST API with an API key
  5. Your agent works with the full history: search, digests, decisions, context packages, all server-side with no context-window limits

Your data stays on the server. The agent only gets what it asks for: relevant search results, structured summaries, or extracted decisions, not the raw firehose.


🎯 What You Can Do

🔬 Deep Research

Search across years of channel history. Find specific facts, nuances, and posts by keywords. Analyze hundreds of posts on a topic with automatic map-reduce.

"Find all posts about wallet integration from the last year and extract key decisions"

"What did @alice say about the migration plan in March?"

"Analyze all discussions about performance issues across our dev channels"

The system doesn't just match keywords: it understands semantics, follows entity relationships in the knowledge graph, and can process 500+ posts through LLM-powered analysis in a single query.

📋 Smart Digest

You follow 50+ channels. There's no time to read everything. Your agent creates digests by day or week with:

  • 🏷️ Automatic topic clustering: posts grouped by theme, not just chronologically
  • 🔗 Links to original messages: every claim links back to the source post in Telegram
  • 📊 Engagement scoring: important discussions bubble up, noise gets filtered out
  • 📝 Concise summaries: LLM-generated, not just excerpts

Save hours every day. Get a structured overview of what matters across all your channels.

💬 Work Chat Summaries

Missed a 200-message discussion in your team group? The agent extracts:

  • ✅ Decisions: what was agreed upon
  • 📌 Action items: who committed to doing what
  • ❓ Open questions: what's still unresolved
  • 📅 Timeline: when things happened

"What decisions were made in the dev chat this week?"

"List all action items from yesterday's discussion about the release"

🤖 Agent Context Packs

Your agent answers customer questions in a support chat? Connect the team's knowledge-base channels, and the agent will know how similar issues were resolved before, what decisions were made, and what context exists around the project.

One tool call, get_agent_context, returns a complete context package: relevant messages, entity graph, related decisions, and community summaries. Everything an agent needs to give a grounded answer.

🔗 Multi-Agent Memory

Multiple agents share a single memory layer. A research agent indexes and searches, a writing agent uses the results, a monitoring agent tracks new decisions, all through the same MCP server or API.

Any agent that speaks MCP or HTTP can plug in: Claude Desktop, Cursor, custom Python bots, OpenAI-based agents alike.


βš™οΈ How It Works

1. Connect Telegram    →  Authorize via @AgentMemoryBot
2. Add sources         →  Channels, groups, folders, topics
3. System indexes      →  History download → entity extraction → graph building
4. Get API key         →  Create in bot, use in your agent
5. Agent queries       →  search / digest / decisions / context, all via API

The agent never sees raw messages. It gets processed, ranked, and structured results, with sources linked back to Telegram.


πŸ—οΈ Architecture

```mermaid
graph TB
    subgraph Sources["📱 Telegram Sources"]
        CH[Channels]
        GR[Groups & Topics]
        FL[Folders]
    end

    subgraph Ingestion["⚙️ Ingestion Pipeline"]
        COL[Telethon Collector] --> NF[Noise Filter]
        NF --> ME[Metadata & Threading]
        ME --> EE[Entity Extraction]
        EE --> EMB[BGE-M3 Embedding]
    end

    subgraph Memory["🧠 Memory Engine"]
        PG["PostgreSQL + ParadeDB\n📝 BM25 Full-Text"]
        MV["Milvus 2.5\n🧬 Dense + Sparse Vectors"]
        FK["FalkorDB\n🕸️ Knowledge Graph"]
    end

    subgraph Interface["🔌 Agent Interface"]
        MCP["MCP Server\n(Streamable HTTP)"]
        REST["REST API\n(FastAPI)"]
        PKG["pip package\n(agent-memory-mcp)"]
    end

    subgraph Agents["🤖 Your Agents"]
        CL[Claude Desktop / Cursor]
        CUSTOM[Custom Bots & Agents]
        ANY[Any MCP Client]
    end

    Sources --> COL
    EMB --> PG & MV & FK
    Memory --> MCP & REST
    PKG -.->|thin client| REST
    MCP --> CL
    MCP --> ANY
    REST --> CUSTOM

    subgraph TON["💎 TON Payments"]
        CR[Pay-per-query Points]
    end

    TON -.-> Interface
```

πŸ” Search Architecture

Not just "search over chats." Six layers of intelligent retrieval working together:

1. πŸ“ BM25 Full-Text Search β€” ParadeDB

Exact keyword matching inside PostgreSQL. Russian stemming support. When you need to find a specific word, name, or hashtag among thousands of messages.

Three-level fallback: ParadeDB BM25 → PostgreSQL tsvector → ILIKE. It always finds something.
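The fallback chain can be sketched as a simple dispatcher that tries each engine in order. This is a minimal sketch: the SQL strings, table name, and the ParadeDB `@@@` operator usage are illustrative stand-ins; the project's real schema is not shown in this README.

```python
# Hedged sketch of the three-level search fallback described above.
def fallback_search(run_sql, query: str):
    """run_sql(sql, params) -> list of rows; raises if the engine is unavailable."""
    attempts = [
        # 1. ParadeDB BM25 (pg_search's @@@ match operator)
        ("SELECT id, text FROM messages WHERE text @@@ %s LIMIT 20", (query,)),
        # 2. Plain PostgreSQL full-text search via tsvector
        ("SELECT id, text FROM messages WHERE to_tsvector('simple', text)"
         " @@ plainto_tsquery('simple', %s) LIMIT 20", (query,)),
        # 3. Last resort: substring match
        ("SELECT id, text FROM messages WHERE text ILIKE %s LIMIT 20",
         (f"%{query}%",)),
    ]
    for sql, params in attempts:
        try:
            rows = run_sql(sql, params)
            if rows:
                return rows
        except Exception:
            continue  # engine missing or errored -> try the next level
    return []

# Toy runner: ParadeDB raises, tsvector finds nothing, ILIKE hits.
def _demo_runner(sql, params):
    if "@@@" in sql:
        raise RuntimeError("ParadeDB not installed")
    if "tsvector" in sql:
        return []
    return [(1, "matched via ILIKE")]

result = fallback_search(_demo_runner, "wallet")
```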

2. 🧬 Vector Search (Milvus 2.5 + BGE-M3)

Semantic search by meaning, not just keywords. Finds relevant content even when words don't match.

  • 1024-dim dense vectors (BGE-M3 via Text Embeddings Inference)
  • Built-in BM25 sparse vectors in Milvus, so no separate index is needed
  • Hybrid mode: dense and sparse results merged via Reciprocal Rank Fusion (RRF)
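The RRF merge step can be sketched in a few lines: each document's fused score is the sum of 1/(k + rank) over every list it appears in. The constant k=60 is the common default from the RRF literature, assumed rather than confirmed for this project.

```python
# Hedged sketch of Reciprocal Rank Fusion over dense and sparse rankings.
def rrf_merge(rankings: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # A document ranked highly in either list accumulates a large score.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

dense = ["m12", "m7", "m3"]   # message ids ranked by dense-vector similarity
sparse = ["m7", "m12", "m9"]  # message ids ranked by sparse BM25 score
fused = rrf_merge([dense, sparse])
```

Documents appearing in both lists (m12, m7) float to the top of the fused ranking.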

3. 🕸️ Knowledge Graph (FalkorDB)

Entity-relationship graph built from your messages. Who is connected to whom? Which projects were mentioned together?

  • Entities and relations extracted by an LLM, stored in the graph
  • Community detection (Leiden algorithm): automatic grouping of related entities
  • Text2Cypher: ask a graph question in natural language, and the system generates a Cypher query
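To show the shape of a Text2Cypher translation: in the real system an LLM writes the Cypher, but a toy template illustrates what the generated query looks like. The :Entity label and relationship pattern below are illustrative assumptions, not the project's actual graph schema.

```python
# Hedged toy sketch of Text2Cypher: "who is connected to X?" -> Cypher string.
def text2cypher(entity_name: str) -> str:
    """Template-based stand-in for the LLM step; labels are hypothetical."""
    safe = entity_name.replace("'", "\\'")  # naive quoting for the demo
    return (
        "MATCH (a:Entity {name: '" + safe + "'})-[r]-(b:Entity) "
        "RETURN b.name, type(r) LIMIT 25"
    )

query = text2cypher("alice")
```

The resulting string would be executed against FalkorDB through its Cypher interface.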

4. ⚖️ Cross-Encoder Reranker

BGE-reranker-v2-m3 re-scores the combined results from all search engines. It sees the full (query, document) pair, which makes it much more precise than vector similarity alone.

5. 🔄 Corrective RAG (CRAG)

Self-correcting retrieval loop. If initial results score low on relevance:

  1. System detects low-quality results
  2. Query gets reformulated automatically
  3. New retrieval round runs
  4. Results merge with previous round

Up to 3 correction iterations in deep mode.
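The four numbered steps above can be sketched as a loop. This is a minimal sketch: `retrieve`, `score`, and `reformulate` stand in for the system's retrieval and LLM-based relevance/rewriting steps, and the 0.5 threshold is an arbitrary demo value.

```python
# Hedged sketch of the corrective-RAG loop: retrieve, score, reformulate, retry.
def crag_loop(retrieve, score, reformulate, query, threshold=0.5, max_iters=3):
    merged, q = [], query
    for _ in range(max_iters):  # up to 3 correction iterations, as in deep mode
        results = retrieve(q)
        merged.extend(r for r in results if r not in merged)  # merge rounds
        if results and score(results) >= threshold:
            break  # results are relevant enough -> stop correcting
        q = reformulate(q)  # low relevance -> rewrite the query and retry
    return merged

# Toy demo: the first query misses entirely, the reformulated one hits.
hits = crag_loop(
    retrieve=lambda q: ["doc-a", "doc-b"] if "wallet" in q else [],
    score=lambda rs: 0.9,
    reformulate=lambda q: q + " wallet",
    query="integration decisions",
)
```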

6. 🤖 Agentic RAG

The LLM itself decides what to search and how. ReAct pattern with 8 available tools:

| Tool | What it does |
|------|--------------|
| keyword_search | BM25 full-text search in PostgreSQL |
| semantic_search | Hybrid vector search in Milvus |
| graph_search | Entity-based retrieval from FalkorDB |
| graph_query | Natural language → Cypher → graph results |
| read_messages | Load full message text by ID |
| rerank_results | Cross-encoder re-ranking of accumulated results |
| analyze_large_set | Map-reduce over 500+ posts |
| get_domain_info | Domain metadata and schema |

Budget-constrained: fast (4 steps), balanced (8 steps), deep (15 steps).
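The budget-constrained tool loop can be sketched as follows. In the real system an LLM plays the policy role; here a toy policy stands in, and the tool dictionary holds stubs rather than the eight real tools.

```python
# Hedged sketch of a budget-constrained ReAct-style loop in the spirit of
# the agentic mode above. The policy and tools are toy stand-ins.
BUDGETS = {"fast": 4, "balanced": 8, "deep": 15}  # step limits from the README

def agent_loop(policy, tools, mode="balanced"):
    history = []
    for _ in range(BUDGETS[mode]):
        action, arg = policy(history)       # e.g. ("keyword_search", "wallet")
        if action == "answer":
            return arg, history             # the agent decided it is done
        observation = tools[action](arg)    # run the selected tool
        history.append((action, arg, observation))
    return None, history  # budget exhausted without an answer

# Toy policy: search once, then answer from the last observation.
def toy_policy(history):
    if not history:
        return ("keyword_search", "wallet")
    return ("answer", f"found {len(history[-1][2])} posts")

answer, trace = agent_loop(toy_policy, {"keyword_search": lambda q: ["p1", "p2"]})
```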

Three Pipeline Paths

Query arrives → Self-RAG gate → Route decision:

├─ ⚡ Overview     Pre-computed summary exists → instant answer
│
├─ 📊 Cascaded    BM25 finds 30+ posts → entity filter → map-reduce (up to 500 posts)
│
└─ 🔍 Standard    Parallel retrieve (BM25 + Vector + Graph + Hashtag)
                      → Merge & dedup → Rerank → CRAG loop
                      → Graph enrich → Generate answer
                        └─ 🤖 Agentic mode: LLM picks tools autonomously

🧩 Memory Primitives

| Tool | Points | Description |
|------|--------|-------------|
| 🔍 search_memory | 3 | Hybrid search with answer generation; scope by channel, folder, or all sources |
| 📋 get_digest | 25 | Period digest (1d / 3d / 7d / 30d) with topic clustering and source links |
| ✅ get_decisions | 12 | Extract decisions, action items, and open questions from conversations |
| 🤖 get_agent_context | 15 | Full context package: search + digest + graph + decisions in one call |
| 🔬 analysis/deep | 50 | Deep analysis with map-reduce over hundreds of posts |
| ➕ add_source | free | Connect a channel, group, or Telegram folder; set sync depth (1m–1y) |
| 📂 list_sources | free | List all connected sources with message counts and sync status |
| 📁 list_folders | free | List your Telegram folders and their channels |
| 🔗 check_telegram_auth | free | Check whether your Telegram account is connected |
| 📊 sync_status | free | Real-time ingestion progress for all sources |
| ❌ remove_source | free | Disconnect a source and stop syncing |

🚀 Quick Start

Method 1: MCP Package (Claude Desktop / Cursor)

pip install agent-memory-mcp

Add to your MCP config (claude_desktop_config.json or .cursor/mcp.json):

```json
{
  "mcpServers": {
    "agent-memory": {
      "command": "agent-memory-mcp",
      "env": {
        "AGENT_MEMORY_API_KEY": "amk_your_key_here",
        "AGENT_MEMORY_URL": "https://agent.ai-vfx.com"
      }
    }
  }
}
```

Get your API key from @AgentMemoryBot → 🔑 API Keys → Create.

Method 2: Streamable HTTP MCP

For MCP clients that support HTTP transport (Claude Code, etc.):

Endpoint: https://agent.ai-vfx.com/mcp
Auth: Bearer token (API key) or OAuth 2.0 with PKCE

Auto-discovery via /.well-known/oauth-authorization-server.

Method 3: REST API

# πŸ” Search memory
curl -X POST https://agent.ai-vfx.com/api/v1/memory/search \
  -H "Authorization: Bearer amk_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{"query": "what decisions were made about wallet integration?"}'

# πŸ“‹ Get weekly digest
curl -X POST https://agent.ai-vfx.com/api/v1/digest \
  -H "Authorization: Bearer amk_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{"scope": "@channel_name", "period": "7d"}'

# βœ… Get decisions
curl -X POST https://agent.ai-vfx.com/api/v1/decisions \
  -H "Authorization: Bearer amk_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{"scope": "@team_chat", "topic": "release planning"}'

# πŸ’° Check balance
curl https://agent.ai-vfx.com/api/v1/account/balance \
  -H "Authorization: Bearer amk_your_key_here"

🐾 OpenClaw Skill

Agent Memory MCP is available as a ClawHub skill: install it in one click and use Telegram memory directly from any OpenClaw-compatible agent.

openclaw install may4vfx/telegram-agent-memory

Or add it manually; the skill definition is in integrations/openclaw-skill/SKILL.md.

What the skill provides:

  • πŸ” Search across your Telegram channels and groups
  • πŸ“‹ Generate digests for any period
  • βœ… Extract decisions and action items
  • βž• Connect new sources on the fly

Self-onboarding: if you don't have an API key yet, the skill walks you through setup: open @AgentMemoryBot, connect Telegram, create a key, and you're ready.

🔗 Browse on ClawHub


🤖 Telegram Bot

@AgentMemoryBot is your control panel. It runs in forum mode with topic-based conversations.

| Feature | Description |
|---------|-------------|
| 🔑 API Keys | Create up to 20 keys, view prefixes, revoke anytime |
| 📱 Sources | Add channels / groups / folders, monitor sync progress and message counts |
| 💰 Balance | Check your points balance and view recent transactions |
| 💎 Top Up | Pay with TON directly from Tonkeeper or any TON wallet |
| 📊 Usage | Points spent by endpoint over the last 24 hours |
| ❓ Help | Quick-start guide and integration instructions |

💎 TON Integration

Points System

  • 🎁 Welcome bonus: 100 free points for every new user (~33 searches)
  • 💳 Pay-per-query: no subscriptions; pay only for what you use
  • 💰 Pricing: 1 point = $0.01; TON conversion uses the live CoinGecko rate

Top-Up Options

| Amount | Points (approx.) |
|--------|------------------|
| 0.5 TON | ~165 pts |
| 1 TON | ~330 pts |
| 3 TON | ~990 pts |
| 5 TON | ~1,650 pts |
| 10 TON | ~3,300 pts |

Points are calculated dynamically based on the real-time TON/USD rate.
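The conversion follows directly from 1 point = $0.01. A minimal sketch, using an illustrative rate of $3.30 per TON (the real service quotes the live CoinGecko rate at payment time):

```python
# Hedged sketch of the dynamic points calculation: points = TON x TON/USD rate / $0.01.
def ton_to_points(ton_amount: float, ton_usd_rate: float) -> int:
    usd = ton_amount * ton_usd_rate
    return round(usd / 0.01)  # 1 point = $0.01

# At an assumed rate of $3.30/TON, 1 TON buys ~330 points, matching the table above.
points = ton_to_points(1.0, 3.30)
```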

How Top-Up Works

  1. Tap 💎 Top Up in the bot → pick an amount → see the live TON rate and the exact points
  2. A deep link opens your TON wallet (Tonkeeper, etc.) with the amount and payment ID pre-filled
  3. Send the transaction
  4. The backend detects the payment via the TonCenter API (polling every 5 seconds)
  5. Points are added to your account instantly
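Step 4's confirmation loop can be sketched as follows. This is a toy sketch: `fetch_incoming` stands in for the TonCenter API call, and matching the payment ID through the transaction comment is an assumption about how the payment is identified.

```python
# Hedged sketch of payment confirmation by polling, as in step 4 above.
import time

def wait_for_payment(fetch_incoming, payment_id, poll_seconds=5, max_polls=60):
    """Poll for an incoming transaction carrying the expected payment ID."""
    for _ in range(max_polls):
        for tx in fetch_incoming():               # stand-in for the TonCenter call
            if tx.get("comment") == payment_id:   # match by payment ID (assumed field)
                return tx                         # caller credits points for tx["amount"]
        time.sleep(poll_seconds)
    return None  # timed out; payment not observed

# Toy demo with a stubbed transaction feed (poll delay disabled).
tx = wait_for_payment(
    fetch_incoming=lambda: [{"comment": "pay_42", "amount": 1.0}],
    payment_id="pay_42",
    poll_seconds=0,
)
```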

βš™οΈ Ingestion Pipeline

What happens when you add a source:

📥 Collection       Telethon multi-user collector, encrypted sessions in DB
       ↓
🧹 Noise Filter     Remove joins/leaves, service messages, empty content
       ↓
📝 Metadata          Language detection, content type, timestamps
       ↓
🧵 Threading         Group replies into conversation threads, link forum topics
       ↓
🏷️ Entity Extract    LLM-based extraction of entities and relationships (parallel batches)
       ↓
🧬 Embedding         BGE-M3 dense vectors (1024-dim) via Text Embeddings Inference
       ↓
💾 Storage           Parallel write → PostgreSQL + Milvus + FalkorDB
       ↓
🕸️ Communities       Leiden algorithm clusters related entities in the graph
       ↓
🔍 Schema Discovery  Auto-detect domain type, entity types, relation types

Supports channels, groups, supergroups with topics, and entire Telegram folders. Configurable sync depth: 1 month, 3 months, 6 months, 1 year.


📋 Digest Engine

How digests are generated:

Messages (up to 5000) → Engagement scoring (replies × 3 + content length)
    → Top-200 selection → BGE-M3 embedding → Cosine deduplication
    → Semantic clustering → Parallel LLM labeling (emoji + topic name)
    → MAP: summarize each cluster → REDUCE: synthesize final digest
    → Format with links to original Telegram messages

Every fact in the digest links back to the original post; click through to see the full context in Telegram.
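The scoring and top-k step at the front of that pipeline can be sketched directly from the formula above (replies × 3 + content length); the later stages (embedding, clustering, map-reduce) are omitted here, and the message fields are illustrative.

```python
# Hedged sketch of the digest pipeline's engagement scoring and top-k selection.
def engagement_score(msg: dict) -> int:
    """Score = replies x 3 + content length, per the pipeline above."""
    return msg["replies"] * 3 + len(msg["text"])

def select_top(messages: list[dict], k: int = 200) -> list[dict]:
    return sorted(messages, key=engagement_score, reverse=True)[:k]

msgs = [
    {"id": 1, "replies": 0, "text": "gm"},                        # score 2
    {"id": 2, "replies": 10, "text": "release plan discussion"},  # score 53
    {"id": 3, "replies": 2, "text": "short note"},                # score 16
]
top = select_top(msgs, k=2)
```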


🔒 Privacy & Future

Current state: LLM inference runs through a LiteLLM proxy (supports OpenAI, Anthropic, DeepSeek). Embeddings and reranking run on dedicated GPU servers with open-source models (BGE-M3, BGE-reranker).

Where we're heading: Cocoon integration.

Cocoon (Confidential Compute Open Network) is Telegram's native GPU compute layer built on TON, offering confidential AI inference powered by decentralized hardware. This is a natural next step for Agent Memory MCP:

  • 🧠 Confidential LLM inference: move all AI processing (extraction, reasoning, answer generation) to Cocoon's confidential compute. Your data never leaves the encrypted enclave; not even the GPU owner can see it
  • ⛏️ Decentralized GPU power: no dependency on centralized API providers. Cocoon GPU miners earn TON while powering your agent's memory pipeline
  • 🔐 Full Telegram-native stack: data flows entirely within the Telegram + TON ecosystem (Telegram messages → Cocoon inference → TON payments), with zero external dependencies
  • 🏠 Self-hosted deployment: the entire stack (embedding, LLM, storage, graph) is designed to run on your own infrastructure or on Cocoon's network

All components are modular and replaceable. Swap out any layer (LLM provider, embedding model, storage engine) without changing the agent interface. Today it works with any LLM via LiteLLM; tomorrow it runs natively on Cocoon.


πŸ› οΈ Tech Stack

| Layer | Technology |
|-------|------------|
| Runtime | Python 3.12, FastAPI, uvicorn |
| Bot | aiogram 3.x (forum topic mode) |
| Collector | Telethon (multi-user, encrypted sessions) |
| Full-Text Search | PostgreSQL + ParadeDB (BM25) |
| Vector Search | Milvus 2.5 (dense + sparse hybrid, RRF) |
| Knowledge Graph | FalkorDB (Cypher, Leiden community detection) |
| Embeddings | BGE-M3 via Text Embeddings Inference |
| Reranker | BGE-reranker-v2-m3 via Text Embeddings Inference |
| LLM | LiteLLM proxy (3 tiers: extraction / reasoning / answer) |
| MCP | FastMCP (Streamable HTTP + standalone pip package) |
| Payments | TON via TonCenter API |
| Observability | Langfuse (LLM tracing), structlog |
| Deployment | Docker, Dokploy (auto-deploy on push) |

📄 License

GPL-3.0; see LICENSE
