Overview

VCPToolBox is a model-agnostic AI middleware server that transforms standard OpenAI-compatible API requests into context-enriched, tool-enabled interactions. It implements a three-pillar architecture—AI inference, tool execution, and persistent memory—to enable autonomous agent behavior, sophisticated RAG-based knowledge retrieval, and distributed plugin orchestration.

Core Capabilities:

Tool Orchestration: 300+ official plugins with 6 execution protocols (static, preprocessor, synchronous, asynchronous, service, hybrid)
Memory System: Rust-powered vector database (Vexus-Lite) with TagMemo V5 algorithm for atomic-precision retrieval
Distributed Architecture: Star-topology plugin distribution with WebSocket-based RPC and transparent file proxying
Agent Autonomy: Self-heartbeat mechanisms, inter-agent communication, calendar-based scheduling, and dream introspection
Hot Reload: Live configuration updates without server restarts

Scope: This page covers high-level architecture, request flow, and subsystem interactions. For implementation details:

Installation: Quick Start Guide
Plugin development: Plugin System and Developing Plugins
RAG internals: Memory and Knowledge System
Agent setup: Agent System
API contracts: API Reference

Sources: README.md:1-57, server.js:1-100

System Architecture

VCPToolBox implements a layered architecture centered on three core pillars: AI inference, tool execution, and persistent memory. The system receives HTTP requests at /v1/chat/completions, processes them through multiple transformation stages, and orchestrates interactions between AI models, plugins, and knowledge bases.
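
As a minimal sketch of what a client call might look like, the helper below builds an OpenAI-style request for the /v1/chat/completions endpoint. Only the endpoint path and the Bearer scheme come from this page; the base URL, port, model name, and key are placeholders, and `buildChatRequest` is a hypothetical helper rather than part of VCPToolBox.

```javascript
// Hypothetical helper: builds the URL and fetch() options for a chat request.
// The base URL, model name, and key passed in below are placeholders.
function buildChatRequest(baseUrl, apiKey, messages, { model = 'example-model', stream = false } = {}) {
  return {
    url: `${baseUrl.replace(/\/+$/, '')}/v1/chat/completions`,
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, stream, messages }),
    },
  };
}

const request = buildChatRequest('http://localhost:3000/', 'your-key', [
  { role: 'user', content: 'Hello' },
]);
// With a running server: const res = await fetch(request.url, request.init);
```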

High-Level Component Diagram

Analysis: This diagram shows VCPToolBox's layered architecture. The Core Server Layer handles authentication via Bearer tokens and IP blacklisting (server.js:292-303), routes requests through ChatCompletionHandler (modules/chatCompletionHandler.js:242-253), and orchestrates the PluginManager (Plugin.js). The Plugin Ecosystem provides six plugin types with different execution models. The Memory & Intelligence Layer implements TagMemo Wave v5 RAG (Plugin/RAGDiaryPlugin/), meta-thinking chains, and dream introspection. The Storage Layer uses SQLite for structured data (VectorStore/knowledge_base.sqlite), USearch for vector indexes (*.usearch files), and the file system for documents. All components interact with External AI Services for inference and embeddings.

Sources: server.js:1-303, modules/chatCompletionHandler.js:242-253, Plugin.js, KnowledgeBaseManager.js, README.md:42-310, Diagram 1 from system overview

The "Iron Triangle": AI-Tools-Memory

VCPToolBox's architecture centers on three synergistic pillars:

1. AI Inference Layer

The ChatCompletionHandler class (modules/chatCompletionHandler.js:242) manages all backend AI interactions through a unified interface supporting OpenAI, Claude, Gemini, and custom endpoints.

Key Components:

fetchWithRetry() (modules/chatCompletionHandler.js:99-144) - Exponential backoff retry logic for 500/503/429 status codes, configured via apiRetries and apiRetryDelay
ModelRedirectHandler (modelRedirectHandler.js:1-50) - Client-to-backend model substitution via redirect maps
StreamHandler / NonStreamHandler (modules/handlers/streamHandler.js, modules/handlers/nonStreamHandler.js) - SSE streaming vs JSON response handling
ChinaModel1 Thinking Control (modules/chatCompletionHandler.js:337-352) - Automatic thinking: {type: "enabled"} injection for GLM/Qwen/DeepSeek models when ChinaModel1Cot=true
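
The retry behaviour described for fetchWithRetry() can be sketched as follows. This is an illustration under stated assumptions, not the code in chatCompletionHandler.js: the name `fetchWithRetrySketch`, the injected `doFetch` and `sleep` callbacks, and the exact doubling schedule are hypothetical. The defaults of 3 retries and a 200 ms base delay mirror the values quoted in the Request Lifecycle Summary.

```javascript
// Status codes this page lists as retryable.
const RETRYABLE_STATUS = new Set([500, 503, 429]);

// Delay before retry number `attempt` (0-based): base, 2*base, 4*base, ...
function backoffDelay(attempt, baseDelayMs) {
  return baseDelayMs * 2 ** attempt;
}

// `doFetch` and `sleep` are injected so the sketch runs without a network;
// in a server they could be the global fetch() and a setTimeout-based wait.
async function fetchWithRetrySketch(doFetch, { apiRetries = 3, apiRetryDelay = 200, sleep } = {}) {
  const wait = sleep ?? ((ms) => new Promise((resolve) => setTimeout(resolve, ms)));
  for (let attempt = 0; ; attempt++) {
    const response = await doFetch();
    const canRetry = RETRYABLE_STATUS.has(response.status) && attempt < apiRetries;
    if (!canRetry) return response; // success, non-retryable status, or retries exhausted
    await wait(backoffDelay(attempt, apiRetryDelay));
  }
}
```

The sketch only retries on HTTP status codes; errors thrown by `doFetch` (for example a dropped connection) propagate to the caller unchanged.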

2. Tool Execution Layer

Plugins extend AI capabilities through six execution protocols: static, preprocessor, synchronous, asynchronous, service, and hybrid. The system parses <<<[TOOL_REQUEST]>>> delimiters in AI responses and orchestrates plugin invocations.

Key Components:

PluginManager (Plugin.js) - Plugin discovery via plugin-manifest.json, lifecycle management, and routing to local vs distributed executors
ToolExecutor (modules/vcpLoop/toolExecutor.js:1-100) - Tool call validation, parameter parsing, and result injection into message history
ToolCallParser (modules/vcpLoop/toolCallParser.js) - Robust parsing of VCP protocol with 「始」...「末」 delimiters and fallback to <start>...</end> syntax
WebSocketServer (WebSocketServer.js) - register_tools, execute_tool, and internal_request_file message handlers for distributed plugins
VCPToolCode System (modules/captchaDecoder.js:24-35) - Optional 6-digit auth code validation for high-privilege tools
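
To make the delimiter-based protocol concrete, here is a minimal parsing sketch. It is not the ToolCallParser implementation: the closing marker `<<<[END_TOOL_REQUEST]>>>`, the `tool_name` key, and the example plugin name are assumptions made for illustration, and the `<start>...</end>` fallback is not covered.

```javascript
// Matches one tool request block. The closing marker is an assumption;
// this page only names the opening <<<[TOOL_REQUEST]>>> delimiter.
const BLOCK_RE = /<<<\[TOOL_REQUEST\]>>>([\s\S]*?)<<<\[END_TOOL_REQUEST\]>>>/g;
// Matches key:「始」value「末」 pairs; values may span multiple lines.
const PARAM_RE = /([A-Za-z_]\w*)\s*:\s*「始」([\s\S]*?)「末」/g;

function parseToolRequests(text) {
  const calls = [];
  for (const block of text.matchAll(BLOCK_RE)) {
    const params = {};
    for (const [, key, value] of block[1].matchAll(PARAM_RE)) {
      params[key] = value.trim();
    }
    // `tool_name` as the key naming the plugin is assumed for illustration.
    const { tool_name: name, ...args } = params;
    if (name) calls.push({ name, args });
  }
  return calls;
}

const reply = [
  'Let me compute that.',
  '<<<[TOOL_REQUEST]>>>',
  'tool_name:「始」Calculator「末」,',
  'expression:「始」1 + 1「末」',
  '<<<[END_TOOL_REQUEST]>>>',
].join('\n');

const parsedCalls = parseToolRequests(reply);
```

Using paired delimiters around each value means parameter text can contain commas, colons, or newlines without escaping, which is presumably why the protocol favours them over JSON in free-form model output.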

3. Persistent Memory Layer

The Rust-powered Vexus-Lite engine (rust-vexus-lite/) provides atomic-precision semantic search through the TagMemo Wave v5 algorithm.

Key Components:

RAGDiaryPlugin (Plugin/RAGDiaryPlugin/) - Message preprocessor that injects ... blocks into system/assistant messages based on [[dbName::modifiers]] syntax
KnowledgeBaseManager (KnowledgeBaseManager.js) - Unified interface to SQLite (metadata) and USearch (vectors) with automatic index persistence
TagMemo Wave v5 - Four-phase pipeline:
1. EPA Module - Embedding Projection Analysis for logic depth, worldview gating, and resonance detection
2. Residual Pyramid - Gram-Schmidt orthogonalization for multi-level semantic decomposition
3. Shotgun Query - Fragmented query vectors for saturation-style retrieval
4. PSR (Polarization Semantic Rudder) - SVD-based topic modeling for deduplication and dialectic knowledge injection
MetaThinkingManager (modules/metaThinkingManager.js) - Recursive reasoning chains with theme-based auto-routing and vector fusion
AgentDream (Plugin/AgentDream/) - Nighttime memory introspection with random seed selection and associative linking
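
The Residual Pyramid phase is described as relying on Gram-Schmidt orthogonalization. The sketch below shows only that underlying technique on plain arrays: it orthonormalizes a list of direction vectors in order and records how much of a query vector each level explains. It makes no claim about how the Rust module is actually structured; `residualLevels` and its return shape are hypothetical.

```javascript
const dot = (a, b) => a.reduce((sum, x, i) => sum + x * b[i], 0);
const norm = (a) => Math.sqrt(dot(a, a));
const axpy = (a, s, b) => a.map((x, i) => x + s * b[i]); // a + s*b

// Orthonormalize `directions` in order (modified Gram-Schmidt) and peel the
// query apart level by level, recording what remains after each level.
function residualLevels(query, directions, eps = 1e-12) {
  const basis = [];
  const levels = [];
  let residual = query.slice();
  for (const dir of directions) {
    // Remove components along the basis vectors found so far.
    let v = dir.slice();
    for (const e of basis) v = axpy(v, -dot(v, e), e);
    const len = norm(v);
    if (len < eps) continue; // linearly dependent direction: skip it
    const e = v.map((x) => x / len);
    basis.push(e);
    const coefficient = dot(residual, e);
    residual = axpy(residual, -coefficient, e);
    levels.push({ coefficient, residualNorm: norm(residual) });
  }
  return { levels, residual };
}

const { levels, residual } = residualLevels([3, 4], [[1, 0], [1, 1]]);
```

In this toy case the first direction explains a component of 3 and leaves a residual of length 4, and the second (after orthogonalization) explains the rest, so the final residual is the zero vector.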

Sources: README.md:42-389, server.js:72-95, modules/chatCompletionHandler.js:242-301, Plugin.js, KnowledgeBaseManager.js, dailynote.md:1-120

Request Processing Flow

The following diagram details the transformation pipeline from client request to AI response, mapping each stage to specific code entities.

Sources: modules/chatCompletionHandler.js:254-700, server.js:500-600, modules/messageProcessor.js:14-52, Diagram 4 from high-level system overview

Key Subsystems

Configuration System

VCPToolBox implements a hierarchical configuration system. Most files hot-reload via chokidar file watching; the global config.env requires a manual restart:

Configuration File	Purpose	Reload Mechanism
`config.env`	Global settings, API keys, ports, model routing	Manual restart (`pm2 restart`)
`Plugin/*/config.env`	Plugin-specific overrides merged with global config	Automatic via `PluginManager.reloadPlugins()`
`rag_params.json`	TagMemo V5 magic numbers (k, tagWeight, EPA thresholds)	Automatic via `RAGDiaryPlugin` internal watcher
`agent_map.json`	Agent alias → Agent/*.txt file mappings	Automatic via `agentManager.loadAgentMap()` (modules/agentManager.js:36-50)
`preprocessor_order.json`	Execution sequence for message preprocessors	Automatic via `PluginManager` manifest watcher
`TVStxt/*.txt`	Advanced variable definitions (`{{Tar}}`, `{{Var}}`)	Automatic via `tvsManager.loadTVSFile()` (modules/tvsManager.js)
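
The "plugin-specific overrides merged with global config" row can be illustrated with a small sketch. It assumes simple KEY=VALUE files and that a plugin value wins over a global value with the same key; the function names and the example keys are hypothetical, and quoting rules handled by real dotenv parsers are ignored.

```javascript
// Parse KEY=VALUE lines, ignoring blank lines and # comments.
function parseEnv(text) {
  const result = {};
  for (const line of text.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (!trimmed || trimmed.startsWith('#')) continue;
    const eq = trimmed.indexOf('=');
    if (eq <= 0) continue; // not a KEY=VALUE line
    result[trimmed.slice(0, eq).trim()] = trimmed.slice(eq + 1).trim();
  }
  return result;
}

// Plugin values win over global values with the same key.
function effectiveConfig(globalEnvText, pluginEnvText) {
  return { ...parseEnv(globalEnvText), ...parseEnv(pluginEnvText) };
}

// Example keys are placeholders, not documented VCPToolBox settings.
const merged = effectiveConfig(
  '# global\nPORT=3000\nDebugMode=false',
  'DebugMode=true\nPluginTimeout=30',
);
```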

Hot Reload Flow:

chokidar detects plugin-manifest.json changes (Plugin.js)
PluginManager.reloadPlugins() calls shutdown() on all local plugins (preserves distributed plugins)
Clears internal maps: plugins, staticPlugins, messagePreprocessors, services
Re-scans Plugin/*/ directories and reinitializes plugins
Broadcasts plugins-reloaded event via WebSocket to connected clients (WebSocketServer.js)
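
The reload steps above can be sketched as a small registry class. This is an illustration only: `PluginRegistrySketch`, the single `plugins` map (the real manager is described as keeping several), the injected `scanLocalPlugins` and `broadcast` callbacks, and the event payload shape are all assumptions. In a real server the `reloadPlugins()` call would be triggered from a file-watcher callback.

```javascript
class PluginRegistrySketch {
  constructor({ scanLocalPlugins, broadcast }) {
    this.scanLocalPlugins = scanLocalPlugins; // returns plugin objects found on disk
    this.broadcast = broadcast;               // e.g. sends an event over WebSocket
    this.plugins = new Map();
  }

  register(plugin) {
    this.plugins.set(plugin.name, plugin);
  }

  reloadPlugins() {
    // 1. Shut down and drop local plugins; distributed ones stay registered.
    for (const [name, plugin] of this.plugins) {
      if (plugin.isDistributed) continue;
      plugin.shutdown?.();
      this.plugins.delete(name);
    }
    // 2. Re-scan and re-register local plugins.
    for (const plugin of this.scanLocalPlugins()) this.register(plugin);
    // 3. Tell connected clients that the plugin set changed.
    this.broadcast({ type: 'plugins-reloaded' });
  }
}

const events = [];
let shutdowns = 0;
const registry = new PluginRegistrySketch({
  scanLocalPlugins: () => [{ name: 'LocalB', isDistributed: false }],
  broadcast: (event) => events.push(event),
});
registry.register({ name: 'LocalA', isDistributed: false, shutdown: () => { shutdowns++; } });
registry.register({ name: 'RemoteX', isDistributed: true });
registry.reloadPlugins();
```

After the reload, `LocalA` has been shut down and replaced by whatever the scan returned, while `RemoteX` is untouched because its lifetime is tied to its WebSocket connection rather than to the local plugin directory.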

Sources: config.env.example:1-465, server.js:67-228, modules/agentManager.js:1-150, modules/tvsManager.js, Plugin.js, Diagram 6 from system overview

Authentication & Security

Four security layers protect different endpoints:

Bearer Token Authentication - Main API (/v1/*)
- Validated against process.env.Key in auth middleware (server.js:466-503)
- Triggers handleApiError() on 401 responses (server.js:261-289)
Basic Authentication - Admin Panel (/AdminPanel, /admin_api)
- Credentials: process.env.AdminUsername / AdminPassword (server.js:111-112)
- Cookie-based session management (server.js:376-398)
- Rate limiting: 5 failed attempts within 15 minutes → 30-minute IP block (server.js:86-422)
- Public paths exempted: /AdminPanel/login.html, /AdminPanel/VCPLogo2.png, etc. (server.js:323-337)
IP Blacklist System - Automatic threat mitigation
- Persistent storage: ip_blacklist.json (server.js:81-84)
- Auto-ban trigger: 5 consecutive API errors from same IP (server.js:261-289)
- Localhost exemption: 127.0.0.1 and ::1 never auto-banned (server.js:268-271)
- Manual management via /admin_api/ip-blacklist endpoints
VCPToolCode System - Per-tool authorization
- 6-digit codes encrypted in Plugin/UserAuth/code.bin (modules/captchaDecoder.js:24-35)
- Decryption via AES-256-CBC with HMAC-SHA256 integrity check (modules/captchaDecoder.js:5-50)
- Validation in ToolExecutor if VCPToolCode=true (modules/vcpLoop/toolExecutor.js)
- Admin override via plugin-manifest.json requireAdmin flag
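
Two of the rules above, the auto-ban after consecutive API errors and the temporary block after failed admin logins, can be sketched in memory as follows. The class names are hypothetical, persistence to ip_blacklist.json is omitted, and the idea that a successful request resets the error streak is an assumption drawn from the word "consecutive", not from the server code.

```javascript
const LOCALHOST = new Set(['127.0.0.1', '::1']);

// Auto-ban after N consecutive API errors from one IP (localhost exempt).
class ErrorBanTracker {
  constructor(threshold = 5) {
    this.threshold = threshold;
    this.errorCounts = new Map();
    this.blacklist = new Set();
  }
  recordError(ip) {
    if (LOCALHOST.has(ip)) return false;
    const count = (this.errorCounts.get(ip) ?? 0) + 1;
    this.errorCounts.set(ip, count);
    if (count >= this.threshold) this.blacklist.add(ip);
    return this.blacklist.has(ip);
  }
  recordSuccess(ip) {
    this.errorCounts.delete(ip); // assumed: a success resets the streak
  }
}

// Temporary block after too many failed logins inside a sliding window.
class LoginLimiter {
  constructor({ maxFailures = 5, windowMs = 15 * 60_000, blockMs = 30 * 60_000 } = {}) {
    Object.assign(this, { maxFailures, windowMs, blockMs });
    this.failures = new Map();     // ip -> timestamps of recent failures
    this.blockedUntil = new Map(); // ip -> time the block expires
  }
  isBlocked(ip, now = Date.now()) {
    return (this.blockedUntil.get(ip) ?? 0) > now;
  }
  recordFailure(ip, now = Date.now()) {
    const recent = (this.failures.get(ip) ?? []).filter((t) => now - t < this.windowMs);
    recent.push(now);
    this.failures.set(ip, recent);
    if (recent.length >= this.maxFailures) {
      this.blockedUntil.set(ip, now + this.blockMs);
      this.failures.delete(ip);
    }
  }
}
```

Passing `now` explicitly keeps the limiter deterministic in tests; production code would rely on the `Date.now()` default.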

IP Tracking for Distributed Plugins:

Middleware logs all POST request IPs (server.js:240-258)
WebSocketServer.findServerByIp() maps IPs to known distributed servers (WebSocketServer.js)
FileFetcherServer uses IP tracking for transparent file proxying (FileFetcherServer.js)

Sources: server.js:78-503, modules/captchaDecoder.js:1-50, modules/vcpLoop/toolExecutor.js, README.md:1195-1213

Admin Panel

A web-based management interface served at /AdminPanel provides:

System Monitor - CPU/memory metrics, PM2 status (AdminPanel/system-monitor.html)
Config Editor - Live editing of config.env with comment preservation (admin_api/config)
Plugin Manager - Enable/disable, description updates (admin_api/plugins)
RAG Tuner - Dynamic parameter adjustment for TagMemo V5 (admin_api/rag-params)
Knowledge Browser - File manager for dailynote/ directory (admin_api/knowledge-browser)

The panel uses Basic Auth with cookie-based session management (server.js:316-453).

Sources: server.js:316-459, README.md:1018-1055, Diagram 5 from high-level system overview

Technology Stack

Core Runtime

Node.js (v18+) - Event-driven application server with async I/O
Express 4.x - HTTP framework with middleware pipeline and SSE streaming (server.js:231)
PM2 - Process manager with cluster mode, log rotation, and auto-restart (Dockerfile:40-45, ecosystem.config.js)

Data Layer

Structured Storage:

SQLite 3 with WAL mode - Transactional storage for chunks, tags, files, co-occurrence matrices
- Database file: VectorStore/knowledge_base.sqlite (config.env.example:122)
- Schema: files, chunks, tags, knowledge_chunks tables (KnowledgeBaseManager.js)
- ACID guarantees for diary operations (dailynote.md:9)

Vector Storage:

rust-vexus-lite (Rust + Node.js N-API bindings) - High-performance vector engine
- USearch library - SIMD-optimized HNSW graph indexing (cosine similarity)
- Index files: VectorStore/index_global_tag.usearch, index_diary_*.usearch (config.env.example:122-124)
- EPA, Residual Pyramid, PSR modules compiled from Rust (rust-vexus-lite/src/)
- Lazy loading with 2-minute persistence delay (config.env.example:149-155)

Full-Text Search:

Tantivy (Rust) - Inverted index for DeepMemo chat history retrieval
- jieba-rs tokenizer for Chinese segmentation (dailynote.md:40-41)
- In-memory indexing for each conversation file (Plugin/DeepMemo/)

Communication Protocols

Protocol	Use Case	Implementation
HTTP/1.1	Client requests, backend AI API calls	`express.json()` body parser (server.js:236)
SSE (Server-Sent Events)	Streaming AI responses to clients	`res.write()` in `StreamHandler` (modules/handlers/streamHandler.js)
WebSocket	Distributed plugin RPC, admin panel live updates	`ws` library in `WebSocketServer` (WebSocketServer.js:1-50)
stdio (JSON-RPC)	Sandboxed plugin IPC via `child_process.spawn()`	`PluginManager._executeStdioPlugin()` (Plugin.js)
HTTP/2	Optional for backend AI APIs supporting multiplexing	Via `fetch()` API
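
For the SSE row, the framing written through `res.write()` can be sketched as below. The `data: ...` line followed by a blank line is standard Server-Sent Events framing; the chunk shape and the `[DONE]` sentinel follow the usual OpenAI-compatible streaming convention and are assumptions about what StreamHandler emits. `sseData`, `streamPieces`, and `fakeRes` are hypothetical names.

```javascript
// Frame one JSON payload as a Server-Sent Events message.
function sseData(payload) {
  return `data: ${JSON.stringify(payload)}\n\n`;
}

// Write streamed text pieces to `res` (anything with write() and end()).
function streamPieces(res, pieces) {
  for (const content of pieces) {
    res.write(sseData({ choices: [{ index: 0, delta: { content } }] }));
  }
  res.write('data: [DONE]\n\n'); // assumed end-of-stream sentinel
  res.end();
}

// Stand-in for an Express response so the sketch runs without a server.
const fakeRes = { out: '', write(s) { this.out += s; }, end() {} };
streamPieces(fakeRes, ['Hel', 'lo']);
```

In an Express handler the response would also need a `Content-Type: text/event-stream` header set before the first write so that clients treat the body as an event stream.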

Monitoring & Observability

RotatingLogger (modules/logger.js:14-130)
- Size-based rotation (50MB per file)
- Daily archiving to DebugLog/archive/YYYY-MM-DD/
- Separate streams: server.log, error.log, debug.log
- Console override for unified output (modules/logger.js:132-150)
chokidar - File system event monitoring
- Plugin manifest hot-reload (Plugin.js)
- Agent configuration updates (modules/agentManager.js:119-145)
- RAG parameter tuning (Plugin/RAGDiaryPlugin/)
WebSocket Push Notifications
- VCPLog plugin streams to admin panel (vcpInfoHandler.js)
- VCPInfo broadcasts system events to clients (vcpInfoHandler.js)
- plugins-reloaded event on hot-reload (WebSocketServer.js)

Docker Infrastructure

Multi-stage build: dependencies → production (Dockerfile:1-50)
Volume mounts for data persistence (docker-compose.yml:10-20)
PM2 entrypoint for zero-downtime reloads (Dockerfile:45)

Sources: server.js:1-237, package.json, Dockerfile:1-50, docker-compose.yml, modules/logger.js:1-150, KnowledgeBaseManager.js:1-100, WebSocketServer.js:1-50, README.md:359-707, dailynote.md:9-43

Request Lifecycle Summary

A typical /v1/chat/completions request flows through these stages:

Phase	Code Entity	Purpose
1. Authentication	`app.use()` middleware (server.js:292-303)	Validate `Authorization: Bearer Key` against `process.env.Key`
2. Route Dispatch	`app.post('/v1/chat/completions')` (server.js)	Extract `requestId`, create `AbortController`, register in `activeRequests` Map
3. Context Control	`contextManager.pruneMessages()` (modules/contextManager.js)	Trim message array to `contextTokenLimit` token budget
4. Model Redirection	`modelRedirectHandler.redirectModelForBackend()` (modelRedirectHandler.js)	Map client model IDs to backend model IDs per redirect config
5. Role Division (Initial)	`roleDivider.process(..., skipCount=1)` (modules/roleDivider.js:47)	Split messages by `<<<[ROLE_DIVIDE_XXX]>>>` tags (skips first system message)
6. VCPTavern Priority	`pluginManager.executeMessagePreprocessor('VCPTavern')` (modules/chatCompletionHandler.js:398-406)	Inject preset blocks from `VCPTavern/presets/*.json`
7. Variable Replacement	`messageProcessor.resolveAllVariables()` (modules/messageProcessor.js:14-239)	Resolve placeholders in priority order: `{{Agent}}` → `{{Tar}}` → `{{Var*}}` → `{{VCPPluginName}}`
8. Preprocessor Pipeline	Sequential `executeMessagePreprocessor()` calls (modules/chatCompletionHandler.js:443-478)	Run `RAGDiaryPlugin`, `MultiModalProcessor`, etc. per `preprocessor_order.json`
9. Media Processing	`ImageProcessor` cache + multimodal API (modules/chatCompletionHandler.js:479-523)	Convert `image_url` to base64, cache in `imageCache/`, or call `MultiModalModel` API
10. AI Invocation	`fetchWithRetry(API_URL, processedBody)` (modules/chatCompletionHandler.js:99-144)	POST to backend with retry logic (3 retries, 200ms delay)
11. Tool Loop	`ToolCallParser` + `ToolExecutor` (modules/chatCompletionHandler.js:608-748)	Detect `<<<[TOOL_REQUEST]>>>`, execute plugin, inject result, recurse (max 5 loops)
12. RAG Refresh	`_refreshRagBlocksIfNeeded()` (modules/chatCompletionHandler.js:146-240)	Update `<!-- VCP_RAG_BLOCK_START -->` blocks if `RAGMemoRefresh=true`
13. Response Handling	`StreamHandler` or `NonStreamHandler` (modules/handlers/)	Stream SSE chunks or return JSON based on `req.body.stream` flag

For detailed sequence diagrams and stage-by-stage analysis, see Request Processing Pipeline.
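
The tool loop in phase 11 (detect a tool request, execute it, inject the result, and ask the model again, up to five rounds) can be sketched as a bounded loop. Everything here is a hedged illustration: `runToolLoop`, the three injected callbacks, and the choice to inject tool output as a `user` message are assumptions, not the handler's actual code.

```javascript
// All three callbacks are hypothetical stand-ins injected for illustration:
//   callModel(messages)  -> assistant text
//   parseToolCalls(text) -> array of tool calls (empty when none)
//   executeTool(call)    -> result string
async function runToolLoop(messages, { callModel, parseToolCalls, executeTool, maxLoops = 5 }) {
  const history = [...messages];
  let reply = await callModel(history);
  for (let loop = 0; loop < maxLoops; loop++) {
    const calls = parseToolCalls(reply);
    if (calls.length === 0) break; // no tool request: this reply is final
    const results = [];
    for (const call of calls) results.push(await executeTool(call));
    // Inject the assistant turn and the tool output, then ask the model again.
    history.push({ role: 'assistant', content: reply });
    history.push({ role: 'user', content: results.join('\n') });
    reply = await callModel(history);
  }
  return { reply, history };
}
```

The loop bound guards against a model that keeps requesting tools indefinitely: after `maxLoops` rounds the latest reply is returned as-is, even if it still contains a tool request.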

Sources: modules/chatCompletionHandler.js:254-800, server.js:500-700, modules/messageProcessor.js:14-239, modules/contextManager.js, modules/roleDivider.js:47-280, Diagram 2 from system overview

Distributed Architecture

VCPToolBox supports horizontal scaling through a star-topology network:

Key Mechanisms:

Plugin Registration - Distributed nodes send register_tools with plugin manifests (WebSocketServer.js)
Tool Routing - PluginManager checks isDistributed flag and routes accordingly (Plugin.js)
File Fetching - FileFetcherServer retrieves remote files via WebSocket for tool parameters (FileFetcherServer.js)
Automatic Cleanup - Disconnected nodes trigger plugin deregistration (WebSocketServer.js)
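
Registration, routing, and cleanup can be sketched with a small in-memory registry. The message shape (`{ type, data: { tools } }`), the node identifiers, and the class itself are assumptions for illustration; only the `register_tools` message name and the `isDistributed` flag come from this page.

```javascript
class DistributedToolRegistry {
  constructor() {
    this.tools = new Map(); // tool name -> { nodeId, manifest, isDistributed: true }
  }

  // Handle an incoming message from node `nodeId` (message shape is assumed).
  handleMessage(nodeId, message) {
    if (message.type === 'register_tools') {
      for (const manifest of message.data.tools) {
        this.tools.set(manifest.name, { nodeId, manifest, isDistributed: true });
      }
    }
  }

  // Decide where a tool call should run.
  route(toolName) {
    const entry = this.tools.get(toolName);
    return entry?.isDistributed ? { target: 'remote', nodeId: entry.nodeId } : { target: 'local' };
  }

  // Deregister everything a node provided once its socket closes.
  handleDisconnect(nodeId) {
    for (const [name, entry] of this.tools) {
      if (entry.nodeId === nodeId) this.tools.delete(name);
    }
  }
}

const nodes = new DistributedToolRegistry();
nodes.handleMessage('node-1', { type: 'register_tools', data: { tools: [{ name: 'RemoteSearch' }] } });
```

With this shape, `nodes.route('RemoteSearch')` resolves to the remote node while any unregistered name falls back to local execution, and calling `nodes.handleDisconnect('node-1')` removes the remote tool again.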

Sources: README.md:507-633, WebSocketServer.js, VCPDistributedServer.js, Diagram 1 from high-level system overview

This overview establishes the foundational understanding needed to explore VCPToolBox's subsystems in detail. For next steps:

Set up a development environment: Development Environment
Understand plugin types and protocols: Plugin System Architecture
Learn about the RAG memory system: Memory and RAG Architecture
Configure agents: Agent Configuration
