This project is a sophisticated, AI-powered system designed to supervise, manage, and assist other AI agents. It has evolved from a simple monitoring script into a multi-layered platform with advanced capabilities for intelligent oversight and autonomous operation.
The system is built around three core concepts:
- Supervision: A supervisor agent that uses a probabilistic model (Expectimax) to watch over a working agent, predict potential issues, and intervene when necessary.
- Orchestration: An autonomous orchestrator that can manage a pool of specialized agents, decompose high-level goals into a dependency graph of tasks, and manage the entire execution workflow, including delegating complex tasks to sub-orchestrators.
- Assistance: A proactive research assistant that can detect when an agent is stuck, perform web searches to find solutions for its errors, and provide intelligent suggestions to help it recover.
This project includes a rich set of features, demonstrating a robust and intelligent architecture.
- **Intelligent Supervisor Agent:**
  - Uses an Expectimax algorithm (`supervisor_agent/expectimax_agent.py`) to make nuanced decisions about whether to `ALLOW`, `WARN`, `CORRECT`, or `ESCALATE` an agent's output. This is not based on simple rules, but on a probabilistic model of future outcomes.
  - Decisions are based on a weighted evaluation of the agent's state, including output quality, task drift, error count, and resource usage.
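As a rough illustration of the idea, the sketch below shows a weighted linear evaluation over a chance node, in the Expectimax style. The state fields, weights, and outcome distributions are hypothetical; the real `ExpectimaxAgent` in `supervisor_agent/expectimax_agent.py` may be structured differently.

```python
from dataclasses import dataclass

# Hypothetical state shape; the real AgentState may carry more fields.
@dataclass
class AgentState:
    output_quality: float  # 0.0 (bad) .. 1.0 (good)
    task_drift: float      # 0.0 (on track) .. 1.0 (fully off-task)
    error_count: int
    resource_usage: float  # 0.0 .. 1.0

# Illustrative weights; in this project they are retrained from user feedback.
WEIGHTS = {"quality": 2.0, "drift": -1.5, "errors": -0.5, "resources": -1.0}

def evaluate(state: AgentState) -> float:
    """Weighted linear evaluation of an agent's state."""
    return (WEIGHTS["quality"] * state.output_quality
            + WEIGHTS["drift"] * state.task_drift
            + WEIGHTS["errors"] * state.error_count
            + WEIGHTS["resources"] * state.resource_usage)

def expectimax(outcomes: dict[str, list[tuple[float, AgentState]]]) -> str:
    """Pick the action whose probability-weighted future value is highest.

    `outcomes` maps each action (ALLOW/WARN/CORRECT/ESCALATE) to a list of
    (probability, predicted next state) pairs -- the chance node of Expectimax.
    """
    def expected_value(branches):
        return sum(p * evaluate(s) for p, s in branches)
    return max(outcomes, key=lambda action: expected_value(outcomes[action]))
```

For example, if correcting an agent makes a good outcome more likely than simply allowing its output through, the chance-node averaging makes `expectimax(...)` return `"CORRECT"` even though correction is not guaranteed to succeed.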
- **Code-Aware Supervision:**
  - The supervisor can understand code quality. When an agent produces Python code, the system uses the `pylint` static analysis tool (`analysis/code_analyzer.py`) to check for errors, code smells, and style issues.
  - The number of errors found is factored directly into the `AgentState` passed to the Expectimax agent, making its decisions about code much more intelligent.
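To make the error-counting step concrete, the sketch below tallies messages from pylint's JSON report format (`pylint --output-format=json`) by severity. The inline sample report stands in for a real subprocess invocation of pylint; `analysis/code_analyzer.py` may extract the count differently.

```python
import json

# Sample of pylint's JSON output; a real run would invoke
# `pylint --output-format=json <file>` via subprocess instead.
PYLINT_JSON = """[
  {"type": "error", "symbol": "undefined-variable", "line": 3,
   "message": "Undefined variable 'x'"},
  {"type": "convention", "symbol": "missing-module-docstring", "line": 1,
   "message": "Missing module docstring"},
  {"type": "warning", "symbol": "unused-import", "line": 2,
   "message": "Unused import os"}
]"""

def count_issues(report_json: str) -> dict[str, int]:
    """Tally pylint messages by severity; the 'error' count feeds AgentState."""
    counts: dict[str, int] = {}
    for msg in json.loads(report_json):
        counts[msg["type"]] = counts.get(msg["type"], 0) + 1
    return counts
```

Here `count_issues(PYLINT_JSON)["error"]` yields the error count that would be written into the supervised agent's state.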
- **Multi-modal Supervision:**
  - The supervisor has a new sense: vision. It can evaluate image-based outputs from agents.
  - When an agent's output is an image URL, the `LLMJudge` uses a vision-capable model (e.g., Claude 3 Opus) to evaluate the image against the task goals.
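The exact request the `LLMJudge` builds is not shown in this README; the sketch below assumes the Anthropic Messages API's content-block format for image inputs, with an illustrative (not the project's actual) judging prompt.

```python
def build_image_judge_message(image_url: str, task_goal: str) -> dict:
    """Build one user message asking a vision model to judge an image.

    Assumes the Anthropic Messages API content-block shape; the prompt
    text is a placeholder, not the real LLMJudge prompt.
    """
    return {
        "role": "user",
        "content": [
            {"type": "image", "source": {"type": "url", "url": image_url}},
            {"type": "text",
             "text": f"Rate how well this image satisfies the task goal: {task_goal}"},
        ],
    }
```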
- **Feedback-Driven Learning:**
  - The supervisor can learn from user feedback. The dashboard allows a human to correct a bad decision, and this feedback is used to retrain the weights of the Expectimax agent's evaluation function via `supervisor_agent/feedback_trainer.py`.
  - This creates a powerful self-improvement loop, allowing the supervisor's judgment to improve over time.
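The README does not describe `feedback_trainer.py`'s actual update rule, so the sketch below shows one plausible approach: a single squared-error gradient step that nudges the linear evaluation weights toward the human-corrected score. All names and the learning rate are illustrative.

```python
def update_weights(weights: dict[str, float],
                   features: dict[str, float],
                   predicted: float,
                   corrected: float,
                   lr: float = 0.1) -> dict[str, float]:
    """One gradient step on a linear evaluation function.

    `predicted` is the score the evaluator produced; `corrected` is the
    score implied by the human's feedback. Each weight moves proportionally
    to its feature's contribution to the error.
    """
    error = corrected - predicted
    return {name: w + lr * error * features[name] for name, w in weights.items()}
```

Applied repeatedly over logged feedback, this is the self-improvement loop: decisions the human overrides pull the weights toward judgments the human would have made.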
- **Autonomous Orchestrator with Multi-LLM Support:**
  - Manages a pool of specialized agents with different capabilities.
  - Features an LLM-powered task planner. The system is architected to use multiple LLM providers concurrently (e.g., Anthropic, OpenAI), loading its configuration from `config/llm_config.json`.
  - Different models can be used for different tasks (e.g., a fast model for planning, a powerful model for judging) to optimize for cost and performance.
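The actual schema of `config/llm_config.json` is not shown here; the fragment below is a hypothetical illustration of the per-role provider/model mapping described above (all field names are assumptions):

```json
{
  "providers": {
    "anthropic": {"api_key_env": "ANTHROPIC_API_KEY"},
    "openai": {"api_key_env": "OPENAI_API_KEY"}
  },
  "roles": {
    "planner": {"provider": "anthropic", "model": "claude-3-haiku-20240307"},
    "judge": {"provider": "anthropic", "model": "claude-3-opus-20240229"}
  }
}
```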
- **Sub-Orchestration:**
  - For extremely complex goals, the main orchestrator can delegate tasks to sub-projects. The LLM planner is instructed to identify tasks that are themselves large projects and assign them a `sub_orchestration` capability.
  - The orchestrator then creates a new, nested `ProjectGoal` and monitors it, allowing for hierarchical, recursive problem-solving.
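A minimal sketch of the delegation step, assuming simplified `Task` and `ProjectGoal` shapes (the real models live in `src/orchestrator/` and will differ):

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    name: str
    capability: str  # e.g. "coding", "research", or "sub_orchestration"

@dataclass
class ProjectGoal:
    description: str
    tasks: list[Task] = field(default_factory=list)

def spawn_sub_goals(goal: ProjectGoal) -> list[ProjectGoal]:
    """Promote every task flagged `sub_orchestration` into its own nested goal.

    The orchestrator would then plan and monitor each returned goal
    independently, giving the hierarchical, recursive structure.
    """
    return [ProjectGoal(description=t.name)
            for t in goal.tasks if t.capability == "sub_orchestration"]
```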
- **Resource-Aware Task Assignment:**
  - The orchestrator is aware of agent system resources. Agents can report their CPU and memory load via a dedicated API endpoint.
  - The `find_available_agent` logic filters out agents with high resource usage (e.g., >90%) and prioritizes assigning tasks to the least-loaded agent available.
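The selection rule described above can be sketched as follows; the real `find_available_agent` lives in the orchestrator, and the field names here are assumptions:

```python
def find_available_agent(agents: list[dict], threshold: float = 0.9):
    """Pick the least-loaded idle agent, skipping any above the load threshold.

    Each agent dict is assumed to carry "idle" (bool) plus "cpu" and
    "memory" loads in [0, 1], as reported via the load-reporting endpoint.
    """
    candidates = [a for a in agents
                  if a["idle"] and a["cpu"] < threshold and a["memory"] < threshold]
    if not candidates:
        return None  # everything is busy or overloaded
    # "Least loaded" here means the smallest worst-case of CPU vs memory.
    return min(candidates, key=lambda a: max(a["cpu"], a["memory"]))
```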
- **The Agent Factory:**
  - The system can build new agents for itself. When a "build agent" goal is submitted, the orchestrator creates a sub-project to manage a team of coding agents that write, test, and register a new agent into the pool.
- **Proactive Research Assistant:**
  - The supervisor can detect when an agent is "stuck". It then autonomously formulates a search query, uses Google Search to find relevant help articles, and uses an LLM to synthesize a helpful suggestion.
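One simple heuristic for "stuck" is the same error repeating several times in a row; the project's actual detector may use other signals. The sketch below implements that heuristic and a naive query-formulation step (both class and function names are hypothetical):

```python
from collections import deque

class StuckDetector:
    """Flag an agent as stuck once the same error repeats N times in a row."""

    def __init__(self, repeat_threshold: int = 3):
        self.repeat_threshold = repeat_threshold
        self.recent = deque(maxlen=repeat_threshold)

    def observe(self, error_message: str) -> bool:
        """Record an error; return True once it has repeated enough times."""
        self.recent.append(error_message)
        return (len(self.recent) == self.repeat_threshold
                and len(set(self.recent)) == 1)

def build_query(task: str, error_message: str) -> str:
    """Formulate a web search query from the task and the repeated error."""
    return f"{task} {error_message} how to fix"
```

Once `observe` returns `True`, the assistant would run `build_query(...)` through Google Search and hand the results to an LLM for a synthesized suggestion.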
- **Cost Analysis:**
  - A `CostTracker` service logs every LLM call made by the system. It uses model-specific pricing to calculate the cost of each call and provides a detailed report, which can be viewed on the dashboard.
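A minimal sketch of such a tracker is below. The per-million-token prices are placeholders for illustration, not real price quotes, and the real `CostTracker` may log more metadata per call.

```python
# Placeholder prices: (input $/1M tokens, output $/1M tokens).
PRICES = {
    "fast-model": (0.25, 1.25),
    "powerful-model": (15.0, 75.0),
}

class CostTracker:
    """Log each LLM call and accumulate its model-specific cost."""

    def __init__(self):
        self.calls: list[dict] = []

    def log_call(self, model: str, input_tokens: int, output_tokens: int) -> float:
        in_price, out_price = PRICES[model]
        cost = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
        self.calls.append({"model": model, "cost": cost})
        return cost

    def total(self) -> float:
        return sum(c["cost"] for c in self.calls)
```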
- **Interactive Dashboard with Real-Time Updates:**
  - A comprehensive web dashboard (`examples/dashboard.html`) serves as the central UI.
  - The dashboard features real-time log and status streaming via a dedicated WebSocket connection, making the UI highly responsive.
  - It includes an interactive debugger that visualizes the Expectimax agent's entire decision tree as a flowchart.
The project is organized into a standard Python project structure:
- `src/supervisor_agent/`: Contains the core `SupervisorCore` and the `ExpectimaxAgent`.
- `src/orchestrator/`: Contains the `Orchestrator` and its data models.
- `src/researcher/`: Contains the `ResearchAssistor`.
- `src/analysis/`: Contains the `CodeQualityAnalyzer`.
- `src/agent_factory/`: Contains templates for building new agents.
- `src/llm/`: Contains the multi-LLM client architecture (`base.py`, `manager.py`, etc.).
- `src/server/`: Contains the main ASGI server (`main.py`).
- `examples/dashboard.html`: The all-in-one web interface.
- `tests/`: Contains unit and integration tests.
To get the project running, follow these steps:
- **Set up a Python virtual environment:**
  - This project uses Python's standard `venv` module.

  ```shell
  python3 -m venv .venv
  ```
- **Activate the virtual environment:**

  ```shell
  source .venv/bin/activate
  ```
- **Install dependencies:**
  - Install all required packages using the `pip` from your new virtual environment.

  ```shell
  pip install -r requirements.txt
  ```
- **Configure API Keys (Optional):**
  - Create a `.env` file in the root directory or set environment variables for the LLM providers you wish to use.

  ```shell
  ANTHROPIC_API_KEY="your-anthropic-key"
  OPENAI_API_KEY="your-openai-key"
  ```

  - If API keys are not set, the relevant clients will return mocked responses.
- **Start the Server:**
  - The application is an ASGI web server and should be run with `uvicorn`. The following command also sets the `PYTHONPATH` correctly, which is required for the application's imports to work.

  ```shell
  PYTHONPATH=$(pwd)/src .venv/bin/uvicorn src.server.main:mcp --port 8765
  ```

  - You should see output from `uvicorn` indicating the server is running on `http://127.0.0.1:8765`.
- **Use the Dashboard:**
  - Open the `examples/dashboard.html` file in your web browser. This file is self-contained and will connect to the local server automatically.
This project has a rich roadmap for future development.
- Interactive Goal Definition: Create a UI for drag-and-drop task planning.
- Authentication & Multi-User: Add a proper user login system.
- The Ethics Guardian: A specialized supervisor to enforce an "ethical constitution."
- The Self-Improving Supervisor (Meta-Learning): A supervisor that learns from its own interventions to improve its policies.
- Predictive Intervention Engine: A system that analyzes an agent's work in real-time to predict and prevent failures.
- Full System Autonomy: Connect the orchestrator to external data streams to allow it to discover and propose its own goals.
- Meta-Supervision: A system that can analyze its own performance and autonomously refactor its own source code or prompts.
- Human-AI Symbiosis: Evolve the UI into a true collaborative partner with conversational planning and deeply explainable AI (XAI).
- Decentralized Swarm Orchestration: Move from a single orchestrator to a decentralized swarm of orchestrators.
- Embodied AI & Physical World Control: Connect the orchestrator to physical hardware (robotics, IoT).
This project was developed as part of the Minimax Agent Hackathon.
Lead AI Software Engineer: Jules