🔬 VMARO: Vectorless Multi-Agent Research Orchestrator

Feed it a research topic -> Get back a comprehensive thematic tree, parallel methodology evaluations, and a structured, funding-ready grant proposal evaluated for novelty. All without a vector database or embeddings layer.

Overview

VMARO is an 8-stage, multi-agent AI pipeline for academic literature research and grant writing. Instead of the usual RAG mechanism (chunking text and retrieving by vector similarity), VMARO uses LLM-native structural synthesis to build an interpretable "Thematic Tree" directly from multiple live academic sources.

The multi-model engine sequentially analyzes literature, detects emerging macro-trends, isolates critical research gaps, pits multiple methodologies against each other in a parallel "challenger" phase, formats the outcomes to specific institutional guidelines (e.g., NIH, NSF, ERC), and finally generates the full-bodied proposal with a quantified novelty score and PDF/LaTeX exports.

Key Features & Architecture Improvements

Vectorless Navigation: No FAISS, no ChromaDB. Replaces black-box semantic retrieval with direct semantic clustering, constructing a visual Thematic Tree directly from high-signal abstracts and metadata.
Intelligent Quality Gates: Built-in "LLM-as-a-Judge" layers validate outputs between stages. Each gate returns a PASS, REVISE, or FAIL verdict, so shallow or hallucinated output is flagged.
Parallel Methodology Evaluation: VMARO doesn't just pick the first idea. It drafts a primary methodology, constructs a challenger counter-approach, and objectively evaluates which design has stronger statistical power and feasibility.
Intent-Aware Preprocessing: Raw user input, whether a phrase or a paragraph, is normalized into a structured payload with domain classification, query variants, and explicit research intent (survey_gaps, propose_methodology) before retrieval begins. This prevents garbage-in, garbage-out at the root of the pipeline.
Institutional Format Matching: A dedicated Format Matcher automatically restructures the proposal and tunes its rhetorical tone to match funder schemas (e.g., NSF, NIH, ERC). You can also upload custom JSON format templates.
Stateful Resiliency: All outputs cache natively via utils/cache.py. Process interrupted? The pipeline resumes immediately from the last checkpoint to save API credits.
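
As a rough illustration of how a gate verdict could drive control flow, here is a minimal sketch. It assumes that a REVISE verdict triggers a re-run with the judge's feedback, which the feature list above does not confirm. `Verdict`, `GateResult`, `run_with_gate`, and the demo stage and judge are hypothetical names, not the contents of utils/quality_gate.py.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Verdict(Enum):
    PASS = "PASS"
    REVISE = "REVISE"
    FAIL = "FAIL"


@dataclass
class GateResult:
    verdict: Verdict
    feedback: str = ""


def run_with_gate(
    stage: Callable[[str], str],
    judge: Callable[[str], GateResult],
    max_revisions: int = 2,
) -> str:
    """Run a stage and re-run it with judge feedback while the verdict is REVISE."""
    feedback = ""
    for _ in range(max_revisions + 1):
        output = stage(feedback)
        result = judge(output)
        if result.verdict is Verdict.PASS:
            return output
        if result.verdict is Verdict.FAIL:
            raise RuntimeError(f"Quality gate failed: {result.feedback}")
        feedback = result.feedback  # REVISE: pass the critique to the next attempt
    raise RuntimeError("Quality gate still requested revisions after the retry budget")


# Stand-ins for an LLM-backed stage and judge (hypothetical).
def demo_stage(feedback: str) -> str:
    return "tree with 3 themes" if feedback else "tree"


def demo_judge(output: str) -> GateResult:
    if "themes" in output:
        return GateResult(Verdict.PASS)
    return GateResult(Verdict.REVISE, "add theme detail")


final_output = run_with_gate(demo_stage, demo_judge)
```

In this sketch a FAIL stops the run outright, while REVISE is bounded by `max_revisions` so a stubborn judge cannot consume API credits indefinitely.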

The 8-Stage Pipeline

[Research Topic]
       ↓
 0️⃣  Topic Normalization       (Intent classification + query variant generation)
       ↓
 1️⃣  Literature Mining         (Multi-API Fetcher: arXiv, PubMed, Scholar + LLM)
       ↓
 2️⃣  Thematic Tree Builder     (Clusters into hierarchical themes) → 🛡️ [Quality Gate 1]
       ↓
 3️⃣  Trend Analysis            (Detects dominant/emerging signals)
       ↓
 4️⃣  Gap Identification        (Auto-detects and ranks multiple research gaps) → 🛡️ [Quality Gate 2]
       ↓
     [User Intervenes: Selects Gap or Defines Custom]
       ↓
 5️⃣  Methodology Evaluator     (Drafts Primary vs Challenger Methodologies -> Selects Winner)
       ↓
 6️⃣  Format Selection          (Matches winning approach to grant styles + User Override)
       ↓
 7️⃣  Grant Writing             (Detailed content generation constrained by format schema)
       ↓
 8️⃣  Novelty Scoring           (Coarse tree pass → Deep paper comparison → 0-100 Score)
       ↓
[Streamlit Dashboard / LaTeX PDF Export]
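
The flow above, combined with checkpointing, can be sketched as a loop over named stages. This is an illustration under stated assumptions, not the code in main.py or utils/cache.py: `run_pipeline`, the one-JSON-file-per-stage layout, and the two stub stages are all hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Callable

Stage = Callable[[Any], Any]


def run_pipeline(topic: str, stages: list[tuple[str, Stage]], cache_dir: Path) -> Any:
    """Run stages in order, reusing any checkpoint already written to cache_dir.

    Assumption: one cache directory per topic, so checkpoints never mix topics.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    data: Any = topic
    for name, stage in stages:
        checkpoint = cache_dir / f"{name}.json"
        if checkpoint.exists():
            # Stage already completed in an earlier run: load instead of recomputing.
            data = json.loads(checkpoint.read_text(encoding="utf-8"))
            continue
        data = stage(data)
        checkpoint.write_text(json.dumps(data), encoding="utf-8")
    return data


calls: list[str] = []  # records which stub stages actually executed


def normalize(topic: str) -> dict:
    calls.append("normalize")
    return {"topic": topic.strip().lower()}


def mine(payload: dict) -> dict:
    calls.append("mine")
    return {**payload, "papers": ["paper-1", "paper-2"]}


with tempfile.TemporaryDirectory() as tmp:
    stages = [("0_normalize", normalize), ("1_literature", mine)]
    first = run_pipeline("  Federated Learning ", stages, Path(tmp))
    second = run_pipeline("  Federated Learning ", stages, Path(tmp))  # served from checkpoints
```

The second call returns the same result without invoking either stub again, which is the behavior that saves API credits after an interrupted run.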

Dashboard Workflows in Action

1. Command Center / Overview Dashboard

2. Literature Mining & Corpus Generation

3. Thematic Tree Synthesis

4. Gap Identification & Selection

5. Parallel Methodology Evaluation

6. Generated Proposal & Novelty Scoring

Quickstart

1. Clone & Environment

git clone https://github.com/your-org/vmaro.git
cd vmaro

# Create and sync virtual environment
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

2. Configure API Keys

cp .env.example .env

Edit .env and add the API keys for your accounts. VMARO uses multiple providers (Gemini / Groq / AWS) and rotates requests round-robin across the configured keys to work around restrictive free-tier rate limits.

# Foundational LLMs
GROQ_API_KEY_1=your_key
GEMINI_4_AWS_KEY_1=your_key

# External sources (optional; standard use skips these if they are left blank)
SEMANTIC_SCHOLAR_KEY=
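
The round-robin key handling mentioned above could look roughly like the following sketch. It assumes keys follow the numbered pattern shown in the sample .env (`GROQ_API_KEY_1`, and by extension a hypothetical `GROQ_API_KEY_2`, and so on). `load_keys` and `key_cycle` are illustrative helpers, not functions from utils/schema.py.

```python
import itertools
import os
from typing import Iterator, Mapping


def load_keys(prefix: str, env: Mapping[str, str] = os.environ) -> list[str]:
    """Collect PREFIX_1, PREFIX_2, ... until the first missing or empty entry."""
    keys: list[str] = []
    index = 1
    while env.get(f"{prefix}_{index}"):
        keys.append(env[f"{prefix}_{index}"])
        index += 1
    return keys


def key_cycle(keys: list[str]) -> Iterator[str]:
    """Return an endless iterator that hands out keys in round-robin order."""
    if not keys:
        raise ValueError("no API keys configured")
    return itertools.cycle(keys)


# Demo with an in-memory mapping instead of the real environment.
demo_env = {"GROQ_API_KEY_1": "key-a", "GROQ_API_KEY_2": "key-b"}
pool = key_cycle(load_keys("GROQ_API_KEY", demo_env))
picked = [next(pool) for _ in range(3)]  # wraps back to the first key
```

Each outgoing request would call `next(pool)` to pick its key, spreading traffic evenly over however many keys are configured.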

3. Run via CLI

To run the full pipeline non-interactively through the orchestrator:

python main.py --topic "Federated Learning in Bioinformatics"

Want to bypass the parallel methodology evaluation? Add the --no-parallel flag.
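
For orientation, the two flags shown here could be declared with the standard library's argparse as below. How main.py actually parses its arguments is not shown in this README, so treat `build_parser` and the help strings as assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Run the VMARO pipeline for one research topic.",
    )
    parser.add_argument("--topic", required=True, help="Research topic to investigate")
    parser.add_argument(
        "--no-parallel",
        action="store_true",  # flag is False unless present on the command line
        help="Skip the primary-vs-challenger methodology evaluation",
    )
    return parser


# Parse an explicit argument list so the example runs without a real command line.
args = build_parser().parse_args(
    ["--topic", "Federated Learning in Bioinformatics", "--no-parallel"]
)
```

argparse exposes `--no-parallel` as the attribute `args.no_parallel`, since dashes in option names become underscores.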

4. Interactive UI Mode (Recommended)

To use the interactive tree visualizer (Agraph), manual gap selection, and one-click format/PDF generation:

streamlit run app.py

Open http://localhost:8501 in your browser.

Repository Structure

vmaro/
├── agents/
│   ├── literature_agent.py      # Agent 1: Multi-API Fetch & Consolidate
│   ├── tree_agent.py            # Agent 2: Hierarchical Clustering
│   ├── trend_agent.py           # Agent 3: Macro-Signals Identification
│   ├── gap_agent.py             # Agent 4: Target Discovery
│   ├── methodology_agent.py     # Agent 5a: Method generation
│   ├── methodology_evaluator.py # Agent 5b: Primary vs Challenger eval
│   ├── format_matcher.py        # Agent 6: Matching proposal formats
│   ├── grant_agent.py           # Agent 7: Format-compliant Grant Writing
│   └── novelty_agent.py         # Agent 8: Score validation
├── utils/
│   ├── multi_api_fetcher.py     # Scholar, PubMed, arXiv, CrossRef multiplexer
│   ├── schema.py                # Pydantic-like validations, LLM cleanup & Key rotation
│   ├── quality_gate.py          # Quality evaluator middleware
│   ├── format_loader.py         # Loads and registers JSON schemas for Grants
│   └── latex_exporter.py        # Converts generated outputs to PDF / Tex
├── app.py                       # Modern Streamlit UI application
├── main.py                      # CrewAI Orchestrator Execution script
└── ...

📫 Capabilities vs Limitations

Capabilities:

Deduplication: Multi-API fetches eliminate cross-source duplicates.
Robust Fail-Safes: API keys are rotated cyclically. clean_json_response() extracts the JSON payload from LLM responses that arrive wrapped in markdown.
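
To make the JSON cleanup concrete, here is one way such a helper could work. The function name matches the one mentioned above, but the body is a hypothetical sketch and not the repository's implementation. It assumes the response contains a single JSON object or array, optionally wrapped in a markdown code fence.

```python
import json
import re
from typing import Any

FENCE = "`" * 3  # built at runtime so this example can itself sit inside a code fence
_FENCED = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL | re.IGNORECASE)


def clean_json_response(raw: str) -> Any:
    """Extract and parse the JSON payload from a chatty or fenced LLM reply."""
    match = _FENCED.search(raw)
    candidate = match.group(1) if match else raw
    # Trim surrounding prose by keeping the outermost {...} or [...] span.
    starts = [i for i in (candidate.find("{"), candidate.find("[")) if i != -1]
    start = min(starts, default=-1)
    end = max(candidate.rfind("}"), candidate.rfind("]"))
    if start == -1 or end < start:
        raise ValueError("no JSON object or array found in response")
    return json.loads(candidate[start : end + 1])


fenced_reply = "Here is the result:\n" + FENCE + 'json\n{"verdict": "PASS"}\n' + FENCE + "\nDone."
parsed_fenced = clean_json_response(fenced_reply)
parsed_bare = clean_json_response("noise [1, 2, 3] trailing")
```

Malformed JSON still raises `json.JSONDecodeError` in this sketch, so a caller would need to catch it and retry or fail the stage.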

Limitations & Future Items:

Paper count is intentionally bounded at 20 to optimize token efficiency and keep thematic clustering coherent. At current LLM context limits, larger corpora dilute signal without improving output quality.
Deeper automated web-searching in the Methodology generation phase for specific up-to-date Python/R package implementations.
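
The cross-source deduplication and the 20-paper bound described in this section can be sketched together. This assumes papers arrive as dictionaries with optional `doi` and `title` fields and that a duplicate means a matching DOI or, failing that, a matching normalized title. The matching rule, the helper names, and the sample records (with made-up DOIs) are hypothetical and not taken from utils/multi_api_fetcher.py.

```python
import re
from typing import Iterable

MAX_PAPERS = 20  # the bound stated above


def _dedupe_key(paper: dict) -> str:
    """Prefer the DOI; fall back to a title stripped of case and punctuation."""
    doi = (paper.get("doi") or "").strip().lower()
    if doi:
        return f"doi:{doi}"
    title = re.sub(r"[^a-z0-9]+", " ", (paper.get("title") or "").lower()).strip()
    return f"title:{title}"


def dedupe_and_cap(papers: Iterable[dict], limit: int = MAX_PAPERS) -> list[dict]:
    """Keep the first occurrence of each paper and stop once `limit` is reached."""
    seen: set[str] = set()
    unique: list[dict] = []
    for paper in papers:
        key = _dedupe_key(paper)
        if key in seen:
            continue
        seen.add(key)
        unique.append(paper)
        if len(unique) == limit:
            break
    return unique


# Illustrative records only; the DOIs are invented for the example.
sample = [
    {"title": "Federated Learning for Genomics", "doi": "10.1000/abc", "source": "pubmed"},
    {"title": "Federated learning for genomics.", "doi": "10.1000/ABC", "source": "crossref"},
    {"title": "Secure Aggregation: A Survey", "source": "arxiv"},
    {"title": "Secure aggregation - a survey", "source": "scholar"},
]
kept = dedupe_and_cap(sample)
```

One known gap in this rule: a record that carries a DOI from one source and none from another will not be matched, so a real fetcher would likely need a second pass on titles.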
