⚡🧠 Vectro+: High-Performance Embedding Engine in Rust 🦀💾 Compress, quantize, and accelerate vector search 🚀 Boost retrieval speed, cut memory, keep semantic precision 🎯🔥


🚀 Vectro+

High-Performance Embedding Compression & Search Toolkit


╦  ╦╔═╗╔═╗╔╦╗╦═╗╔═╗  ╦ ╦
╚╗╔╝║╣ ║   ║ ╠╦╝║ ║  ╬═╣
 ╚╝ ╚═╝╚═╝ ╩ ╩╚═╚═╝  ╩ ╩

🗜️ 75-90% Compression • ⚡ Sub-ms Search • 🌐 Web UI + REST API

A Rust-first toolkit for streaming compression, scalar quantization, and blazing-fast similarity search of large embedding datasets.

Quick Start • Features • Benchmarks • Web UI • Docs



✨ Features

  • πŸ—œοΈ Streaming Compression: Process datasets larger than RAM
  • πŸ“¦ Quantization: Reduce size by 75-90% with minimal accuracy loss
  • ⚑ Fast Search: Parallel cosine similarity with optimized indexing
  • 🌐 Web UI: Beautiful interactive dashboard with real-time search
  • πŸ”Œ REST API: Production-ready HTTP endpoints for integration
  • πŸ“Š Benchmarking: Criterion integration with HTML reports and delta tracking
  • πŸ”„ Multiple Formats: STREAM1 (f32) and QSTREAM1 (u8 quantized)
  • 🎨 Beautiful CLI: Progress bars, colored output, and streaming logs
  • 🎬 Video-Ready: Enhanced demo scripts perfect for presentations

🎬 Quick Demo

Terminal Demo

# Clone and run the enhanced interactive demo
git clone https://github.com/wesleyscholl/vectro-plus
cd vectro-plus
./demo_enhanced.sh

Web UI Demo

# Start the web server
cargo run --release -p vectro_cli -- serve --port 8080

# Open http://localhost:8080 in your browser
# Beautiful dashboard with real-time search!

What you'll see:

🚀 Vectro+ Interactive Demo
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Step 1: Creating sample embeddings...
✓ Created 16 semantic embeddings (fruits 🍎, vehicles 🚗, colors 🔴)

Step 2: Streaming compression...
✓ Created dataset.bin (VECTRO+STREAM1 format)

Step 3: Quantization (size reduction)...
✓ Created dataset_q.bin (QSTREAM1 format)
💾 Space savings: 75%

Step 4: Semantic search...
Query: Searching for fruits 🍎
  → 1. 🍎 apple -> 1.000000
  → 2. 🍊 orange -> 0.987234
  → 3. 🍌 banana -> 0.956789

Step 5: Interactive web UI...
🚀 Server starting on http://localhost:8080
📊 Dashboard with real-time metrics
🔍 Search interface with instant results

📹 Recording a demo video? See QUICKSTART_VIDEO.md for a complete guide!

⚡ Quick Start

┌─────────────────────────────────────────────────────────────┐
│  Getting Started with Vectro+                               │
└─────────────────────────────────────────────────────────────┘
# 1️⃣ Clone and build
git clone https://github.com/wesleyscholl/vectro-plus
cd vectro-plus
cargo build --release

# 2️⃣ Run interactive demo (recommended!)
./demo_enhanced.sh

# 3️⃣ Run comprehensive tests
cargo test --workspace

# 4️⃣ Start web UI
./target/release/vectro_cli serve --port 8080
# Open http://localhost:8080 in your browser

# 5️⃣ Run benchmarks
./target/release/vectro_cli bench --summary

🎯 Usage Examples

Web Server (NEW! 🌐)

Start an interactive web server:

# Start server
vectro serve --port 8080

# Open http://localhost:8080 in your browser

Web UI Features:

  • 📊 Real-time stats dashboard
  • 🔍 Interactive semantic search
  • 📤 Upload embeddings via drag-and-drop
  • 💾 Load pre-compressed datasets
  • ⚡ Sub-millisecond query times displayed
  • 🎨 Beautiful gradient design

REST API:

# Health check
curl http://localhost:8080/health

# Get statistics
curl http://localhost:8080/api/stats

# Search embeddings
curl -X POST http://localhost:8080/api/search \
  -H "Content-Type: application/json" \
  -d '{"query": [0.1, 0.2, 0.3], "k": 10}'
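The search request body can also be built programmatically. The following is a minimal sketch that mirrors the curl example above; `search_request_body` is a hypothetical helper (not part of vectro_lib), and it assumes the endpoint accepts exactly the `query` and `k` fields shown, with nothing else required.

```rust
/// Build the JSON body for POST /api/search, mirroring the curl example:
/// {"query": [0.1, 0.2, 0.3], "k": 10}
///
/// Returns None if any component is NaN or infinite, because JSON has no
/// representation for those values.
fn search_request_body(query: &[f32], k: usize) -> Option<String> {
    if query.iter().any(|x| !x.is_finite()) {
        return None;
    }
    // f32's Display output is the shortest decimal that round-trips,
    // which is always a valid JSON number for finite values.
    let components: Vec<String> = query.iter().map(|x| x.to_string()).collect();
    Some(format!(
        "{{\"query\": [{}], \"k\": {}}}",
        components.join(", "),
        k
    ))
}
```

The resulting string can be sent with any HTTP client using the `Content-Type: application/json` header from the curl example.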

Compress Embeddings

# Regular streaming format
vectro compress embeddings.jsonl dataset.bin

# With quantization (75%+ smaller)
vectro compress embeddings.jsonl dataset_q.bin --quantize
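
The 75% figure follows from storing each component in one byte (u8) instead of four (f32). Below is a minimal sketch of scalar quantization, assuming one min/scale pair per dimension. `QuantParams`, `fit`, `quantize`, and `dequantize` are hypothetical names for illustration, not the vectro_lib API; the actual table layout is described in QSTREAM.md.

```rust
/// Hypothetical per-dimension table: maps [min, min + 255 * scale] onto 0..=255.
#[derive(Debug, Clone, Copy, PartialEq)]
struct QuantParams {
    min: f32,
    scale: f32,
}

/// Fit one table per dimension from the min and max observed in the data.
/// Assumes every vector has the same length as the first one.
fn fit(vectors: &[Vec<f32>]) -> Vec<QuantParams> {
    let dim = vectors.first().map_or(0, |v| v.len());
    (0..dim)
        .map(|d| {
            let (lo, hi) = vectors
                .iter()
                .fold((f32::INFINITY, f32::NEG_INFINITY), |(lo, hi), v| {
                    (lo.min(v[d]), hi.max(v[d]))
                });
            // A constant dimension gets scale 1.0 so we never divide by zero.
            let scale = if hi > lo { (hi - lo) / 255.0 } else { 1.0 };
            QuantParams { min: lo, scale }
        })
        .collect()
}

/// Map each f32 component to the nearest of 256 levels.
fn quantize(v: &[f32], params: &[QuantParams]) -> Vec<u8> {
    v.iter()
        .zip(params)
        .map(|(x, p)| ((x - p.min) / p.scale).round().clamp(0.0, 255.0) as u8)
        .collect()
}

/// Recover an approximation of the original components.
fn dequantize(q: &[u8], params: &[QuantParams]) -> Vec<f32> {
    q.iter()
        .zip(params)
        .map(|(&b, p)| p.min + b as f32 * p.scale)
        .collect()
}
```

With this scheme the round-trip error per component is at most half a quantization step (`scale / 2`) for values inside the fitted range.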

Search

# Find top-10 most similar vectors
vectro search "0.1,0.2,0.3,0.4,0.5" --top-k 10 --dataset dataset.bin
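
Conceptually, a top-k query scores every stored vector against the query by cosine similarity and keeps the k best. The sketch below is a brute-force, single-threaded version for illustration only; `cosine` and `top_k` are hypothetical names, and the project's own index is described as parallel and optimized, so this is not its implementation.

```rust
/// Cosine similarity of two vectors. Returns 0.0 if either has zero norm.
/// If the lengths differ, the extra components of the longer one are ignored.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Score every (id, vector) pair against the query and return the k
/// highest-scoring ids, best first.
fn top_k<'a>(
    query: &[f32],
    items: &'a [(String, Vec<f32>)],
    k: usize,
) -> Vec<(&'a str, f32)> {
    let mut scored: Vec<(&str, f32)> = items
        .iter()
        .map(|(id, v)| (id.as_str(), cosine(query, v)))
        .collect();
    // total_cmp gives a total order on f32, so sorting cannot panic on NaN.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```

Sorting the full score list costs O(n log n); for small k and large n, a bounded heap would avoid sorting everything.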

Benchmarks

# Run with summary and HTML report
vectro bench --summary --open-report

# Run specific benchmarks
vectro bench --bench-args "--bench cosine"

# Save report for sharing
vectro bench --save-report ./reports --summary

📊 Benchmark Output Example

Benchmark summaries:
┌─────────────────────────────┬────────────┬────────────┬──────┬────────┐
│ benchmark                   │     median │       mean │ unit │  delta │
├─────────────────────────────┼────────────┼────────────┼──────┼────────┤
│ cosine_search/top_k_10      │   123.456  │   125.789  │  ns  │  -2.3% │
│ cosine_search/top_k_100     │  1234.567  │  1256.890  │  ns  │  +1.8% │
│ quantize/dataset_1000       │ 45678.901  │ 46789.012  │  ns  │    -   │
└─────────────────────────────┴────────────┴────────────┴──────┴────────┘

📊 HTML summary saved to: target/criterion/vectro_summary.html

πŸ—οΈ Architecture

vectro-plus/
β”œβ”€β”€ vectro_lib/          # Core library (embeddings, search, quantization)
β”‚   β”œβ”€β”€ src/
β”‚   β”‚   └── lib.rs       # Embedding, Dataset, SearchIndex, QuantizedIndex
β”‚   └── benches/         # Criterion benchmarks
β”œβ”€β”€ vectro_cli/          # CLI application
β”‚   β”œβ”€β”€ src/
β”‚   β”‚   β”œβ”€β”€ lib.rs       # compress_stream() with parallel pipeline
β”‚   β”‚   └── main.rs      # CLI: compress, search, bench, serve
β”‚   └── tests/           # Integration tests
β”œβ”€β”€ DEMO.md              # Comprehensive usage examples
β”œβ”€β”€ QSTREAM.md           # Binary format documentation
└── demo.sh              # Interactive demo script

Benchmarks & Quality

╔══════════════════════════════════════════════════════════════════╗
║                      Performance Metrics                         ║
╠══════════════════════════════════════════════════════════════════╣
║                                                                  ║
║  Compression:      75-90% size reduction  ████████████████████░  ║
║  Search (top-10):  45-156 μs latency      ███████████████████░   ║
║  Search (top-100): 420 μs - 1.8 ms        ████████████████░      ║
║  Throughput:       Parallel pipeline      ████████████████████░  ║
║                                                                  ║
╠══════════════════════════════════════════════════════════════════╣
║                      Quality Dashboard                           ║
╠══════════════════════════════════════════════════════════════════╣
║                                                                  ║
║  Accuracy Loss:      < 0.5%                                      ║
║  Compression Ratio:  3.5x - 10x                                  ║
║  Format Overhead:    Minimal (header only)                       ║
║  Memory Efficiency:  Streaming I/O for large datasets            ║
║                                                                  ║
╚══════════════════════════════════════════════════════════════════╝
📈 View detailed benchmarks by dataset size

Dataset        Size     Compress   Quantize   Search (top-10)   Search (top-100)
10K × 128d     5 MB     180 ms     220 ms     45 μs             420 μs
100K × 768d    300 MB   3.2 s      4.1 s      123 μs            1.2 ms
1M × 768d      3 GB     34 s       43 s       156 μs            1.8 ms

Benchmarked on M1 Max (10-core), parallel workers enabled
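
The Size column is consistent with uncompressed f32 storage, i.e. vectors × dimensions × 4 bytes. The short sketch below reproduces that arithmetic and the 75% reduction expected from u8 quantization; the function names are illustrative, and the quantized figure deliberately ignores headers, ids, and quantization tables.

```rust
/// Raw size in bytes of `n` vectors with `dim` f32 components (4 bytes each).
fn raw_bytes(n: u64, dim: u64) -> u64 {
    n * dim * 4
}

/// Payload size after u8 quantization (1 byte per component).
/// Ignores headers, ids, and quantization tables.
fn quantized_bytes(n: u64, dim: u64) -> u64 {
    n * dim
}

/// Size reduction as a percentage of the original size.
fn reduction_percent(before: u64, after: u64) -> f64 {
    100.0 * (1.0 - after as f64 / before as f64)
}
```

For example, `raw_bytes(100_000, 768)` is 307,200,000 bytes (about 300 MB), and going from f32 to u8 payloads is a 4x ratio, which corresponds to the 75% end of the advertised range.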

πŸ“ Format Documentation

STREAM1 (Regular)

Header: "VECTRO+STREAM1\n"
Records: [u32 length][bincode(Embedding)] × N

QSTREAM1 (Quantized)

Header: "VECTRO+QSTREAM1\n"
Tables: [u32 count][u32 dim][u32 len][bincode(Vec<QuantTable>)]
Records: [u32 length][bincode((id, Vec<u8>))] × N

See QSTREAM.md for complete specification.
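
To make the STREAM1 layout concrete, here is a minimal sketch of the length-prefixed framing: a fixed header followed by `[u32 length][payload]` records. It treats each payload as opaque bytes rather than decoding it with bincode, and it assumes the u32 length is little-endian, which the summary above does not state. `write_stream` and `read_stream` are hypothetical helpers, not functions from vectro_lib.

```rust
use std::io::{self, Read, Write};

const STREAM1_HEADER: &[u8] = b"VECTRO+STREAM1\n";

/// Write the header, then each record as [u32 length][payload].
/// ASSUMPTION: the length prefix is little-endian.
fn write_stream<W: Write>(mut out: W, records: &[Vec<u8>]) -> io::Result<()> {
    out.write_all(STREAM1_HEADER)?;
    for rec in records {
        let len = u32::try_from(rec.len())
            .map_err(|_| io::Error::new(io::ErrorKind::InvalidInput, "record too large"))?;
        out.write_all(&len.to_le_bytes())?;
        out.write_all(rec)?;
    }
    Ok(())
}

/// Check the header, then read records until end of input.
fn read_stream<R: Read>(mut input: R) -> io::Result<Vec<Vec<u8>>> {
    let mut header = [0u8; 15]; // length of "VECTRO+STREAM1\n"
    input.read_exact(&mut header)?;
    if &header[..] != STREAM1_HEADER {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "not a STREAM1 file"));
    }
    let mut records = Vec::new();
    loop {
        let mut len_buf = [0u8; 4];
        match input.read_exact(&mut len_buf) {
            Ok(()) => {}
            // End of input while reading a length prefix ends the stream.
            // This sketch cannot tell a clean end from a truncated prefix.
            Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => break,
            Err(e) => return Err(e),
        }
        let mut payload = vec![0u8; u32::from_le_bytes(len_buf) as usize];
        // A payload cut short is reported as an error.
        input.read_exact(&mut payload)?;
        records.push(payload);
    }
    Ok(records)
}
```

This version collects all records into memory for brevity; a streaming consumer would instead handle each payload as it is read, which is what allows datasets larger than RAM.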

🧪 Testing

╔═══════════════════════════════════════════════════════════════╗
║              🧪 Test Coverage                                 ║
╠═══════════════════════════════════════════════════════════════╣
║                                                               ║
║  Total Tests:    10/10 passing  ████████████████████████████  ║
║  vectro_lib:     5/5 passing    ████████████████████████████  ║
║  vectro_cli:     5/5 passing    ████████████████████████████  ║
║  Warnings:       0              ████████████████████████████  ║
║                                                               ║
╚═══════════════════════════════════════════════════════════════╝
# All tests
cargo test --workspace

# Specific crate
cargo test -p vectro_lib
cargo test -p vectro_cli

# Integration tests
cargo test -p vectro_cli --test integration_quantize

# With output
cargo test -- --nocapture
📋 View test categories
  • ✅ Core Operations - Embedding management, dataset operations
  • ✅ Search Index - Cosine similarity, top-K results, batch queries
  • ✅ Quantization - Roundtrip accuracy, compression ratios
  • ✅ Storage - Binary format save/load, streaming I/O
  • ✅ Integration - End-to-end compression and search workflows

🤝 Contributing

Contributions welcome! Please:

  1. Fork the repo
  2. Create a feature branch (git checkout -b feature/amazing)
  3. Add tests for new functionality
  4. Run cargo fmt and cargo clippy
  5. Submit a PR

📚 Resources

📄 License

MIT License - see LICENSE for details

🙏 Acknowledgments

Built with:

  • Rust - Systems programming language
  • Criterion - Statistical benchmarking
  • Rayon - Data parallelism
  • Bincode - Binary serialization
  • Clap - Command-line parsing

Ready to optimize your embeddings? Run ./demo.sh to get started! 🚀

This repository contains a workspace with two crates:

  • vectro_lib: core library
  • vectro_cli: command-line tool

See docs/architecture.md for design notes.

📊 Project Status

Status: ✅ Production Ready (v1.0.0)

  • Core Features: Complete - streaming compression, quantization, fast search
  • Web UI: Fully functional with real-time search
  • REST API: Production-ready endpoints
  • Test Coverage: 10/10 passing (integration tests included)
  • Performance: Sub-ms search, 75-90% compression validated
  • Documentation: Comprehensive with video demos

Current Capabilities

  • ✅ Process datasets larger than RAM via streaming
  • ✅ 75-90% size reduction with minimal accuracy loss
  • ✅ Interactive web UI with beautiful visualizations
  • ✅ RESTful API for system integration
  • ✅ Parallel search with SIMD optimizations
  • ✅ Multiple file formats (STREAM1, QSTREAM1)
  • ✅ Criterion benchmarking with HTML reports

🗺️ Roadmap

v1.1 (In Progress)

  • 🔄 Additional quantization methods (4-bit, binary)
  • 🔄 GPU acceleration for large batches
  • 🔄 Incremental index updates
  • 🔄 Export to vector database formats (Qdrant, Weaviate)

v1.2 (Planned)

  • 📋 Python bindings via PyO3
  • 📋 Node.js bindings via Neon
  • 📋 CLI improvements: progress persistence, resume capability
  • 📋 Advanced similarity metrics (Euclidean, Hamming)

v2.0 (Future)

  • 📋 Distributed compression for multi-TB datasets
  • 📋 Real-time streaming quantization pipeline
  • 📋 Integration with Apache Arrow for zero-copy
  • 📋 Cloud deployment templates (Docker, K8s)
  • 📋 WebAssembly version for browser-based compression

🎯 Next Steps

  1. For Users:

    • Try the demo: ./demo_enhanced.sh
    • Integrate REST API into your pipeline
    • Benchmark with your embeddings
  2. For Contributors:

    • See CONTRIBUTING.md for guidelines
    • Check Issues for "good first issue" labels
    • Join discussions about v1.1 features
  3. For Production:

    • Review docs/deployment.md for best practices
    • Monitor performance with included metrics
    • Set up alerts for API health

🤝 Contributing

We welcome contributions! Areas needing help:

  • Additional quantization methods
  • Performance optimizations
  • Documentation improvements
  • Example integrations with popular vector DBs

See CONTRIBUTING.md for details.