An open-ended, multi-model AI voice agent system for easily building, testing, and launching voice agents in GoHighLevel (GHL).
This project provides a complete platform for creating, testing, and deploying AI-powered voice agents that integrate seamlessly with GoHighLevel's CRM and automation platform.
- Multi-Model AI Support: Works with Claude, GPT, Gemini, and custom models
- Real-Time Voice Processing: Sub-800ms voice-to-voice latency
- GHL Integration: Native integration with GoHighLevel API and Voice AI Custom Actions
- Model Context Protocol (MCP): Industry-standard integration framework
- Smart Cost Optimization: Intelligent routing to balance quality and cost
- Production-Ready Security: OAuth 2.1, TLS 1.3, comprehensive audit logging
- Open Architecture: Extensible and customizable for any use case
- Python 3.11+
- PostgreSQL 15+
- Redis 7+
- Docker & Docker Compose (recommended)
```bash
# Clone repository
git clone https://github.com/yourusername/voice-agents-ghl.git
cd voice-agents-ghl

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Set up environment variables
cp .env.example .env
# Edit .env with your API keys

# Run database migrations
alembic upgrade head

# Start development server
uvicorn backend.api.main:app --reload
```

```bash
# Start all services
docker-compose up -d

# View logs
docker-compose logs -f

# Stop services
docker-compose down
```

- Development Plan - Complete project overview and roadmap
- MCP Documentation - Model Context Protocol integration guide
- API Documentation - Interactive API docs (when server running)
- Backend: FastAPI (Python 3.11+) with async/WebSocket support
- Database: PostgreSQL with pgvector for embeddings
- Cache: Redis for session management
- Voice Processing: Deepgram (STT), ElevenLabs/Cartesia (TTS)
- LLM Integration: Claude, GPT, Gemini with smart routing
- Integration: Model Context Protocol (MCP) for external services
```
voice-agents-ghl/
├── backend/                  # FastAPI backend
│   ├── api/                  # API routes
│   ├── services/             # Business logic
│   │   ├── llm/              # Multi-model LLM integration
│   │   ├── voice/            # Voice processing (STT/TTS)
│   │   ├── agent_service.py  # Voice agent logic
│   │   └── mcp_service.py    # MCP client
│   ├── models/               # Data models
│   └── database/             # Database & migrations
├── mcp-servers/              # MCP servers
│   ├── ghl-server/           # GoHighLevel integration
│   └── voice-server/         # Voice processing tools
├── docs/                     # Documentation
│   └── mcp/                  # MCP documentation
└── tests/                    # Tests
```
- Natural Conversations: Human-like voice interactions with sub-800ms latency
- Context Awareness: Maintains conversation history and context
- Multi-Model Intelligence: Routes to best LLM based on task complexity
- Real-Time Data Access: Accesses GHL contacts, appointments, and more during calls
- Contact Management: Create, update, search contacts
- Appointment Booking: Check availability, book appointments
- Voice AI Custom Actions: Real-time webhook calls during live conversations
- Bidirectional Webhooks: React to GHL events (contacts, appointments, opportunities)
- Custom Fields: Access and update GHL custom fields
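Incoming webhooks are typically authenticated with the shared `GHL_WEBHOOK_SECRET`. As an illustrative sketch only (the actual header name and encoding depend on your webhook configuration), HMAC-SHA256 signature verification looks like this:

```python
import hashlib
import hmac


def verify_webhook_signature(payload: bytes, signature: str, secret: str) -> bool:
    """Check a hex-encoded HMAC-SHA256 signature against the raw request body.

    Uses a constant-time comparison to avoid timing attacks.
    """
    expected = hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```

A request whose signature does not match the digest of its raw body should be rejected before any processing.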
- Claude 3.5 Sonnet/4: Complex reasoning and code generation
- GPT-4o/5: Structured outputs and multi-step debugging
- Gemini 2.5 Flash: Cost-effective, fast responses
- Local Models: Ollama, vLLM for privacy-sensitive use cases
Cost Optimization: Smart routing can reduce LLM costs by 60-80% while maintaining quality
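The routing logic itself is not documented in this README; purely as an illustration, a toy heuristic might send short, simple prompts to the cheap fast tier and escalate longer or code-related prompts to the strong tier (the model names match the tiers above; the threshold and keywords are invented for the example):

```python
def route_model(prompt: str, length_threshold: int = 200) -> str:
    """Toy routing heuristic: cheap model for short/simple prompts, strong model otherwise."""
    needs_reasoning = any(kw in prompt.lower() for kw in ("code", "debug", "analyze"))
    if len(prompt) < length_threshold and not needs_reasoning:
        return "gemini-2.5-flash"  # fast, cost-effective tier
    return "claude-3-5-sonnet-20241022"  # complex-reasoning tier
```

A production router would weigh more signals (conversation state, tool use, past failures), but the shape is the same: classify the task, then pick the cheapest model expected to handle it.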
- Authentication: OAuth 2.1, JWT tokens
- Encryption: TLS 1.3 (transport), AES-256 (at rest)
- Input Validation: Comprehensive sanitization and validation
- Audit Logging: Complete audit trail of all operations
- Rate Limiting: Protection against abuse
- Compliance: GDPR, HIPAA, PCI DSS ready
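The project's rate limiter is not shown here; the token-bucket idea behind most rate limiting looks roughly like this (an illustrative sketch, not the actual implementation):

```python
import time


class TokenBucket:
    """Minimal token bucket: allows bursts up to `capacity`, refills at `rate` tokens/sec."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate              # tokens added per second
        self.capacity = capacity      # maximum burst size
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        """Consume one token if available; return False when the caller should be throttled."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

In practice the bucket state would live in Redis keyed per client, so limits hold across API instances.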
- Voice-to-Voice Latency: <800ms (industry standard 2025)
- Uptime: 99.9% monthly
- Concurrent Calls: 100+ per instance
- Error Rate: <0.1%
| Component | Target | Achieved |
|---|---|---|
| STT Response | <500ms | TBD |
| LLM TTFT | <500ms | TBD |
| TTS Generation | <300ms | TBD |
| Total Latency | <800ms | TBD |

Note: stage targets overlap through streaming, so the per-stage budgets can sum to more than the end-to-end target.
| Configuration | LLM Cost | Voice Cost | Total | Per Conversation |
|---|---|---|---|---|
| All Claude + ElevenLabs | $126 | $159 | $285 | $0.28 |
| Smart Routing + Deepgram | $77 | $16 | $93 | $0.09 |
| Aggressive Optimization | $30 | $16 | $46 | $0.05 |
Recommendation: Smart routing balances quality and cost
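The per-conversation column is consistent with roughly 1,000 conversations over the billing period (an assumption inferred from the table, not stated in it):

```python
def per_conversation_cost(llm_cost: float, voice_cost: float, conversations: int = 1000) -> float:
    """Total spend divided by conversation volume, rounded to cents."""
    return round((llm_cost + voice_cost) / conversations, 2)
```

For example, the smart-routing row works out to ($77 + $16) / 1000 ≈ $0.09 per conversation, matching the table.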
- Phase 0: Research & Planning (Completed)
- Phase 1: Foundation (Weeks 1-2)
- Phase 2: LLM Integration (Weeks 2-3)
- Phase 3: Voice Processing (Weeks 3-4)
- Phase 4: GHL Integration (Weeks 4-5)
- Phase 5: MCP Integration (Weeks 5-6)
- Phase 6: Security & Testing (Weeks 6-7)
- Phase 7: Deployment (Weeks 7-8)
See: DEVELOPMENT_PLAN.md for detailed roadmap
```python
import asyncio

import httpx


async def create_agent() -> dict:
    # async with cannot run at module top level, so the call is wrapped in a coroutine
    async with httpx.AsyncClient() as client:
        response = await client.post(
            "http://localhost:8000/api/v1/agents",
            json={
                "name": "Sales Assistant",
                "description": "Handles sales inquiries",
                "system_prompt": "You are a helpful sales assistant...",
                "voice_config": {
                    "voice_id": "21m00Tcm4TlvDq8ikWAM",
                    "speed": 1.0,
                    "language": "en-US"
                },
                "model_preferences": {
                    "primary": "claude-3-5-sonnet-20241022",
                    "fallback": "gemini-2.5-flash"
                }
            },
            headers={"Authorization": "Bearer YOUR_TOKEN"},
        )
        response.raise_for_status()
        return response.json()


agent = asyncio.run(create_agent())
```

```python
import asyncio
import base64
import json

import websockets


async def voice_conversation():
    uri = "ws://localhost:8000/ws/conversations/new"
    async with websockets.connect(uri) as websocket:
        # Initialize the session with an agent
        await websocket.send(json.dumps({
            "type": "init",
            "agent_id": "agent-123"
        }))

        # Stream a base64-encoded audio chunk
        base64_audio_data = base64.b64encode(b"...").decode()  # replace with real audio bytes
        await websocket.send(json.dumps({
            "type": "audio_chunk",
            "data": base64_audio_data
        }))

        # Receive the agent's response
        response = await websocket.recv()
        print(response)


asyncio.run(voice_conversation())
```
```bash
# API Configuration
API_PORT=8000
API_HOST=0.0.0.0

# Database
DATABASE_URL=postgresql://user:password@localhost/voice_agents
REDIS_URL=redis://localhost:6379

# LLM APIs
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
GOOGLE_API_KEY=...

# Voice Services
DEEPGRAM_API_KEY=...
ELEVENLABS_API_KEY=...

# GoHighLevel
GHL_API_KEY=...
GHL_LOCATION_ID=...
GHL_WEBHOOK_SECRET=...
```
```bash
# Run all tests
pytest

# Run with coverage
pytest --cov=backend tests/

# Run specific test file
pytest tests/test_mcp_service.py -v

# Run integration tests
pytest tests/integration/ -v
```

We welcome contributions! Please see CONTRIBUTING.md for guidelines.
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Make your changes
- Write tests
- Run tests and linting
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Documentation: docs/
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Anthropic - Claude API and MCP framework
- OpenAI - GPT models
- Google - Gemini models
- GoHighLevel - CRM platform and API
- FastAPI - Web framework
- Deepgram - Voice processing
- ElevenLabs - Text-to-speech
This project is built on extensive research of 2025 best practices:
- Voice agent architecture patterns
- Multi-model AI integration
- Model Context Protocol (MCP)
- Security standards (OAuth 2.1, TLS 1.3)
- GoHighLevel API capabilities
- Performance optimization techniques
See: DEVELOPMENT_PLAN.md for complete research findings
Status: Planning Phase Complete - Ready for Implementation
Version: 0.1.0
Last Updated: 2025-11-08
Built with ❤️ for the GHL community