forked from lonestone/micdrop

Micdrop is a set of packages for Node.js and the browser that simplify voice conversations with AI systems.

dimetron/micdrop

🖐️🎤 Micdrop: Real-Time Voice Conversations with AI

Micdrop website | Documentation

Micdrop is a set of open source TypeScript packages for building real-time voice conversations with AI agents. It handles the complexities on both the browser and server side (microphone, speaker, VAD, network communication, etc.) and provides ready-to-use implementations for various AI providers.

📦 Packages

Core Packages (start here)

  • @micdrop/client - Browser library handling microphone input, audio playback, and real-time communication
  • @micdrop/server - Server implementation for audio streaming and AI integration orchestration

AI Implementations

Utility Packages

Demo Applications

🎥 Demo and technical details (video)

See the author Godefroy de Compreignac talking about Micdrop and voice AI in this video:

YouTube video

🤔 Why Micdrop?

While real-time multimodal models (voice-to-voice) offer impressive capabilities, they often come with limitations in terms of customization and cost. Micdrop takes a different approach by:

  • 🎯 Allowing you to choose the best-in-class API for each component:
    • Select specific voices from TTS providers
    • Use different LLMs optimized for your use case
    • Pick STT engines suited for specific languages/accents
  • 💰 Reducing costs by letting you:
    • Use more cost-effective API providers
    • Mix open source and commercial solutions
    • Control exactly when APIs are called
  • 🔧 Providing granular control over the conversation flow
  • 🌐 Supporting a wider range of languages and voices through specialized providers

This modular approach gives you the flexibility to build voice applications that are both powerful and cost-effective.
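The modular approach described above can be pictured as three swappable components chained together. The following is a minimal sketch under stated assumptions, not Micdrop's actual API: the `SpeechToText`, `LanguageModel`, `TextToSpeech`, and `VoicePipeline` names are hypothetical, and the providers are synchronous stubs so the example stays self-contained (real providers would be asynchronous and streaming). It also illustrates the "control exactly when APIs are called" point: when nothing was transcribed, the LLM and TTS are never invoked.

```typescript
// Hypothetical interfaces for illustration only; NOT the @micdrop/server API.
type ChatMessage = { role: 'user' | 'assistant'; content: string }

interface SpeechToText {
  transcribe(audio: Float32Array): string
}
interface LanguageModel {
  reply(history: ChatMessage[]): string
}
interface TextToSpeech {
  synthesize(text: string): Float32Array
}

class VoicePipeline {
  private readonly stt: SpeechToText
  private readonly llm: LanguageModel
  private readonly tts: TextToSpeech
  private readonly history: ChatMessage[] = []

  constructor(stt: SpeechToText, llm: LanguageModel, tts: TextToSpeech) {
    this.stt = stt
    this.llm = llm
    this.tts = tts
  }

  // Returns synthesized audio, or null when nothing was said.
  // In the null case neither the LLM nor the TTS is called.
  handleUtterance(audio: Float32Array): Float32Array | null {
    const text = this.stt.transcribe(audio).trim()
    if (text === '') return null
    this.history.push({ role: 'user', content: text })
    const answer = this.llm.reply(this.history)
    this.history.push({ role: 'assistant', content: answer })
    return this.tts.synthesize(answer)
  }
}

// Stub providers standing in for real STT/LLM/TTS services (assumptions).
let llmCalls = 0
const stubStt: SpeechToText = {
  transcribe: (audio) => (audio.length > 0 ? 'hello' : ''),
}
const stubLlm: LanguageModel = {
  reply: (history) => {
    llmCalls++
    const last = history[history.length - 1]
    return last ? `You said: ${last.content}` : ''
  },
}
const stubTts: TextToSpeech = {
  // One silent sample per character, just to return something audio-shaped.
  synthesize: (text) => new Float32Array(text.length),
}

const pipeline = new VoicePipeline(stubStt, stubLlm, stubTts)
const spoken = pipeline.handleUtterance(new Float32Array(160)) // goes through STT, LLM, TTS
const silent = pipeline.handleUtterance(new Float32Array(0)) // stops after STT
```

Because each stage sits behind its own interface, swapping one provider (for example a different TTS voice) does not touch the other two stages.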

🌟 Features

  • 🎙️ Microphone handling with:
    • Streaming support
    • Voice Activity Detection (VAD)
  • 🔊 Advanced audio playback with:
    • Streaming support
    • Device selection and control
  • 🌐 WebSocket communication
  • πŸ“¦ AI implementations provided for OpenAI, ElevenLabs, Mistral, Gladia, and more
  • πŸ”Œ Bring your own AI components (framework agnostic)
    • Large Language Models (LLM)
    • Text-to-Speech (TTS)
    • Speech-to-Text (STT)
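
To give a feel for what the Voice Activity Detection feature listed above involves, here is a minimal energy-threshold sketch. It is an illustration only and makes no claim about how Micdrop detects speech: the `EnergyVad` class, the `0.02` threshold, and the three-frame hangover are all assumptions chosen for the example, and production detectors are often model-based rather than energy-based.

```typescript
// Hypothetical energy-based VAD for illustration; NOT Micdrop's implementation.
type VadEvent = 'start' | 'stop' | null

// Root-mean-square energy of one audio frame (samples expected in [-1, 1]).
function rms(frame: Float32Array): number {
  if (frame.length === 0) return 0
  let sum = 0
  for (let i = 0; i < frame.length; i++) {
    sum += frame[i] * frame[i]
  }
  return Math.sqrt(sum / frame.length)
}

class EnergyVad {
  private speaking = false
  private quietFrames = 0
  private readonly threshold: number
  private readonly hangoverFrames: number

  // threshold: RMS level treated as speech (assumed value).
  // hangoverFrames: consecutive quiet frames required before reporting 'stop'.
  constructor(threshold = 0.02, hangoverFrames = 3) {
    this.threshold = threshold
    this.hangoverFrames = hangoverFrames
  }

  // Feed one frame; returns 'start' or 'stop' on a state change, else null.
  process(frame: Float32Array): VadEvent {
    if (rms(frame) >= this.threshold) {
      this.quietFrames = 0
      if (!this.speaking) {
        this.speaking = true
        return 'start'
      }
      return null
    }
    if (this.speaking) {
      this.quietFrames++
      if (this.quietFrames >= this.hangoverFrames) {
        this.speaking = false
        this.quietFrames = 0
        return 'stop'
      }
    }
    return null
  }
}

// Two loud frames followed by three silent ones.
const vad = new EnergyVad()
const loudFrame = new Float32Array(160).fill(0.5)
const quietFrame = new Float32Array(160)
const vadEvents = [loudFrame, loudFrame, quietFrame, quietFrame, quietFrame].map((f) =>
  vad.process(f),
)
// vadEvents is ['start', null, null, null, 'stop']
```

The hangover counter keeps short pauses between words from ending the utterance too early, which matters when the 'stop' event is what triggers the speech-to-text call.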

🧪 Development

For detailed development instructions, including how to build, test, and publish packages, please see DEVELOPMENT.md.

📄 License

MIT License - see the LICENSE file for details

Author

Originally developed for Raconte.ai and open sourced by Lonestone (GitHub)
