🖐️🎤 Micdrop: Real-Time Voice Conversations with AI

Micdrop is a set of open source TypeScript packages for building real-time voice conversations with AI agents. It handles the complexity on both the browser and server side (microphone, speaker, VAD, network communication, etc.) and provides ready-to-use implementations for various AI providers.

📦 Packages

Core Packages (start here)

@micdrop/client - Browser library handling microphone input, audio playback, and real-time communication
@micdrop/server - Server implementation for audio streaming and AI integration orchestration

AI Implementations

@micdrop/openai - OpenAI integration providing LLM agent and speech-to-text capabilities
@micdrop/ai-sdk - AI SDK agent compatible with many LLM providers
@micdrop/elevenlabs - ElevenLabs text-to-speech integration with streaming support
@micdrop/cartesia - Cartesia text-to-speech integration for real-time voice synthesis
@micdrop/mistral - Mistral AI agent integration for conversation handling
@micdrop/gladia - Gladia speech-to-text integration for audio transcription

Utility Packages

@micdrop/react - React hooks for Micdrop

Demo Applications

demo-client - Example web application built with React
demo-server - Example server built with Fastify

🎥 Demo and technical details (video)

See the author Godefroy de Compreignac talking about Micdrop and voice AI in this video:

🤔 Why Micdrop?

While real-time multimodal models (voice-to-voice) offer impressive capabilities, they often come with limitations in terms of customization and cost. Micdrop takes a different approach by:

🎯 Allowing you to choose the best-in-class API for each component:
- Select specific voices from TTS providers
- Use different LLMs optimized for your use case
- Pick STT engines suited for specific languages/accents
💰 Reducing costs by letting you:
- Use more cost-effective API providers
- Mix open source and commercial solutions
- Control exactly when APIs are called
🔧 Providing granular control over the conversation flow
🌐 Supporting a wider range of languages and voices through specialized providers

This modular approach gives you the flexibility to build voice applications that are both powerful and cost-effective.
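
To make the modular idea concrete, here is a minimal sketch of a voice pipeline whose speech-to-text, agent, and text-to-speech stages can be swapped independently. Every name in it (`SpeechToText`, `Agent`, `TextToSpeech`, `VoicePipeline`, `runTurn`, and the stub providers) is hypothetical and chosen for illustration; this is not the API of any @micdrop package, so refer to each package's own documentation for real usage.

```typescript
// Hypothetical interfaces for illustration only; NOT the @micdrop/* APIs.
interface SpeechToText {
  transcribe(audio: Uint8Array): Promise<string>
}

interface Agent {
  answer(userText: string): Promise<string>
}

interface TextToSpeech {
  synthesize(text: string): Promise<Uint8Array>
}

// A pipeline is three independent, replaceable stages.
interface VoicePipeline {
  stt: SpeechToText
  agent: Agent
  tts: TextToSpeech
}

interface TurnResult {
  transcript: string
  reply: string
  speech: Uint8Array
}

// One conversational turn: user audio in, assistant audio out.
async function runTurn(
  pipeline: VoicePipeline,
  audio: Uint8Array
): Promise<TurnResult> {
  const transcript = await pipeline.stt.transcribe(audio)
  const reply = await pipeline.agent.answer(transcript)
  const speech = await pipeline.tts.synthesize(reply)
  return { transcript, reply, speech }
}

// Stub providers standing in for real STT, LLM, and TTS services.
const fakeStt: SpeechToText = {
  transcribe: async (audio) => `heard ${audio.length} bytes`,
}

const echoAgent: Agent = {
  answer: async (text) => `You said: ${text}`,
}

const fakeTts: TextToSpeech = {
  // Pretend audio: one silent byte per character of text.
  synthesize: async (text) => new Uint8Array(text.length),
}

const pipeline: VoicePipeline = { stt: fakeStt, agent: echoAgent, tts: fakeTts }

// Swapping one stage leaves the other two untouched.
const shoutingAgent: Agent = {
  answer: async (text) => text.toUpperCase(),
}
const shoutingPipeline: VoicePipeline = { ...pipeline, agent: shoutingAgent }
```

Calling `runTurn(pipeline, someAudio)` runs the three stages in sequence. Because each stage sits behind a small interface, changing the LLM or the voice provider means replacing one field of the pipeline object rather than rewriting the conversation loop.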

🌟 Features

🎙️ Microphone handling with:
- Streaming support
- Voice Activity Detection (VAD)
🔊 Advanced audio playback with:
- Streaming support
- Device selection and control
🌐 WebSocket communication
📦 AI implementations provided for OpenAI, ElevenLabs, Mistral, Gladia, and more
🔌 Bring your own AI components (framework agnostic)
- Large Language Models (LLM)
- Text-to-Speech (TTS)
- Speech-to-Text (STT)
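
For readers unfamiliar with Voice Activity Detection, the sketch below shows one simple way such a detector can work: compare the energy of each audio frame against a threshold, and keep the "speaking" state alive for a few quiet frames so that short pauses do not cut the user off. This is a generic energy-threshold technique shown for illustration, not a description of how Micdrop implements VAD; `EnergyVad`, `VadOptions`, and the parameter values are assumptions made up for this example.

```typescript
// Illustrative energy-threshold VAD; NOT Micdrop's implementation.
interface VadOptions {
  threshold: number // minimum RMS energy treated as speech
  hangoverFrames: number // quiet frames tolerated before speech ends
}

// Root-mean-square energy of one frame of PCM samples in [-1, 1].
function rms(frame: Float32Array): number {
  if (frame.length === 0) return 0
  let sum = 0
  for (let i = 0; i < frame.length; i++) {
    sum += frame[i] * frame[i]
  }
  return Math.sqrt(sum / frame.length)
}

class EnergyVad {
  private options: VadOptions
  private speaking = false
  private silentFrames = 0

  constructor(options: VadOptions) {
    this.options = options
  }

  // Feed one frame; returns true while speech is considered active.
  process(frame: Float32Array): boolean {
    if (rms(frame) >= this.options.threshold) {
      this.speaking = true
      this.silentFrames = 0
    } else if (this.speaking) {
      this.silentFrames += 1
      if (this.silentFrames > this.options.hangoverFrames) {
        this.speaking = false
      }
    }
    return this.speaking
  }
}

// Example frames: constant-amplitude "speech" and pure silence.
const loudFrame = new Float32Array(160).fill(0.5)
const quietFrame = new Float32Array(160)

const vad = new EnergyVad({ threshold: 0.1, hangoverFrames: 2 })
```

In a real application the frames would come from the microphone stream, and the transition from speaking to silent would be the signal to send the captured audio for transcription. Production detectors are usually more robust than a fixed energy threshold, especially in noisy environments.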

🧪 Development

For detailed development instructions, including how to build, test, and publish packages, please see DEVELOPMENT.md.

📄 License

MIT License - see the LICENSE file for details

Author

Originally developed for Raconte.ai and open sourced by Lonestone (GitHub)
