A powerful voice-controlled productivity tool that combines speech recognition with AI assistance and keyboard automation. This tool allows users to control their computer, generate code, and interact with AI using voice commands triggered by keyboard shortcuts.
- ποΈ Local transcription using MLX Whisper
- π€ AI-powered command execution and responses using OpenRouter (Claude)
- π» Code generation through voice commands
- β¨οΈ Keyboard shortcut automation
- π Direct text input from voice
- π Smart context awareness using clipboard
- CTRL + SHIFT (Left): Execute voice commands for keyboard shortcuts
- CTRL + CMD (Right): Transcribe voice to text
- SHIFT + ALT (Left): Get AI assistance with context awareness
- CTRL + ALT + CMD (Left): Generate code from voice input with context support
openai
pynput
sounddevice
mlx_whisper
pydantic
pyperclip
numpy
python-dotenvThe following environment variables need to be set:
OPENROUTER_API_KEY=your_openrouter_api_key- Clone the repository:
git clone [repository-url]- Install dependencies:
uv pip install -r requirements.txt- Set up environment variables as described above.
- Run the main script:
uv run handy.py- Use keyboard shortcuts to activate different modes:
- Hold the designated key combination
- Speak your command
- Release the keys to process the command
-
Code Generation:
- Hold
CTRL + ALT + CMD(Left) - Say "create a Python function to sort a list"
- Release keys to get the generated code
- Hold
-
AI Assistance:
- Hold
SHIFT + ALT(Left) - Select text for context (optional)
- Ask your question
- Release to get AI response
- Hold
-
Voice Transcription:
- Hold
CTRL + CMD(Right) - Speak your text
- Release to transcribe
- Hold
-
Keyboard Commands:
- Hold
CTRL + SHIFT(Left) - Say "copy" or "paste" or other keyboard shortcuts
- Release to execute the command
- Hold
AudioRecorder: Handles real-time audio recording and processingKeyboardShortcut: Manages keyboard combinations and actionsContextManager: Handles clipboard-based context awareness- AI Integration: Uses OpenRouter with Claude for intelligent responses
- MLX Whisper: Provides fast and accurate speech-to-text conversion
Contributions are welcome! Please feel free to submit a Pull Request.
- MLX Whisper for speech recognition
- OpenRouter and Claude for AI capabilities
- The open-source community for various dependencies