Speecher is an edge-native voice-to-text transcription application designed to facilitate communication between speaking individuals and those who are deaf or hard of hearing. The app runs entirely on-device using local Whisper models for real-time speech recognition with zero cloud dependency.
- 100% Offline & Private - All processing happens on-device; no internet connection required
- Real-time Edge Inference - Convert speech to text instantly using local models
- Portuguese Language Support - Built for Portuguese speech recognition; other languages are available through Whisper.cpp models
- Clean, Accessible Interface - Large text display for easy reading
- Simple Recording Controls - One-tap recording with visual feedback
- Metal GPU Acceleration - Optimized edge performance on iOS devices
- Privacy-First - No data leaves your device; a complete edge computing solution
The app leverages edge computing principles with:
- SwiftUI - Modern iOS user interface
- Whisper.cpp - Local edge speech recognition engine; the small model offers a good balance of size (~500MB) and speed (tested on an iPhone 14 Pro Max)
- AVFoundation - On-device audio recording and processing
- Metal - GPU acceleration for edge model inference
- Local Storage - Edge model deployment with no external dependencies
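
For orientation, here is a rough sketch of what the Whisper.cpp integration layer does under the hood, based on whisper.cpp's public C API (`whisper_init_from_file_with_params`, `whisper_full`, and friends, exposed through the linked xcframework). The function below is illustrative only; the actual LibWhisper.swift wrapper is structured differently:

```swift
import Foundation

// Illustrative on-device inference flow; `samples` must be 16 kHz mono
// Float32 PCM, which is what the WAV utilities produce.
func transcribe(modelPath: String, samples: [Float]) -> String {
    var ctxParams = whisper_context_default_params()
    ctxParams.use_gpu = true  // Metal acceleration on supported devices
    guard let ctx = whisper_init_from_file_with_params(modelPath, ctxParams) else {
        return ""
    }
    defer { whisper_free(ctx) }

    let params = whisper_full_default_params(WHISPER_SAMPLING_GREEDY)
    whisper_full(ctx, params, samples, Int32(samples.count))

    // Concatenate the decoded segments into the final transcription.
    var text = ""
    for i in 0..<whisper_full_n_segments(ctx) {
        text += String(cString: whisper_full_get_segment_text(ctx, i))
    }
    return text
}
```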
- iOS 16.4+ (iPhone 13+) / macOS 13.3+
- Xcode 15+
- ~500MB of free storage for the small Whisper model
- Microphone permissions
```bash
git clone --recursive https://github.com/lazaroborges/speecher.git
cd speecher
```

First, build the Whisper.cpp XCFramework for edge deployment on iOS:

```bash
cd whisper.cpp
./build-xcframework.sh
```

This will create the necessary framework files optimized for edge computing.
Download the Portuguese-optimized model for edge inference in the desired size:

```bash
cd whisper.cpp/models
./download-ggml-model.sh <model_size>
```

This will download `ggml-<model_size>.bin` for local edge deployment (from ~39MB for `tiny` up to ~2.9GB for `large-v3`; see the model table below).
- Open `speecher.xcodeproj` in Xcode
- Right-click on the `Resources/models/` folder in the project navigator
- Select "Add Files to 'speecher'"
- Navigate to `whisper.cpp/models/` and select `ggml-<model_size>.bin`
- Ensure "Add to target: speecher" is checked
- Click "Add"
- In Xcode, select your project in the navigator
- Go to your target's "General" tab
- Under "Frameworks, Libraries, and Embedded Content", click "+"
- Click "Add Other..." → "Add Files..."
- Navigate to `whisper.cpp/` and add the generated `whisper.xcframework`
- Set "Embed & Sign" for the framework
The app requires microphone access for edge processing. The Info.plist should include:

```xml
<key>NSMicrophoneUsageDescription</key>
<string>Speecher needs microphone access to transcribe your speech to text using edge computing.</string>
```

To use Speecher:

- Launch the app - The Whisper model will load automatically
- Tap the blue microphone button - Start recording
- Speak clearly - The recording indicator will show red
- Tap the red stop button - End recording and process speech
- View transcription - Large text appears in full-screen mode
- Record again or clear - Use the control buttons to continue
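
iOS shows the system microphone prompt the first time the app records. If you prefer to request access up front, the standard AVFoundation call looks like this; where (or whether) Speecher's Recorder.swift does this is an assumption:

```swift
import AVFoundation

// Request microphone access ahead of the first recording so the system
// prompt appears at a predictable moment. (Illustrative placement only.)
AVAudioSession.sharedInstance().requestRecordPermission { granted in
    DispatchQueue.main.async {
        if !granted {
            print("Microphone access denied - enable it in Settings > Privacy")
        }
    }
}
```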
You can deploy different Whisper models based on your edge computing needs:

| Model | Size | Edge Speed | Accuracy | Edge Use Case |
|---|---|---|---|---|
| `tiny` | ~39MB | Fastest | Basic | Edge testing/development |
| `base` | ~142MB | Fast | Good | Quick edge transcription |
| `small` | ~466MB | Medium | Better | Balanced edge performance |
| `medium` | ~1.4GB | Slower | Best | Production edge use (default) |
| `large-v3` | ~2.9GB | Slowest | Excellent | Maximum edge accuracy |
To deploy a different edge model:

```bash
# Download the desired edge model
cd whisper.cpp/models
./download-ggml-model.sh small

# Update WhisperState.swift to use the new edge model
# Change "ggml-<model_size>.bin" to "ggml-small.bin" in modelUrl
```

The project is organized as follows:

```
speecher/
├── ContentView.swift       # Main UI with edge recording interface
├── WhisperState.swift      # Edge speech recognition state management
├── LibWhisper.swift        # Whisper.cpp edge integration layer
├── Recorder.swift          # Edge audio recording functionality
├── RiffWaveUtils.swift     # WAV file processing for edge inference
├── speecherApp.swift       # Edge app entry point
└── Resources/
    └── models/
        └── ggml-<model_size>.bin   # Deployed edge model file
```
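
For reference, the `modelUrl` change mentioned above plausibly amounts to swapping the resource name in the same kind of `Bundle.main.url` lookup used in the verification sketch earlier. This is a sketch; the real property in WhisperState.swift may differ:

```swift
import Foundation

// Hypothetical shape of the modelUrl lookup: resolve the deployed .bin
// from the bundle's models folder. Change the resource name here when
// deploying a different model size.
var modelUrl: URL? {
    Bundle.main.url(forResource: "ggml-small",
                    withExtension: "bin",
                    subdirectory: "models")
}
```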
The app is configured for Portuguese edge processing by default. To change the language for edge inference, modify LibWhisper.swift:

```swift
// Line 35 in LibWhisper.swift
params.language = pt // Change "pt" to your language code for edge processing
```

Supported model language codes: `en`, `es`, `fr`, `de`, `it`, `pt`, `ru`, `zh`, etc.
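
Note that `pt` above is a C string rather than a Swift literal: whisper.cpp's `whisper_full_params.language` field is a C pointer, so the assignment sits inside a `withCString` closure. The surrounding context below follows the pattern in whisper.cpp's own SwiftUI example and is a sketch, not the exact code in this repository:

```swift
// The language code must remain valid for the duration of the
// whisper_full(...) call, so it is passed as a C string.
"pt".withCString { pt in
    var params = whisper_full_default_params(WHISPER_SAMPLING_GREEDY)
    params.language = pt
    params.n_threads = Int32(maxThreads)
    // ... call whisper_full(ctx, params, samples, n) inside this closure
}
```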
Adjust threading for optimal edge computing performance in LibWhisper.swift:

```swift
// Line 25 - Modify thread count based on edge device capabilities
let maxThreads = max(1, min(8, cpuCount() - 2))
```

Troubleshooting common issues:

**Model not loading:**
- Verify `ggml-<model_size>.bin` is properly deployed in the app bundle
- Check the edge model file size (~500MB for the small model)
- Ensure the edge model is added to the Xcode target

**Poor transcription quality:**
- Speak clearly and close to the device during edge processing
- Minimize background noise for better edge inference
- Consider deploying a larger edge model (`large-v3`)
- Check microphone permissions for edge recording

**App crashes or runs out of memory:**
- Deploy a smaller edge model (`small` or `base`)
- Close other apps to free memory for edge computing
- Ensure the device has sufficient storage for edge models

**Build errors:**
- Update Xcode to the latest version for edge framework support
- Clean the build folder (⌘+Shift+K)
- Verify `whisper.xcframework` is properly linked for edge deployment
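
When debugging build or performance problems, it can also help to log the feature flags compiled into the xcframework. This uses `whisper_print_system_info()` from whisper.cpp's C API (despite the name, it returns the string rather than printing it):

```swift
// Logs flags such as NEON and METAL compiled into the xcframework,
// confirming that GPU support actually made it into the build.
print(String(cString: whisper_print_system_info()))
```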
- Fork the repository
- Create a feature branch
- Make your changes
- Test thoroughly on edge devices
- Submit a pull request
This edge computing project uses the MIT License. The whisper.cpp library is also under the MIT License.
- OpenAI Whisper - Original speech recognition model
- whisper.cpp - Efficient C++ implementation for edge computing
- ggerganov - whisper.cpp author
For issues and questions:
- Check the troubleshooting section above
- Search existing GitHub issues
- Create a new issue with detailed information about your deployment (device, iOS version, model size)
Made with ❤️ in Brazil for the deaf and hard of hearing community.