The AI that lives on your desktop – it sees, thinks, and acts.

Glitch breaks the fourth wall of the operating system.
AI Desktop Companion (Glitch) is a fully multimodal, autonomous desktop agent that:
- 👁️ Sees your screen
- 🎤 Talks with you
- 🤖 Controls your system
- 🖥️ Lives directly on your desktop as a playful character
- …and a lot more (I want you to explore 😉)
This isn't just an assistant you use.
It's one you work with.
https://vimeo.com/1150677379
The Vimeo demo shows Glitch executing real tasks end-to-end.
(Also check out our landing page here!!)
Glitch runs as a transparent, click-through desktop overlay.
He shares your workspace instead of hiding in a window or sidebar.
- Interactive pixel-style characters
- Drag, click, and interact
- Customizable appearance and behavior
Inspired by classic desktop pets, powered by modern multimodal AI.
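The transparent, click-through behaviour above corresponds to a handful of Electron `BrowserWindow` options. Here is a minimal sketch of that configuration; the option names come from Electron's API, but the concrete values are assumptions, not necessarily what Glitch ships with:

```javascript
// Sketch of an overlay window configuration (illustrative values,
// not the project's actual settings in main.js).
function overlayOptions() {
  return {
    transparent: true, // no window background: only the sprite is drawn
    frame: false,      // no OS title bar or borders
    alwaysOnTop: true, // the character floats above your workspace
  };
}

// In the Electron main process you would then do something like:
//   const win = new BrowserWindow(overlayOptions());
//   win.setIgnoreMouseEvents(true, { forward: true }); // click-through
console.log(overlayOptions().transparent); // → true
```

`setIgnoreMouseEvents(true, { forward: true })` is what makes the window click-through while still letting the renderer see hover events, so the character can react when you point at it.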
Everything is built in:
- 🎤 Voice Mode
- 👁️ Vision Mode
- 🤖 Agent Mode
- ⚙️ Settings (character & voice customization)
No switching apps. No broken context.
This is not just another chatbot.
Agent Mode lets Glitch:
- Control mouse & keyboard
- Open applications
- Execute multi-step workflows
- Do real things on your system
There's always a stop button. Safety matters.
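A stop button implies the agent loop checks a cancellation flag between actions. A hypothetical sketch of that pattern (the names and step shape are illustrative, not the repo's actual code):

```javascript
// Run a list of async steps, bailing out as soon as the stop flag is set.
async function runAgent(steps, state) {
  for (const step of steps) {
    if (state.stopped) {   // the Stop button handler flips this flag
      return "stopped";
    }
    await step();          // e.g. move the mouse, type, open an app
  }
  return "done";
}

// Usage: the second step simulates the user hitting Stop mid-run.
const state = { stopped: false };
const log = [];
runAgent(
  [async () => log.push("open app"),
   async () => { state.stopped = true; },
   async () => log.push("never runs")],
  state
).then((result) => console.log(result, log)); // logs: stopped [ 'open app' ]
```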
Glitch is especially useful while building.
Here, it creates a complete Next.js project structure from a single voice command, turning ideas into runnable code instantly.
Glitch can:
- Summarize information
- Extract key points
- Save them directly to Notepad or files
Your AI remembers for you.
Ask once, and Glitch searches Google, parses results, and gives you the useful bits.
Hands-free.
Glitch isn't robotic.
He has personality.
He reacts.
He feels present.
Working with AI finally feels alive, not transactional.
Glitch uses a hybrid multimodal agent architecture:
- 🧠 Brain – Google Gemini 2.0 Flash (chat + vision)
- 👁️ Vision – Screen understanding via screenshots
- 🎤 Voice – ElevenLabs (low-latency TTS)
- 🤖 Automation – nut.js (mouse, keyboard, OS control)
- 🖥️ UI Soul – Electron + PixiJS (desktop overlay)
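One way the brain and automation layers can meet is a small dispatcher: the model emits structured actions, and each action name maps to a handler that would, in the real app, drive nut.js. The action vocabulary below is an assumption for illustration, not Glitch's actual protocol:

```javascript
// Stub handlers: in the real app these would call nut.js
// (mouse.move, keyboard.type, etc.); here they just describe the action.
const handlers = {
  click: ({ x, y }) => `click at ${x},${y}`,
  type:  ({ text }) => `type "${text}"`,
  open:  ({ app })  => `open ${app}`,
};

// Route one structured action from the model to its handler.
function dispatch(action) {
  const handler = handlers[action.name];
  if (!handler) throw new Error(`unknown action: ${action.name}`);
  return handler(action.args);
}

console.log(dispatch({ name: "open", args: { app: "notepad" } })); // → "open notepad"
```

Keeping the vocabulary small and rejecting unknown actions is also a natural place to enforce the safety boundary mentioned above.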
To run Glitch you'll need:
- Node.js (v16 or higher)
- Python (v3.8 or higher) – required for mouse/keyboard automation
- A Google Gemini API Key (Get it here)
- An ElevenLabs API Key (Get it here)
1. Clone the repository:

   ```bash
   git clone https://github.com/KirthanNB/AI-Companion.git
   cd AI-Companion
   ```

2. Install Node.js dependencies:

   ```bash
   npm install
   ```

3. Install Python dependencies (required for Agent Mode automation):

   ```bash
   pip install -r requirements.txt
   ```

4. Configure the environment: create a `.env` file in the root directory (copy `.env.example`):

   ```
   GOOGLE_API_KEY=your_gemini_key
   ELEVEN_API_KEY=your_elevenlabs_key
   ELEVEN_VOICE_ID=your_voice_id
   ```

5. Run the application:

   ```bash
   npm start
   ```
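If one of those keys is missing, it helps to fail fast at startup instead of debugging a silent failure later. An optional sketch (not code from the repo) that checks the three keys the `.env` file defines:

```javascript
// Required keys, as listed in .env.example.
const REQUIRED_KEYS = ["GOOGLE_API_KEY", "ELEVEN_API_KEY", "ELEVEN_VOICE_ID"];

// Return the names of required keys that are absent or empty.
function missingKeys(env) {
  return REQUIRED_KEYS.filter((key) => !env[key]);
}

const missing = missingKeys(process.env);
if (missing.length) {
  console.error(`Missing keys in .env: ${missing.join(", ")}`);
}
```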
| Icon | Name | Description |
|---|---|---|
| 🎤 | Mic | Click to speak to the AI. |
| 🤖 | Agent Mode | Toggle autonomous mode for complex tasks. |
| 🛑 | Stop | Emergency stop for any active automation. |
- "Create a portfolio website" -> Generates a project folder and opens VS Code.
- "What is on my screen?" -> Analyzes the current window content.
- "Open YouTube and search for lofi beats" -> Automates the browser.
- "Type a python script to calculate fibonacci" -> Tyupes code into your active editor.
```
ai-companion/
├── src/
│   ├── ai/           # AI logic & GameAgent
│   ├── services/     # Automation & helper services
│   ├── renderer.js   # Frontend logic (PixiJS)
│   └── main.js       # Electron main process
├── assets/           # Images & sounds
└── package.json      # Dependencies & scripts
```
To create an installer for your OS:
```bash
# Windows
npm run build:win
```

We welcome contributions! Please see CONTRIBUTING.md for details on how to get started.
This project is licensed under the MIT License.
Made with ❤️ by Kirthan NB & Rohith M









