🎬 Educational Video Pipeline

An AI-powered system that automatically generates animated educational videos from text prompts using Google ADK agents, Manim animations, and ElevenLabs narration.

Status: Private Repository | Proprietary Project

🏗️ System Architecture

The pipeline consists of three main agent groups working in sequence:

1. Coordinator Agent (Main Orchestrator)

- Manages the entire pipeline flow
- Coordinates transcript generation and video generation
- Handles state management across all sub-agents
- Will manage the Watermark Agent in a future update

2. Transcript Generation Pipeline (7 Sequential Agents)

Topic Researcher → Content Structurer → Script Writer → Speech Formatter 
    ↓
Audio Transcriber ← ElevenLabs TTS
    ↓
Scene Summary

3. Video Generation Pipeline (3 Parallel Agents)

Video Agent → [Manim Code Gen, Render Checker, Scene Validator]
    ↓
Concatenator (combines all scenes + audio + music)
    ↓
[Future: Watermark Agent → Final branded video]

📦 Prerequisites

Python 3.8+ installed and in PATH
FFmpeg for video/audio processing
Manim Community Edition for animations
API Keys for:
- Google ADK/Gemini
- ElevenLabs (TTS)
- Groq (STT)

🚀 Installation

Step 1: Clone the Repository

git clone <your-repo-url>
cd educational-video-pipeline

Step 2: Create Virtual Environment

# Windows
python -m venv venv
venv\Scripts\activate

# Mac/Linux
python3 -m venv venv
source venv/bin/activate

Step 3: Install Python Dependencies

pip install -r requirements.txt

Step 4: Install External Dependencies

FFmpeg

Windows: Download from ffmpeg.org, extract, and add to PATH
Mac: brew install ffmpeg
Linux: sudo apt install ffmpeg (Ubuntu/Debian) or sudo dnf install ffmpeg (Fedora)

Manim

# Should be installed via requirements.txt, but if issues arise:
pip install manim

Step 5: Verify Installations

# Check FFmpeg
ffmpeg -version

# Check Manim
manim --version

# Check Python packages
python -c "import elevenlabs, groq, manim; print('All packages installed!')"
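
The three checks above can also be run from a single Python script. The following is a minimal sketch and not part of the repository: `check_environment` and `report` are hypothetical helpers, and the tool and package names are simply the ones used in the commands above.

```python
import importlib.util
import shutil

# Names taken from the verification commands above.
REQUIRED_TOOLS = ("ffmpeg", "manim")
REQUIRED_PACKAGES = ("elevenlabs", "groq", "manim")


def check_environment(tools=REQUIRED_TOOLS, packages=REQUIRED_PACKAGES):
    """Return a dict mapping each tool or package to True if it was found."""
    results = {}
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        results[f"tool:{tool}"] = shutil.which(tool) is not None
    for package in packages:
        # find_spec returns None when a top-level package is not installed.
        results[f"package:{package}"] = importlib.util.find_spec(package) is not None
    return results


def report(results):
    """Print one line per check and return True only if everything passed."""
    for name, found in sorted(results.items()):
        print(f"{'OK     ' if found else 'MISSING'} {name}")
    return all(results.values())
```

Calling `report(check_environment())` prints one line per dependency, which makes it easier to see exactly which piece is missing.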

⚙️ Configuration

1. Create Environment File

Create a .env file in the project root:

# API Keys
ELEVENLABS_API_KEY=your_elevenlabs_key_here
GROQ_API_KEY=your_groq_key_here
GOOGLE_API_KEY=your_google_key_here  # Google ADK/Gemini
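
A quick way to catch a missing or placeholder key before starting a run is to validate the file contents first. The sketch below is an illustration only: how join.py actually loads .env is not shown here, `parse_env_text` is a simplified stand-in parser, and `missing_keys` is a hypothetical helper. The key names come from the example above.

```python
REQUIRED_KEYS = ("ELEVENLABS_API_KEY", "GROQ_API_KEY", "GOOGLE_API_KEY")


def parse_env_text(text):
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as "value  # note".
        value = value.split(" #", 1)[0].strip().strip("'\"")
        values[key.strip()] = value
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return required keys that are absent, empty, or still placeholders."""
    return [
        key for key in required
        if not values.get(key) or values[key].startswith("your_")
    ]
```

For example, `missing_keys(parse_env_text(open(".env").read()))` returns an empty list once every key has a real value.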

2. Update Manim Path

Edit join.py and update the Manim executable path:

# Line ~41 in join.py
MANIM_EXECUTABLE = r"C:\path\to\your\manim.exe"  # Windows
# or
MANIM_EXECUTABLE = "/usr/local/bin/manim"  # Mac/Linux
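
If hard-coding the path is inconvenient, the executable can usually be located on PATH instead. This is a sketch under that assumption, not the code in join.py: `resolve_manim` is a hypothetical helper, and its `configured_path` argument plays the role of `MANIM_EXECUTABLE`.

```python
import shutil
from pathlib import Path


def resolve_manim(configured_path=None, which=shutil.which):
    """Return a usable Manim executable path, or None if nothing is found.

    `which` is injectable so the lookup can be tested without Manim installed.
    """
    # Prefer the explicitly configured path when it points at a real file.
    if configured_path and Path(configured_path).is_file():
        return str(configured_path)
    # Otherwise fall back to whatever "manim" resolves to on PATH.
    return which("manim")
```

A `None` result means neither the configured path nor PATH yields a Manim executable, which corresponds to the "Manim not found" case in Troubleshooting.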

3. Prepare Background Music

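The concatenate agent instruction currently points at this hard-coded path for the background music file:

C:\Users\leg\Documents\elearn\agent\music\background.mp3

Either place your file at that location, or update the path in the concatenate agent instruction to point at your own file (for example, music/background.mp3 in the project root).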

🎯 Running the Agent

1. Start the Main Application

python join.py

2. Input Your Request

When prompted, paste your educational video request. Format:

Create a [duration]-minute [level] educational video about [topic].
Include [specific requirements].

Example:
Create a 3-minute beginner-level educational video about photosynthesis.
Include colorful animations and simple analogies for middle school students.
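
When requests are generated from other data, the template can be filled in programmatically. The helper below is a hypothetical convenience and not part of the repository; it only reproduces the format shown above.

```python
def build_request(duration_minutes, level, topic, requirements=""):
    """Fill in the request template shown above."""
    first = (
        f"Create a {duration_minutes}-minute {level} "
        f"educational video about {topic}."
    )
    if not requirements:
        return first
    # Strip a trailing period so the sentence does not end with "..".
    return f"{first}\nInclude {requirements.rstrip('.')}."
```

Calling `build_request(3, "beginner-level", "photosynthesis", "colorful animations and simple analogies for middle school students")` reproduces the example request.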

3. Wait for Processing

The pipeline will:

- Research the topic (30-60 seconds)
- Generate script and narration (1-2 minutes)
- Create and render animations (2-5 minutes per scene)
- Concatenate final video (30 seconds)

Total time: 5-15 minutes depending on video length
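
The per-stage figures can be combined into a rough estimate for a given number of scenes. This is a back-of-the-envelope sketch that assumes scenes are rendered one after another, which this README does not state; `estimate_minutes` is a hypothetical helper.

```python
def estimate_minutes(scene_count):
    """Return (low, high) total minutes from the per-stage ranges above."""
    research = (0.5, 1.0)      # 30-60 seconds
    script = (1.0, 2.0)        # 1-2 minutes
    per_scene = (2.0, 5.0)     # 2-5 minutes per scene
    concatenate = (0.5, 0.5)   # 30 seconds
    low = research[0] + script[0] + per_scene[0] * scene_count + concatenate[0]
    high = research[1] + script[1] + per_scene[1] * scene_count + concatenate[1]
    return low, high
```

Under these assumptions, `estimate_minutes(2)` gives `(6.0, 13.5)`, which falls inside the 5-15 minute range quoted above.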

4. Find Your Video

The completed video will be in:

final_videos/educational_video_[timestamp].mp4
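
Because the file name contains a timestamp, the newest video is easiest to find by modification time. The sketch below assumes only the directory and file name pattern shown above; `latest_video` is a hypothetical helper.

```python
from pathlib import Path


def latest_video(output_dir="final_videos"):
    """Return the most recently modified educational_video_*.mp4, or None."""
    candidates = list(Path(output_dir).glob("educational_video_*.mp4"))
    if not candidates:
        return None
    # Pick the file with the largest modification time.
    return max(candidates, key=lambda path: path.stat().st_mtime)
```

The function returns `None` when the directory is empty or does not exist, so it can be called safely before the first run has finished.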

🔄 How It Works

Phase 1: Content Generation

1. Topic Researcher - Searches for comprehensive information
2. Content Structurer - Creates timed educational outline
3. Script Writer - Converts outline to conversational script
4. Speech Formatter - Adds pauses and timing for TTS
5. ElevenLabs TTS - Generates professional narration
6. Audio Transcriber - Creates timestamped transcription
7. Scene Summary - Breaks content into animated scenes

Phase 2: Video Generation

Video Agent coordinates the creation of each scene:
- Manim Code Gen - Generates animation code
- Render Checker - Executes and monitors rendering
- Scene Validator - Ensures timing matches narration

Phase 3: Final Assembly

Concatenator - Combines all scenes with audio and background music
[Future] Watermark Agent - Adds branding and outro
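
The order of the three phases can be modeled in plain Python. This sketch does not use Google ADK and is not the code in join.py: `run_pipeline`, `record_stage`, and the shared state dict are assumptions made for illustration, and the three scene agents run one after another here for simplicity even though the architecture section describes them as parallel.

```python
TRANSCRIPT_STAGES = [
    "Topic Researcher",
    "Content Structurer",
    "Script Writer",
    "Speech Formatter",
    "ElevenLabs TTS",
    "Audio Transcriber",
    "Scene Summary",
]
SCENE_STAGES = ["Manim Code Gen", "Render Checker", "Scene Validator"]


def run_pipeline(request, scene_count, run_stage):
    """Call run_stage(name, state) for every step and return the final state.

    run_stage is a placeholder for a real agent call: it receives the shared
    state dict and returns the updated one.
    """
    state = {"request": request, "history": []}
    for name in TRANSCRIPT_STAGES:            # Phase 1: content generation
        state = run_stage(name, state)
    for scene in range(1, scene_count + 1):   # Phase 2: one pass per scene
        for name in SCENE_STAGES:
            state = run_stage(f"{name} (scene {scene})", state)
    return run_stage("Concatenator", state)   # Phase 3: final assembly


def record_stage(name, state):
    """Stand-in stage that only records the order in which it was called."""
    return {**state, "history": state["history"] + [name]}
```

Running `run_pipeline("demo", 2, record_stage)["history"]` lists the seven transcript stages, then three entries per scene, then the Concatenator.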

📁 Project Structure

educational-video-pipeline/
├── join.py              # Main application orchestrator
├── prompt.py            # Agent instructions and prompts
├── stt_tools.py         # Speech-to-text functionality
├── tts_tool.py          # Text-to-speech functionality
├── .env                 # API keys (create from .env.example)
├── requirements.txt     # Python dependencies
├── generated_audio/     # TTS output files
├── transcriptions/      # STT transcriptions
├── media/               # Manim render files
├── final_videos/        # Completed videos
└── music/
    └── background.mp3   # Background music
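
The working directories in this layout can be created up front. The pipeline may well create them on its own, so treat this as an optional sketch; `ensure_directories` is a hypothetical helper and the directory names are copied from the tree above.

```python
from pathlib import Path

# Directory names copied from the project structure above.
OUTPUT_DIRS = ("generated_audio", "transcriptions", "media", "final_videos", "music")


def ensure_directories(project_root="."):
    """Create any missing working directories and return the names created."""
    created = []
    for name in OUTPUT_DIRS:
        path = Path(project_root) / name
        if not path.exists():
            path.mkdir(parents=True)
            created.append(name)
    return created
```

The return value is empty when everything already exists, so the function is safe to call before every run.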

🔧 Troubleshooting

Common Issues

1. "Manim not found"

- Verify the Manim installation: manim --version
- Update the MANIM_EXECUTABLE path in join.py
- Try the full path to the executable, for example C:\Python39\Scripts\manim.exe

2. "API Error" or Rate Limits

Check .env file has valid API keys
Wait 30 seconds and retry
The system has automatic retry logic (3 attempts)
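
A retry loop of this kind can be sketched as follows. This is not the project's implementation: `call_with_retry` is a hypothetical helper, the three attempts mirror the number stated above, and the 30-second delay is an assumption borrowed from the manual advice.

```python
import time


def call_with_retry(func, attempts=3, delay_seconds=30, sleep=time.sleep):
    """Call func(); on an exception, wait and retry up to `attempts` times."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception as error:  # a real client would catch narrower errors
            last_error = error
            if attempt < attempts:
                # Wait before the next attempt, but not after the last one.
                sleep(delay_seconds)
    raise last_error
```

The `sleep` parameter exists so the delay can be replaced in tests. If every attempt fails, the last exception is re-raised so the caller still sees the original error.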

3. "FFmpeg not found"

Ensure FFmpeg is in system PATH
Test with: ffmpeg -version
Restart terminal after PATH updates

4. Duration Mismatch

- The system automatically adjusts timing mismatches of up to ±0.5 seconds
- For larger mismatches, the scene is regenerated
- Each scene is attempted at most 30 times
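
The rules above amount to a small decision function. The sketch below is one reading of them and not the Scene Validator's actual code: `next_action` and its return labels are hypothetical, and only the 0.5-second tolerance and the limit of 30 attempts come from this README.

```python
TOLERANCE_SECONDS = 0.5
MAX_ATTEMPTS = 30


def next_action(scene_seconds, narration_seconds, attempt):
    """Decide what to do with a rendered scene.

    Returns "accept", "adjust", "regenerate", or "give_up".
    """
    difference = abs(scene_seconds - narration_seconds)
    if difference == 0:
        return "accept"
    if difference <= TOLERANCE_SECONDS:
        return "adjust"          # small mismatch: fix the timing in place
    if attempt >= MAX_ATTEMPTS:
        return "give_up"         # attempt limit reached
    return "regenerate"          # large mismatch: render the scene again
```

For instance, a 12-second scene against 10 seconds of narration is regenerated, while a 10.3-second scene is only adjusted.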

5. Import Errors

# Reinstall all dependencies
pip install --upgrade -r requirements.txt

# For specific package issues
pip uninstall [package_name]
pip install [package_name]

Debug Mode

To capture all console output (both stdout and stderr) in a log file, run:

python join.py > debug.log 2>&1

🎨 Customization

Voice Options

Edit tts_tool.py to change voices:

voice_id = "21m00Tcm4TlvDq8ikWAM"  # Rachel (default)
# Other options:
# "EXAVITQu4vr4xnSDxMaL" - Bella
# "ErXwobaYiN019PkySvjV" - Antoni
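
Keeping the IDs in one lookup table makes it possible to switch voices by name. This is a hypothetical refactoring and not the current contents of tts_tool.py; the IDs are the three listed above.

```python
VOICES = {
    "rachel": "21m00Tcm4TlvDq8ikWAM",  # default
    "bella": "EXAVITQu4vr4xnSDxMaL",
    "antoni": "ErXwobaYiN019PkySvjV",
}
DEFAULT_VOICE = "rachel"


def voice_id_for(name=None):
    """Return the ElevenLabs voice ID for a voice name (case-insensitive)."""
    key = (name or DEFAULT_VOICE).strip().lower()
    if key not in VOICES:
        raise KeyError(f"Unknown voice {name!r}; choose from {sorted(VOICES)}")
    return VOICES[key]
```

With this table, `voice_id = voice_id_for("Bella")` selects the Bella voice, and calling `voice_id_for()` with no argument keeps the Rachel default.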

Animation Styles

Modify prompts in prompt.py to change animation preferences

Background Music

Volume: Adjust music_volume in concatenate agent (default: 0.15)
File: Update path in concatenate agent instruction
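
To show what a music volume of 0.15 means in practice, here is one way to mix a music track under existing narration with FFmpeg's `volume` and `amix` filters. It is a sketch and not necessarily how the concatenate agent does it: `build_mix_command` and the file names are hypothetical, and since `amix` can rescale its inputs, the levels may need tuning.

```python
def build_mix_command(video_path, music_path, output_path, music_volume=0.15):
    """Return an ffmpeg argument list that mixes music under the narration."""
    audio_filter = (
        f"[1:a]volume={music_volume}[bg];"              # quieten the music
        "[0:a][bg]amix=inputs=2:duration=first[aout]"   # stop with the narration
    )
    return [
        "ffmpeg", "-y",
        "-i", str(video_path),    # input 0: video with narration
        "-i", str(music_path),    # input 1: background music
        "-filter_complex", audio_filter,
        "-map", "0:v", "-map", "[aout]",
        "-c:v", "copy",           # keep the rendered video stream as-is
        str(output_path),
    ]
```

The returned list can be passed to `subprocess.run(command, check=True)`. Building it as a list avoids shell quoting problems with paths that contain spaces.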

🗺️ Roadmap

Upcoming Features

Phase 1: Enhanced TTS Options (Q1 2025)

OpenAI TTS Integration - Add support for OpenAI's text-to-speech models
Google Cloud TTS - Integrate Google's WaveNet voices
Amazon Polly - Add AWS Polly for more voice variety
Voice Cloning - Support for custom voice cloning with ElevenLabs
Multi-language Support - Enable video generation in multiple languages

Phase 2: Watermark Agent (Q2 2025)

Watermark Agent - New automated branding agent
```
Concatenator → Watermark Agent → Final Output
```
Features:
- Auto-detect video dimensions and add appropriate watermark
- Support for image (PNG/SVG) and video watermarks
- Configurable position (corner/center/custom)
- Fade in/out animations
- Duration control (full video or last X seconds)
Example workflow:
```
# The agent will automatically:
1. Take the concatenated video
2. Apply watermark based on config
3. Add outro video if specified
4. Output final branded video
```
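
Since the Watermark Agent is not implemented yet, the following only sketches how a configurable-position PNG watermark could be applied with FFmpeg's `overlay` filter. `build_watermark_command`, the position names, and the margin are all assumptions; fades, SVG input, and the outro are not covered.

```python
# x/y expressions for the overlay filter: W,H = video size, w,h = watermark size.
POSITIONS = {
    "top-left": ("{m}", "{m}"),
    "top-right": ("W-w-{m}", "{m}"),
    "bottom-left": ("{m}", "H-h-{m}"),
    "bottom-right": ("W-w-{m}", "H-h-{m}"),
    "center": ("(W-w)/2", "(H-h)/2"),
}


def build_watermark_command(video_path, image_path, output_path,
                            position="bottom-right", margin=10):
    """Return an ffmpeg argument list that overlays an image on a video."""
    x_template, y_template = POSITIONS[position]
    x = x_template.format(m=margin)
    y = y_template.format(m=margin)
    return [
        "ffmpeg", "-y",
        "-i", str(video_path),    # input 0: concatenated video
        "-i", str(image_path),    # input 1: watermark image
        "-filter_complex", f"[0:v][1:v]overlay=x={x}:y={y}[v]",
        "-map", "[v]", "-map", "0:a?",   # keep the audio track if there is one
        "-c:a", "copy",
        str(output_path),
    ]
```

Because the positions are expressed relative to `W` and `H`, the same command works for any video dimensions without detecting them first.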
Outro Integration
- Pre-made outro videos with smooth transitions
- Dynamic text overlay (channel name, social links)
- Subscribe button animations
- End screen templates matching video style

Phase 3: Advanced Features (Q3-Q4 2025)

Quality Presets - Platform-specific optimization
- YouTube (1080p/4K horizontal)
- TikTok/Shorts (9:16 vertical)
- Instagram Reels (9:16 with safe zones)
- Twitter/X video specifications
Interactive Elements - Clickable areas in videos
Multiple Video Formats - Support for vertical videos (Shorts/Reels)
Batch Processing - Generate multiple videos from CSV input
Cloud Deployment - Deploy pipeline to cloud services
Web Interface - Browser-based UI for easier access
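
The planned quality presets could be represented as a simple lookup table. Everything in this sketch is hypothetical because the feature does not exist yet: the preset names are invented, the pixel sizes are common values for each aspect ratio, and Twitter/X is left out because no specification is given here.

```python
from math import gcd

# (width, height) in pixels for each hypothetical preset.
PRESETS = {
    "youtube_1080p": (1920, 1080),
    "youtube_4k": (3840, 2160),
    "shorts": (1080, 1920),
    "reels": (1080, 1920),
}


def aspect_ratio(preset):
    """Return the reduced aspect ratio of a preset, such as "16:9"."""
    width, height = PRESETS[preset]
    divisor = gcd(width, height)
    return f"{width // divisor}:{height // divisor}"


def orientation(preset):
    """Return "vertical" for portrait presets, otherwise "horizontal"."""
    width, height = PRESETS[preset]
    return "vertical" if height > width else "horizontal"
```

A table like this would let the render and concatenation steps look up the output size by platform name instead of hard-coding it.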

Current Development

🔨 Working on: Watermark Agent implementation for automatic branding

📞 Support

For issues or questions:

Check the Troubleshooting section
Review error messages in the console
Ensure all prerequisites are properly installed

🔒 Contributing

This is a private repository. Contributions are limited to authorized team members only.

📄 License

Built with ❤️ using Google ADK, Manim, ElevenLabs, and Groq
