Prism

Transform academic papers into adorable manga with AI

Prism is an open-source tool that converts complex academic papers (PDFs) into engaging, easy-to-understand manga-style comics. Powered by Gemini's image generation capabilities, it makes learning fun and accessible for everyone.

Features

PDF to Manga Conversion - Upload any academic paper and get a beautifully illustrated manga
Multiple Art Styles - Choose from 3 unique manga themes:
- Kumomo - Original cute characters (kumo, nezu, papi) with consistent design across all panels
- Chiikawa - Cute, simple characters with soft pastel colors (Nagano style)
- Studio Ghibli - Dreamy watercolor atmosphere (Spirited Away style)
Character Consistency - Reference images ensure characters look the same throughout the entire manga
AI-Powered Storyboarding - Intelligent breakdown of complex concepts into visual panels with proper story ordering
Multi-Language Support - Generate manga in English, Chinese (中文), or Japanese (日本語)
CJK Text Optimization - Dynamic batch sizing for clear, readable Chinese/Japanese text
Real-time Generation Progress - Watch your manga come to life panel by panel

Example Output

How It Works

┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Upload PDF │ ──▶ │   Analyze   │ ──▶ │ Storyboard  │ ──▶ │  Generate   │
│             │     │  (English)  │     │  + Translate│     │ Manga Panels│
└─────────────┘     └─────────────┘     └─────────────┘     └─────────────┘

Upload - Drop your PDF academic paper
Analyze - AI extracts and understands the content in English (for accuracy)
Storyboard - Creates a visual narrative, then translates dialogue to target language
Generate - Gemini renders each manga panel with consistent character designs

Three-Step Translation Pipeline

For optimal quality, Prism uses a three-step process:

English Analysis - Technical content is analyzed in English for accuracy
English Storyboard - Panels and dialogue are created in English first
Translation - Final dialogue is translated to the target language (zh-CN, ja-JP)

This ensures technical terms are understood correctly before translation.

Quick Start

Prerequisites

Python 3.11+
Node.js 18+
OpenRouter API key (for Gemini access)

Installation

# Clone the repository
git clone https://github.com/raucvr/Prism.git
cd Prism

# Backend setup
cd backend
pip install -r requirements.txt

# Frontend setup
cd ../frontend
npm install

Configuration

Copy the example config file:

cp config/api_config.yaml.example config/api_config.yaml

Set your OpenRouter API key in config/api_config.yaml:

api_key: "your-openrouter-api-key"

Or use environment variable:

# Windows
set OPENROUTER_API_KEY=your-openrouter-api-key

# Linux/macOS
export OPENROUTER_API_KEY=your-openrouter-api-key

Running

# Terminal 1 - Start backend
cd backend
python -m uvicorn main:app --host 0.0.0.0 --port 8000 --reload

# Terminal 2 - Start frontend
cd frontend
npm run dev

Open http://localhost:3000 in your browser.

Project Structure

prism/
├── backend/
│   ├── engines/           # AI engine implementations
│   │   ├── base.py        # Abstract base classes
│   │   └── openrouter.py  # OpenRouter API wrapper
│   ├── services/
│   │   ├── pdf_parser.py      # PDF text extraction
│   │   ├── storyboarder.py    # Storyboard generation
│   │   └── manga_generator.py # Image generation with character consistency
│   ├── routes/            # API endpoints
│   └── main.py            # FastAPI application
├── frontend/
│   ├── src/
│   │   ├── app/           # Next.js app router
│   │   ├── components/    # React components
│   │   └── store/         # Zustand state management
│   └── package.json
├── config/
│   ├── api_config.yaml.example  # Config template
│   └── character_images/        # Reference images for Kumomo theme
│       ├── kumo.jpeg
│       ├── nezu.jpeg
│       └── papi.jpeg
└── README.md

Technical Details

Character Consistency

For the Kumomo theme, Prism ensures character consistency by:

Reference Images - Loading character design images for each API call
Explicit Mapping - Telling the model exactly which image corresponds to which character
Low Temperature - Using temperature=0.3 to reduce randomness
Negative Prompts - Explicitly forbidding character design deviations

Dynamic Batch Sizing

For CJK languages (Chinese, Japanese), Prism dynamically adjusts batch size:

Long dialogue (>200 chars): 1 panel per batch
Medium dialogue (>100 chars): 2 panels per batch
Short dialogue: 4 panels per batch (2x2 grid)

This ensures text remains readable even with complex characters.

Story Ordering

All panels are sorted by panel_number after generation to ensure the story flows correctly, even if the AI generates them out of order.

API Reference

Endpoints

Method	Endpoint	Description
`GET`	`/health`	Health check
`GET`	`/api/config`	Get current configuration
`POST`	`/api/manga/from-pdf`	Full pipeline: PDF to manga
`POST`	`/api/manga/storyboard`	Generate storyboard from PDF
`GET`	`/api/manga/progress`	Get generation progress

Example: Generate Manga from PDF

curl -X POST http://localhost:8000/api/manga/from-pdf \
  -F "file=@paper.pdf" \
  -F "theme=kumomo" \
  -F "language=zh-CN"

Configuration Options

Manga Settings

Option	Default	Description
`default_style`	`full_color_manga`	Art style
`default_theme`	`kumomo`	Default manga theme
`temperature`	`0.3`	Generation randomness (lower = more consistent)

Output Settings

Option	Default	Description
`image_format`	`png`	Output image format
`image_quality`	`95`	PNG quality
`max_width`	`1024`	Maximum image width
`max_height`	`1536`	Maximum image height

Supported Models

Provider	Model	Description
OpenRouter	`google/gemini-2.0-flash-exp-image-generation`	Recommended - Gemini image generation

Tech Stack

Backend

FastAPI - High-performance Python web framework
httpx - Async HTTP client
PyMuPDF - PDF parsing
Pillow - Image processing

Frontend

Next.js 14 - React framework with App Router
TailwindCSS - Utility-first CSS
Zustand - State management
Framer Motion - Animations
react-dropzone - File upload

Contributing

We welcome contributions!

Fork the repository
Create your feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

Roadmap

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

OpenRouter - API gateway for AI models
Chiikawa - Inspiration for cute art style

Made with ❤ by the Prism Team

Report Bug · Request Feature

Name		Name	Last commit message	Last commit date
Latest commit History 19 Commits
backend		backend
config		config
docs		docs
frontend		frontend
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
requirements.txt		requirements.txt
start.bat		start.bat
start.ps1		start.ps1
stop.bat		stop.bat
stop.ps1		stop.ps1

License

raucvr/Prism_manga-reader

Folders and files

Latest commit

History

Repository files navigation

Prism

Features

Example Output

How It Works

Three-Step Translation Pipeline

Quick Start

Prerequisites

Installation

Configuration

Running

Project Structure

Technical Details

Character Consistency

Dynamic Batch Sizing

Story Ordering

API Reference

Endpoints

Example: Generate Manga from PDF

Configuration Options

Manga Settings

Output Settings

Supported Models

Tech Stack

Backend

Frontend

Contributing

Roadmap

License

Acknowledgments

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages