An advanced, configurable OCR tool for extracting text from images with preprocessing and parallel processing capabilities. Optimized for Chinese text but supports multiple languages.

OshekharO/Image2Text-Pro

Image2Text-Pro

Advanced OCR tool for extracting text from images with preprocessing and parallel processing.

Python OpenCV Tesseract

Features ✨

  • 📷 Supports multiple image formats (JPG, PNG, TIFF, BMP)
  • 🔍 Advanced image preprocessing for better OCR accuracy
  • ⚡ Parallel processing for fast batch operations
  • 🌍 Multi-language support (Chinese by default)
  • 📊 Progress tracking and performance metrics
  • 🛠️ Configurable preprocessing and OCR parameters
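
To illustrate the parallel batch processing listed above, here is a minimal sketch, not the project's actual implementation. The names `find_images`, `run_batch`, and `ocr_fn` are hypothetical, and the choice of a thread pool is an assumption; the real tool may use processes or a different layout.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Dict, Iterable, List

# Extensions for the formats named in the feature list, plus the common
# .jpeg and .tif spellings (an assumption about what the tool accepts).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".tiff", ".tif", ".bmp"}


def find_images(folder: Path) -> List[Path]:
    """Return the supported image files directly inside `folder`, sorted."""
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def run_batch(
    images: Iterable[Path],
    ocr_fn: Callable[[Path], str],
    workers: int = 4,
) -> Dict[str, str]:
    """Apply `ocr_fn` to every image in a thread pool.

    `ocr_fn` is a hypothetical callable that takes an image path and
    returns the extracted text. Results are keyed by file name.
    """
    image_list = list(images)  # materialize so it can be iterated twice
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so zip pairs each path with its text.
        texts = list(pool.map(ocr_fn, image_list))
    return {img.name: text for img, text in zip(image_list, texts)}
```

Threads are a reasonable fit when the OCR call spends its time in an external Tesseract process, which is how the `pytesseract` wrapper works; CPU-heavy preprocessing in pure Python would favor a process pool instead.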

Installation 🛠️

  1. Install Tesseract OCR:

    # On Ubuntu/Debian
    sudo apt install tesseract-ocr
    sudo apt install libtesseract-dev
    
    # On macOS
    brew install tesseract
  2. Install Python dependencies:

    pip install -r requirements.txt
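
After installing, it can help to confirm that the Tesseract binary is reachable and that the language data you need is present. The sketch below is not part of the project; `tesseract_version`, `parse_language_list`, and `installed_languages` are hypothetical helper names. It relies only on the `tesseract --version` and `tesseract --list-langs` commands, and assumes the language listing starts with a "List of available languages" header line.

```python
import shutil
import subprocess
from typing import List, Optional


def _run(binary_path: str, flag: str) -> str:
    """Run the binary with one flag and return its combined text output."""
    result = subprocess.run(
        [binary_path, flag], capture_output=True, text=True, check=False
    )
    # Some Tesseract versions write this information to stderr.
    return (result.stdout or result.stderr).strip()


def tesseract_version(binary: str = "tesseract") -> Optional[str]:
    """Return the first line of `--version`, or None if the binary is missing."""
    path = shutil.which(binary)
    if path is None:
        return None
    output = _run(path, "--version")
    return output.splitlines()[0] if output else ""


def parse_language_list(output: str) -> List[str]:
    """Extract language codes from `--list-langs` output, skipping the header."""
    return [
        line.strip()
        for line in output.splitlines()
        if line.strip() and not line.startswith("List of")
    ]


def installed_languages(binary: str = "tesseract") -> List[str]:
    """Return installed language codes, or an empty list if the binary is missing."""
    path = shutil.which(binary)
    return parse_language_list(_run(path, "--list-langs")) if path else []
```

If `chi_sim` is missing from `installed_languages()`, the Chinese language data needs to be installed separately from the base Tesseract package.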

Usage 🚀

  1. Basic Command:

    python program.py -i input_images -o output_texts

  2. Advanced Usage:

    python program.py \
      -i ./photos \
      -o ./extracted_texts \
      --lang eng+chi_sim \
      --psm 11 \
      --workers 8
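
The flags in the commands above could be parsed roughly as follows. This is a sketch under stated assumptions, not the contents of `program.py`: the destination names, help texts, and every default value (`chi_sim`, page segmentation mode 3, one worker per CPU) are guesses made for illustration. Only the flag names themselves come from the usage examples.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the flags shown in the usage examples."""
    parser = argparse.ArgumentParser(
        prog="program.py", description="Extract text from a folder of images."
    )
    parser.add_argument("-i", dest="input_dir", required=True,
                        help="folder containing the input images")
    parser.add_argument("-o", dest="output_dir", required=True,
                        help="folder that receives the extracted text files")
    # Tesseract joins several languages with "+", e.g. eng+chi_sim.
    # The chi_sim default is an assumption based on "Chinese by default".
    parser.add_argument("--lang", default="chi_sim",
                        help="Tesseract language code(s), joined with '+'")
    # Tesseract page segmentation modes run from 0 to 13; 3 is Tesseract's
    # own default, used here as an assumed default for the tool.
    parser.add_argument("--psm", type=int, default=3, choices=range(0, 14),
                        help="Tesseract page segmentation mode")
    # Assumed default: one worker per available CPU.
    parser.add_argument("--workers", type=int, default=os.cpu_count() or 1,
                        help="number of parallel workers")
    return parser


def tesseract_config(psm: int) -> str:
    """Format the page segmentation mode as a Tesseract config string."""
    return f"--psm {psm}"
```

A config string in this form is what `pytesseract.image_to_string` accepts through its `config` argument, alongside `lang` for the language codes. Mode 11 in the advanced example is Tesseract's sparse-text mode, which looks for as much text as possible in no particular order.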

Contributing 🤝

We welcome contributions! Please:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/your-feature)
  3. Commit your changes (git commit -m 'Add some feature')
  4. Push to the branch (git push origin feature/your-feature)
  5. Open a Pull Request

Made with ❤️ and Python

OCR accuracy may vary depending on image quality and language complexity.
