feat: add PDF Keyword Highlighter script (closes #478) #522

Open
SurfyPenguin wants to merge 3 commits into wasmerio:main from SurfyPenguin:add-pdf-highlighter

Conversation

@SurfyPenguin
PR Title

feat: add PDF Keyword Highlighter script (closes #478)

Summary

Added a new command-line Python script that highlights specified keywords in PDF files using PyMuPDF, complete with a dedicated folder, README, and entry in the main repository README.

Description

This pull request implements the PDF keyword highlighter requested in issue #478. The script writes a new highlighted output file and leaves the original PDF unchanged.

The changes are as follows:

  • Created new folder PDF Highlighter Script/ with pdf_highlight.py and a README.md
  • Implemented efficient keyword highlighting using page.get_text("words") for fast text extraction
  • Added support for multiple keywords, optional case-sensitive search (-s flag), and punctuation stripping for accurate matching (e.g., "keyword;" matches "keyword")
  • Added a formatted table of per-page and total highlight statistics
  • Updated root README.md to add the new script entry in alphabetical order
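
For reviewers, the matching approach described in the list above could look roughly like the sketch below. It is not the code in pdf_highlight.py: the function names (normalize, matching_boxes, highlight_pdf) are hypothetical, and it only assumes the word-tuple layout that PyMuPDF's page.get_text("words") returns.

```python
import string
from collections import Counter


def normalize(word, case_sensitive=False):
    """Strip surrounding punctuation so "keyword;" compares equal to "keyword"."""
    core = word.strip(string.punctuation)
    return core if case_sensitive else core.lower()


def matching_boxes(words, keywords, case_sensitive=False):
    """Return the (x0, y0, x1, y1) boxes of words that match any keyword.

    `words` has the shape returned by PyMuPDF's page.get_text("words"):
    tuples of (x0, y0, x1, y1, text, block_no, line_no, word_no).
    """
    targets = {normalize(k, case_sensitive) for k in keywords}
    targets.discard("")  # ignore keywords that were pure punctuation
    return [w[:4] for w in words if normalize(w[4], case_sensitive) in targets]


def highlight_pdf(src, dst, keywords, case_sensitive=False):
    """Write a highlighted copy of `src` to `dst`; return per-page hit counts."""
    import fitz  # PyMuPDF; imported lazily so the helpers above work without it

    counts = Counter()
    with fitz.open(src) as doc:
        for page in doc:
            boxes = matching_boxes(page.get_text("words"), keywords, case_sensitive)
            for box in boxes:
                page.add_highlight_annot(fitz.Rect(box))
                counts[page.number + 1] += 1  # report 1-based page numbers
        doc.save(dst)  # dst must differ from src, so the original stays untouched
    return counts
```

Keeping the matching logic separate from the PyMuPDF calls means the punctuation and case handling can be unit-tested with plain tuples, without needing a real PDF.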

Checks

in the repository

  • Made no changes that degrade the functioning of the repository
  • Gave each commit a better title (unlike updated README.md)

in the PR

  • Followed the format of the pull_request_template
  • Kept the Pull Request small (for the creator's welfare)
  • Tested the changes you made

Thank You,

Amartya Anand