Feat epub inline images #1537

Open
tosmart01 wants to merge 3 commits into microsoft:main from tosmart01:feat-epub-inline-images

tosmart01 commented Jan 15, 2026

Background

Add EPUB image handling so embedded images are preserved during conversion.

Changes

  • Build a manifest path -> media-type map for MIME resolution.
  • Inline local EPUB images as data URIs when keep_data_uris=True.
  • Resolve relative/decoded paths and fall back to mimetypes when needed.
  • Add an EPUB image conversion test using a sample file.
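
For reviewers, the first three changes can be sketched roughly as follows. This is an illustrative outline using only the standard library, not the code in this PR: the function names (`build_media_type_map`, `resolve_image_path`, `to_data_uri`) and the `application/octet-stream` last-resort default are assumptions made for the example.

```python
from __future__ import annotations

import base64
import mimetypes
import posixpath
import xml.etree.ElementTree as ET
from urllib.parse import unquote

# Namespace used by <manifest>/<item> elements in an EPUB package (OPF) file.
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}


def build_media_type_map(opf_xml: str, opf_path: str) -> dict[str, str]:
    """Map each manifest item's archive path to its declared media-type.

    Hypothetical helper: hrefs are URL-decoded and resolved relative to
    the directory that contains the OPF file.
    """
    base_dir = posixpath.dirname(opf_path)
    root = ET.fromstring(opf_xml)
    media_types: dict[str, str] = {}
    for item in root.findall("opf:manifest/opf:item", OPF_NS):
        href = item.get("href")
        media_type = item.get("media-type")
        if href and media_type:
            path = posixpath.normpath(posixpath.join(base_dir, unquote(href)))
            media_types[path] = media_type
    return media_types


def resolve_image_path(src: str, doc_path: str) -> str:
    """Resolve an <img src> relative to the content document that uses it."""
    doc_dir = posixpath.dirname(doc_path)
    return posixpath.normpath(posixpath.join(doc_dir, unquote(src)))


def to_data_uri(path: str, data: bytes, media_types: dict[str, str]) -> str:
    """Encode image bytes as a data URI.

    The manifest media-type is preferred; mimetypes is the fallback, and
    application/octet-stream is an assumed default when both fail.
    """
    mime = (
        media_types.get(path)
        or mimetypes.guess_type(path)[0]
        or "application/octet-stream"
    )
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

In this sketch, a converter would read the image bytes from the EPUB zip archive at the resolved path and substitute the returned data URI for the original `src` only when `keep_data_uris` is enabled.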

Files

  • packages/markitdown/src/markitdown/converters/_epub_converter.py
  • packages/markitdown/tests/test_epub_images.py
  • packages/markitdown/tests/test_files/test_epub_images.epub

Testing

  • packages/markitdown/tests/test_epub_images.py

tosmart01 (Author) commented

@microsoft-github-policy-service agree