src/markitdown/_markitdown.py
    # Convert content
    content_md = []
    h = html2text.HTML2Text()
    h.body_width = 0  # Don't wrap lines

Hi, could you check whether this can use the existing HtmlConverter?
markitdown/src/markitdown/_markitdown.py
Lines 183 to 223 in cb66b35

@0xRaduan love this PR. We already have a dependency for HTML to text (markdownify) in the HTML converter. Can you check if that would be sufficient?

Hey @gagb, sorry, I was on a long vacation. Going to take a look right now...

Thanks for this, really looking forward to it!

Can we have an update on this? Thanks

packages/markitdown/src/markitdown/converters/_epub_converter.py
    return "![%s](%s%s)" % (alt, src, title_part)

    def convert_em(self, el: Any, text: str, convert_as_inline: bool) -> str:

noticed that it doesn't have an tag, and that's used in Epub as far as I know

cc @gagb: do you think we can merge this? I resolved all the merge conflicts, as far as I can see.

Also cc @afourney, since I see you've been merging the latest PRs into main.

@gagb, is this still waiting on a response from me? Is there a timeline for getting this merged?

@0xRaduan Apologies for the delay. We're a super small team, with several large projects (e.g., AutoGen). I'll work on getting this in, and conflicts resolved, this weekend.

Ok, on second glance, EbookLib is AGPL, which is very strong copyleft. I'm not clear we can include it here. I can look for an alternative, or I can help you set it up as a 3rd-party plugin that you can host. LMK

* Adapted #123 to not use epublib.
* Updated README.md
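
For readers following the licensing point above: an EPUB file is a ZIP archive whose `META-INF/container.xml` points at a package (OPF) document, and that document's manifest and spine give the content files in reading order. The sketch below shows that this much can be read with the Python standard library alone. It is a minimal illustration assuming a well-formed EPUB; `epub_spine_paths` is a hypothetical helper name and is not taken from this PR or from #1131.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces defined by the EPUB container and package formats.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}


def epub_spine_paths(epub_file):
    """Return archive paths of the content documents in reading order.

    Hypothetical helper for illustration only. For untrusted input, a
    hardened XML parser would be a safer choice than ElementTree.
    """
    with zipfile.ZipFile(epub_file) as zf:
        # container.xml names the package (OPF) document.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
        opf_path = rootfile.attrib["full-path"]
        opf_dir = posixpath.dirname(opf_path)

        package = ET.fromstring(zf.read(opf_path))
        # The manifest maps item ids to hrefs relative to the OPF file.
        manifest = {
            item.attrib["id"]: item.attrib["href"]
            for item in package.findall("opf:manifest/opf:item", OPF_NS)
        }
        # The spine lists item ids in reading order.
        return [
            posixpath.normpath(
                posixpath.join(opf_dir, manifest[ref.attrib["idref"]])
            )
            for ref in package.findall("opf:spine/opf:itemref", OPF_NS)
        ]
```

Each returned path can then be read from the archive with `zf.read(...)` and passed to whatever HTML-to-Markdown converter the project already uses.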

Closed in #1131

Thanks, closing this PR.

mad768063 left a comment
Symferopolskaya 2L, Dnipro, 49005, Ukraine
    "charset-normalizer",
    "openai",
    "ebooklib",
    "azure-ai-documentintelligence",
Addresses #88.
Adds new converter + new test.