@jmooring
Member

Fixes #13401

@jmooring
Member Author

jmooring commented Aug 11, 2025

This PR sanitizes TOC heading titles by only allowing inline HTML elements, excluding anchor elements. That means that this Markdown:

## Some _emphasis_ and a [link](foo)

## A <div>div</div> and a <p>paragraph</p>

Produces this TOC:

<nav id="TableOfContents">
  <ul>
    <li><a href="#some-emphasis-and-a-link">Some <em>emphasis</em> and a link</a></li>
    <li><a href="#a-div-and-a-paragraph">A div and a paragraph</a></li>
  </ul>
</nav>

Instead of this:

<nav id="TableOfContents">
  <ul>
    <li><a href="#some-emphasis-and-a-link">Some <em>emphasis</em> and a <a href="foo">link</a></a></li>
    <li><a href="#a-div-and-a-paragraph">A <div>div</div> and a <p>paragraph</p></a></li>
  </ul>
</nav>

In the above, the first item contains a link within a link (invalid HTML), while the second item is nonsensical as a TOC entry.
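The intended behavior can be sketched with the standard library alone. This is an illustration only, not the code in this PR and not bluemonday's API: the `sanitizeTitle` function, the `tagRe` pattern, and the `inlineAllowed` allowlist are all hypothetical names, and a regular expression is a deliberately simplified stand-in for a real HTML sanitizer. It assumes the input is the already-rendered heading HTML.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// tagRe matches an opening or closing HTML tag and captures the element name.
// Hypothetical helper: a regex is a simplification, not a full HTML parser.
var tagRe = regexp.MustCompile(`</?([a-zA-Z][a-zA-Z0-9]*)[^>]*>`)

// inlineAllowed is an illustrative allowlist of inline elements.
// The anchor element "a" is intentionally absent, as are block elements.
var inlineAllowed = map[string]bool{
	"abbr": true, "b": true, "code": true, "del": true, "em": true,
	"i": true, "kbd": true, "mark": true, "s": true, "strong": true,
	"sub": true, "sup": true,
}

// sanitizeTitle drops every tag whose element is not in inlineAllowed while
// keeping the text inside it. Allowed tags are re-emitted without attributes.
func sanitizeTitle(title string) string {
	return tagRe.ReplaceAllStringFunc(title, func(tag string) string {
		name := strings.ToLower(tagRe.FindStringSubmatch(tag)[1])
		if !inlineAllowed[name] {
			return "" // remove the tag, keep the surrounding text
		}
		if strings.HasPrefix(tag, "</") {
			return "</" + name + ">"
		}
		return "<" + name + ">"
	})
}

func main() {
	fmt.Println(sanitizeTitle(`Some <em>emphasis</em> and a <a href="foo">link</a>`))
	// Some <em>emphasis</em> and a link
	fmt.Println(sanitizeTitle(`A <div>div</div> and a <p>paragraph</p>`))
	// A div and a paragraph
}
```

The two calls in `main` mirror the two headings above: the emphasis survives, while the anchor, `div`, and `p` tags are removed and their text is kept.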

This adds another dependency, bluemonday. Note that Filippo Valsorda is now one of its maintainers.

Performance difference:

  • Heading titles without HTML: none
  • Heading titles with HTML: ~15% slower

Given that most (95%?) headings in the wild do not include HTML, the performance change is, in my view, acceptable.

@bep
Member

bep commented Aug 12, 2025

> Given that most (95%?) headings in the wild do not include HTML, the performance change is, in my view, acceptable.

I'm not worried about performance for this, but ... How many Markdown headings in the wild contain links? As in: Does this solve a real issue?

@bep bep merged commit 5fdcc09 into gohugoio:master Aug 13, 2025
6 checks passed
@jmooring jmooring deleted the sanitize-toc-heading-title-13401 branch August 13, 2025 14:14