Add automated documentation link validation #4949

Open
Nokhrin wants to merge 1 commit into antlr:dev from Nokhrin:experiment/docs-validation-20260614

Nokhrin commented Jun 14, 2026

Summary

This PR proposes adding automated documentation link validation to the ANTLR4 repository using docs-validator, a Python-based static analyzer that detects broken links, orphan files, missing anchors, and circular dependencies in documentation repositories.

Compliance with ANTLR contribution guidelines

Branch: This PR is from experiment/docs-validation-20260614, derived from dev
DCO: All commits are signed with git commit -s (Developer Certificate of Origin)
Scope: Only adds .github/workflows/docs-validation.yml - no changes to source code or existing documentation

Experiment Results

I ran docs-validator against the ANTLR4 documentation to demonstrate its value. The validation completed in ~35 seconds and analyzed 63 documentation files containing 294 links (95 internal, 196 external).

Metrics

| Metric | Value |
|---|---|
| Files scanned | 63 |
| Total links | 294 |
| Internal links | 95 |
| External links | 196 |
| Total issues found | 74 |
| Broken internal links | 1 |
| Broken external links | 25 |
| Orphan files | 48 |
| Links requiring manual verification | 38 |

Real Issues Discovered

1. Broken Internal Links (1 error)

  • doc/ace-javascript-target.md:137 → resources/worker-base.js (file not found)
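
For reviewers who want to see what this class of check involves, here is a minimal sketch of an internal link check. It is not the docs-validator implementation: the regex, the function name `find_broken_internal_links`, and the rule that targets resolve relative to the linking file are all assumptions made for illustration.

```python
import re
from pathlib import Path

# Matches inline Markdown links and images: [text](target) or ![alt](target).
# Reference-style links and HTML <a> tags are not handled in this sketch.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def find_broken_internal_links(md_file: Path) -> list[tuple[int, str]]:
    """Return (line number, target) for relative links whose file is missing."""
    broken = []
    lines = md_file.read_text(encoding="utf-8").splitlines()
    for lineno, line in enumerate(lines, start=1):
        for target in LINK_RE.findall(line):
            # Skip external URLs, mail links, and same-page anchors.
            if "://" in target or target.startswith(("#", "mailto:")):
                continue
            # Drop any "#anchor" suffix; only file existence is checked here.
            path_part = target.split("#", 1)[0]
            # Assumption: relative targets resolve against the linking file's folder.
            if not (md_file.parent / path_part).exists():
                broken.append((lineno, target))
    return broken
```

Calling `find_broken_internal_links(Path("doc/ace-javascript-target.md"))` from a checkout would list any relative targets that do not exist on disk, in the same `file:line → target` spirit as the entry above.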

2. Broken External Links (25 errors)

  • README.md:13 → https://github.com/antlr/antlr4/actions/workflows/windows.yml/badge.svg?branch=dev (workflow doesn't exist)
  • doc/java-target.md:19 → https://marketplace.eclipse.org/content/antlr-ide (404)
  • doc/releasing-antlr.md:159,191,251 → Multiple oss.sonatype.org links (404)
  • doc/resources.md:12 → http://leonotepad.blogspot.com.br/... (404)
  • doc/swift-target.md:5 → Apple Swift Package Manager docs (anchor not found)
  • doc/target-agnostic-grammars.md:24 → grammars-v4 repository (anchor not found)

3. Orphan Files (48 warnings)
Files without incoming links, potentially undiscoverable:

  • ANTLR-HOUSE-RULES.md
  • doc/ace-javascript-target.md, doc/actions.md, doc/cpp-target.md, etc.
  • runtime/Cpp/cmake/Antlr4Package.md
  • runtime/Go/antlr/README.adoc
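
To make the "orphan" definition concrete, the sketch below flags every file that no other file links to. It is only an illustration under stated assumptions: the `links` mapping is presumed to be already resolved to repository-relative paths, and treating `README.md` as an entry point that is never reported is my choice here, not documented docs-validator behavior.

```python
def find_orphans(
    links: dict[str, set[str]],
    roots: frozenset[str] = frozenset({"README.md"}),
) -> list[str]:
    """Return files that receive no incoming link from any other file.

    `links` maps each documentation file to the set of files it links to.
    `roots` are entry points that are exempt from the check (an assumption).
    """
    linked: set[str] = set()
    for source, targets in links.items():
        # A file linking to itself does not make it discoverable.
        linked |= {t for t in targets if t != source}
    return sorted(f for f in links if f not in linked and f not in roots)
```

For example, with `README.md` linking to `doc/index.md` and nothing linking to `doc/cpp-target.md`, only `doc/cpp-target.md` would be reported.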

4. Manual Verification Required (38 warnings)
Resources blocked by WAF or rate-limited:

  • Multiple pragprog.com links (403)
  • npmjs.com links (403)
  • oss.sonatype.org links (403)
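
The split between hard errors (404) and manual verification (403) can be pictured as a small triage on the HTTP status code. This is a hypothetical sketch, not the tool's actual logic; in particular, including 429 (the standard "Too Many Requests" status) in the manual bucket is an assumption based on the "rate-limited" wording above.

```python
def classify_status(status: int) -> str:
    """Map an HTTP status code to 'ok', 'manual', or 'broken'.

    Assumption: 403 (often a WAF block) and 429 (rate limiting) do not prove
    the link is dead, so they are routed to manual verification.
    """
    if 200 <= status < 400:
        return "ok"
    if status in (403, 429):
        return "manual"
    return "broken"
```

Under this scheme a pragprog.com link returning 403 becomes a warning, while the marketplace.eclipse.org link returning 404 is counted as an error.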

Benefits for ANTLR4

  1. Prevent link rot: Automatically detect broken links before they accumulate (already found 26 broken links)
  2. Improve discoverability: Identify 48 orphan files that may need better navigation
  3. Maintain quality: Ensure all anchors and cross-references remain valid
  4. Low maintenance: Once configured, validation runs automatically on PRs
  5. Non-blocking: Warnings don't block merges, only errors do
  6. Fast: Validation completes in ~35 seconds

What This PR Adds

This PR adds a single file: .github/workflows/docs-validation.yml

The workflow:

  • Runs on pushes to dev and master branches
  • Runs on pull requests that modify .md, .markdown, .asc, or .adoc files
  • Can be triggered manually via workflow_dispatch
  • Generates a Markdown report as an artifact
  • Exits with code 1 if errors are found (configurable)
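
The exit-code behavior in the last bullet amounts to the policy sketched below. The `Issue` type and the `fail_on_error` switch are hypothetical names used for illustration; the actual option that makes this configurable in docs-validator may be named and shaped differently.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    severity: str  # "error" or "warning"
    message: str


def exit_code(issues: list[Issue], fail_on_error: bool = True) -> int:
    """Return 1 when at least one error is present, otherwise 0.

    Warnings (orphan files, links needing manual verification) never fail
    the run. `fail_on_error` is a hypothetical stand-in for the configurable
    part mentioned above.
    """
    has_errors = any(issue.severity == "error" for issue in issues)
    return 1 if fail_on_error and has_errors else 0
```

A run that reports only orphan-file warnings would therefore finish with exit code 0 and leave the PR check green.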

Configuration

The workflow uses default settings. If maintainers want to customize behavior, they can add a .docs-validator.toml file to the repository root with options like:

[validator]
exclude_patterns = [".git", "node_modules"]
hosts_to_ignore = ["npmjs.com", "pragprog.com"]
is_skip_external = false

I'm happy to adjust the workflow configuration, add a .docs-validator.toml file with project-specific exclusions, or make any other changes based on maintainer feedback.

Signed-off-by: Aleksandr Nokhrin <a.v.nokhrin@yandex.ru>
@Nokhrin Nokhrin changed the title ci: add GitHub Actions workflow for documentation validation Jun 14, 2026