Preserve documentation deep links across layout changes #134
Description
(Tooling request extracted from python/cpython#126053 and python/cpython#126052)
One of the barriers to making significant structural changes to the CPython docs is that we're likely to break deep links by doing so. For example, if https://docs.python.org/3/library/stdtypes.html were to be split up into per-category or per-type pages, any links to specific sections like String methods would break, either returning a 404 or landing at the top of the page instead of at the desired section. We don't want to do that.
At the same time, not being able to refactor these pages poses significant problems for documentation readability (in the two linked examples, one page contains more than 18,000 words and the other more than 25,000).
In python/cpython#126053 (comment), we identified a potential technical mitigation that would allow moving link targets between pages, or making other changes (like updating section headings), without necessarily breaking deep links to those anchors:
- define a way to essentially do an "anchor diff" between two versions of a set of docs to find anchors and pages which used to exist but will no longer resolve (for example, define https://docs.python.org/dev/ as the reference docs for `main`, and compare each new build to those. It might be sufficient to use the existing intersphinx inventory as the basis for comparison).
- define a way to map removed anchors on affected pages to new targets (targets should be Sphinx semantic references). This may be a new Sphinx extension with a custom directive like `.. anchormap::`, or it may be something else.
- when a page has an anchor map defined, inject the client side JS to intercept stale links and generate the relevant JS redirect request (if the page has no anchor map, there's no need to inject that JS snippet).
- add a docs CI check that fails if anchors are removed relative to the baseline docs without an anchor map entry being defined
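
As a rough illustration of the "anchor diff" and CI check ideas above, here is a minimal sketch that assumes the intersphinx inventory (`objects.inv`, version 2) really is enough to compare builds. The function names and the shape of `anchor_map` (a plain dict from old relative URI to its new target) are hypothetical, since the actual anchor map format is still undecided.

```python
import re
import zlib

# Line layout of a version 2 inventory entry:
# name, domain:role, priority, uri, display name.
_ENTRY = re.compile(r"(.+?)\s+(\S+)\s+(-?\d+)\s+?(\S*)\s+(.*)")


def parse_inventory(raw):
    """Map (role, name) to the relative URI stored in an objects.inv payload."""
    # Four plain-text header lines, then a zlib-compressed body.
    parts = raw.split(b"\n", 4)
    if len(parts) < 5 or not parts[0].startswith(b"# Sphinx inventory version 2"):
        raise ValueError("expected a version 2 Sphinx inventory")
    targets = {}
    for line in zlib.decompress(parts[4]).decode("utf-8").splitlines():
        match = _ENTRY.match(line.rstrip())
        if match is None:
            continue  # skip blank or malformed lines
        name, role, _priority, uri, _display = match.groups()
        if uri.endswith("$"):
            # A trailing "$" is shorthand for "the entry's own name".
            uri = uri[:-1] + name
        targets[(role, name)] = uri
    return targets


def removed_targets(baseline, candidate):
    """Return baseline entries whose URI no longer exists in the candidate build."""
    live = set(candidate.values())
    return {key: uri for key, uri in baseline.items() if uri not in live}


def unmapped_removals(baseline, candidate, anchor_map):
    """List removed URIs that have no entry in the (hypothetical) anchor map."""
    removed = removed_targets(baseline, candidate)
    return sorted({uri for uri in removed.values() if uri not in anchor_map})
```

A CI job could parse the `objects.inv` from the reference docs and from the new build, call `unmapped_removals`, and fail if the returned list is non-empty. Note that this only sees targets that Sphinx records in the inventory, so anchors that never appear there would need a different source of truth.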
We may also want to provide guidance on implementing full page redirects (along the lines of https://github.com/pypa/packaging.python.org/blob/main/source/guides/single-sourcing-package-version.rst?plain=1), as the tooling would need to be aware of those to avoid having them show up as broken deep links.
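
One possible way for the tooling to account for full page redirects is sketched below. It assumes the redirects are available as a plain dict from old page path to new page path (`page_redirects` is a hypothetical name), and that the fragment is carried over unchanged by the redirect, which would need to be confirmed for whatever redirect mechanism is actually used.

```python
def split_by_redirect(removed_uris, page_redirects, live_uris):
    """Separate removed URIs into those still broken and those a redirect resolves.

    removed_uris: relative URIs missing from the new build.
    page_redirects: hypothetical mapping of old page path -> new page path.
    live_uris: set of relative URIs that exist in the new build.
    """
    still_broken = []
    resolved = {}
    for uri in removed_uris:
        page, sep, anchor = uri.partition("#")
        new_page = page_redirects.get(page)
        # Assumption: the redirect keeps the "#anchor" part as-is.
        new_uri = None if new_page is None else new_page + sep + anchor
        if new_uri is not None and new_uri in live_uris:
            resolved[uri] = new_uri
        else:
            # No redirect, or the anchor does not exist on the new page.
            still_broken.append(uri)
    return still_broken, resolved
```

Only the `still_broken` entries would then need an anchor map entry. A redirected page whose anchors were also renamed still shows up as broken here, which seems like the desired behaviour.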