Crosswalk for Accessibility Remediation - Aligning HTML with Word Styles and Adobe PDF Tags
This article provides a practical crosswalk plus decision rules to help people translate between document structure in Microsoft Word, semantic HTML, and PDF tag structure. It is designed for real-world remediation scenarios where you may need to rebuild a PDF in Word, fix nonsensical PDF tags, or convert structured HTML into an accessible Word document.
Core principle
Accessible structure is defined by meaning and hierarchy, not by typography.
- Headings are defined by the document outline, not font size.
- Lists are defined by list structure, not indentation.
- Tables are defined by cell relationships, not alignment.
- Repeating elements (headers/footers/page numbers) are usually artifacts, not content.
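As a rough illustration of the first point, the outline of an HTML document can be read from its heading elements alone, with all styling ignored. The following is a minimal Python sketch using the standard library's `html.parser`; the class name `HeadingOutline` and the sample markup are hypothetical and chosen only for illustration.

```python
from html.parser import HTMLParser


class HeadingOutline(HTMLParser):
    """Collect (level, text) pairs for h1-h6 elements, ignoring all styling."""

    def __init__(self):
        super().__init__()
        self.outline = []    # (level, text) tuples in document order
        self._level = None   # level of the heading currently open, or None
        self._text = []      # text fragments of the open heading

    def handle_starttag(self, tag, attrs):
        # Only h1..h6 count; font size or bold on other elements is ignored.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.outline.append((self._level, "".join(self._text).strip()))
            self._level = None


# Hypothetical sample: the large bold paragraph looks like a heading but is not one.
sample = (
    "<h1>Course Syllabus</h1>"
    '<p style="font-size:24px"><b>Looks like a heading</b></p>'
    "<h2>Grading</h2><p>Details.</p>"
    "<h3>Late work</h3>"
)

parser = HeadingOutline()
parser.feed(sample)
outline = parser.outline
for level, text in outline:
    print("  " * (level - 1) + f"H{level}: {text}")
```

The visually prominent paragraph does not appear in the outline, which is what a screen reader user navigating by headings would experience.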
Crosswalk table
| Structural intent | Microsoft Word (accessible structure) | HTML | PDF tag |
|---|---|---|---|
| Document container | N/A | <html><body> | <Document> |
| Document metadata title | File > Info > Properties > Title | <title> (in <head>) | PDF Metadata Title (not a tag) |
| Visible document title (top of content hierarchy) | Heading 1 (use once) | <h1> (Canvas uses H1 for pages) | <H1> |
| Major section heading | Heading 2 | <h2> (First level for Canvas Rich Content Editor) | <H2> |
| Subsection heading | Heading 3 | <h3> | <H3> |
| Deeper heading levels | Heading 4–6 | <h4>–<h6> | <H4>–<H6> |
| Body paragraph | Normal | <p> | <P> |
| Bulleted list container | Built-in bulleted list | <ul> | <L> |
| Numbered list container | Built-in numbered list | <ol> | <L> |
| List item | (created automatically by list tool) | <li> | <LI> with <Lbl> + <LBody> |
| Table container | Insert Table | <table> | <Table> |
| Table row | Word row | <tr> | <TR> |
| Header cell (for columns) | Enable "Header row" on a table | <th scope="col"> | <TH> |
| Header cell (for rows) | Enable "First column" along with "header row" (Word can't make a header column alone) | <th scope="row"> | <TH> |
| Data cell | Regular cell | <td> | <TD> |
| Figure (image/chart/diagram) | Insert Picture (meaningful) | <img> / <figure> | <Figure> |
| Caption | Caption style | <figcaption> | <Caption> |
| Block quote | Quote style | <blockquote> | <BlockQuote> |
| Footnote/endnote content | Insert Footnote/Endnote | (varies) | <Note> |
| Hyperlink | Insert Hyperlink | <a> | <Link> (contains <OBJR>) |
| Decorative element | Mark decorative / avoid meaning | CSS decorative | <Artifact> |
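For scripted remediation, part of the crosswalk can be held as a lookup table. The sketch below is one possible encoding in Python, covering only a subset of the rows above; the names `CROSSWALK`, `pdf_tag_for`, and `word_structure_for` are hypothetical, and the combined `th` entry is a simplification of the two header-cell rows.

```python
# Subset of the crosswalk table, keyed by HTML element name.
# Each value is (Word structure, PDF tag) as listed in the table.
CROSSWALK = {
    "h1": ("Heading 1", "H1"),
    "h2": ("Heading 2", "H2"),
    "h3": ("Heading 3", "H3"),
    "p": ("Normal", "P"),
    "ul": ("Built-in bulleted list", "L"),
    "ol": ("Built-in numbered list", "L"),
    "li": ("(created automatically by list tool)", "LI"),
    "table": ("Insert Table", "Table"),
    "tr": ("Word row", "TR"),
    # Simplification: column and row headers share one entry here.
    "th": ('"Header row" / "First column" table options', "TH"),
    "td": ("Regular cell", "TD"),
    "figure": ("Insert Picture (meaningful)", "Figure"),
    "figcaption": ("Caption style", "Caption"),
    "blockquote": ("Quote style", "BlockQuote"),
    "a": ("Insert Hyperlink", "Link"),
}


def pdf_tag_for(html_tag):
    """Return the PDF tag for an HTML element name, or None if not covered."""
    entry = CROSSWALK.get(html_tag.lower())
    return entry[1] if entry else None


def word_structure_for(html_tag):
    """Return the Word structure for an HTML element name, or None if not covered."""
    entry = CROSSWALK.get(html_tag.lower())
    return entry[0] if entry else None


print(pdf_tag_for("ul"), pdf_tag_for("ol"))   # both list types map to the same PDF tag
print(word_structure_for("blockquote"))
```

Returning `None` for unmapped elements, rather than guessing, keeps the ambiguous cases (such as footnotes, where the HTML side varies) in front of a human reviewer.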
Decision Rules (Remediating Aesthetics Lacking Structure)
Use these rules to classify content reliably:
- Headings must do real work: A heading should introduce a section with content beneath it. If it doesn’t, it may be a caption or decorative text.
- Repeating elements are not headings: If text repeats on many pages, it is likely header/footer content or decoration and should be handled accordingly (often as an Artifact in PDF).
- Lists must be lists: If it walks like a list (bullets/numbering/indentation), it must be represented structurally as a list (Word lists / PDF <L> and children).
- Tables must be tables: If information is arranged in rows/columns, it must be represented structurally as a table (Word table / PDF <Table> tree).
- Reading order is the ground truth: If the reading order is wrong, tags are wrong for users, regardless of how “correct” they look in the tag tree.
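The repeating-elements rule above lends itself to a simple first-pass check. The following Python sketch assumes the page text has already been extracted into one list of lines per page (the extraction step is out of scope here); the function name `repeated_lines` and the 0.6 threshold are illustrative assumptions, not established values.

```python
from collections import Counter


def repeated_lines(pages, min_fraction=0.6):
    """Return text lines that appear on at least min_fraction of the pages.

    pages: list of pages, each a list of text lines (assumed input shape).
    min_fraction: assumed threshold; tune it per document.
    """
    counts = Counter()
    for lines in pages:
        # Count each distinct non-empty line once per page.
        counts.update({line.strip() for line in lines if line.strip()})
    threshold = min_fraction * len(pages)
    return sorted(line for line, n in counts.items() if n >= threshold)


# Hypothetical three-page document with a running header.
pages = [
    ["Biology 101 Syllabus", "Course Overview", "This course covers..."],
    ["Biology 101 Syllabus", "Grading", "Grades are based on..."],
    ["Biology 101 Syllabus", "Schedule", "Week 1: ..."],
]
candidates = repeated_lines(pages)
print(candidates)
```

Lines flagged this way are only candidates for artifact treatment. A person still needs to confirm that each one is a running header or footer and not, for example, a heading that legitimately recurs.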
