PDFs
What are PDFs?
PDFs, or Portable Document Format files, are designed to preserve the layout, formatting, and content of a document across different devices and platforms. They are commonly used because they ensure consistent presentation of content such as text, images, links, and forms.
Content creators typically generate PDFs in one of two ways. First, they can export a document created in software such as Microsoft Word, PowerPoint, or Excel to PDF. Second, they can scan a physical document into an image-based PDF. While both types of PDFs can be inaccessible, scanned PDFs are especially problematic: because they are image-based files, screen readers and other assistive technology cannot read them.
Benefits of Fixing Inaccessible PDFs
Accessible PDFs benefit all users from a Universal Design (UD) perspective. PDFs that have been made accessible are easier to navigate, are searchable, and can be accessed by text-to-speech (TTS) or screen reader software.
Accessible PDFs can also reduce the need for students to request accommodations. Students who rely on text-to-speech (TTS) or screen reading software often use assistive technologies such as NVDA or JAWS to listen to their course readings. If you’ve fixed your PDFs, students will be less likely to request alternative format accommodations from Disability Services for Students (DSS).
Exported PDFs
Creating an accessible document is always easiest when working in the original software. Making changes to a PDF is challenging and often requires additional training. Whenever possible, return to the source document to fix any issues before converting it to a PDF.
Follow the accessibility action steps for each content type (Microsoft Word, Excel, and PowerPoint) before converting these documents to PDFs.
Enable Accessibility Tags When Saving
There are several ways to convert Microsoft Office documents to PDF, but not all of them preserve the accessibility of the original document. Moreover, not all file types support accessible conversions. At this time, PowerPoint documents cannot be converted to accessible, tagged PDFs. If you choose to export PowerPoint documents to PDFs, they must be fully remediated using Acrobat Pro DC.
To preserve the accessibility of the original document, follow the instructions outlined in the tutorials below.
- Preserving Accessibility When Exporting Documents to PDFs (Windows Users)
- Preserving Accessibility When Exporting Documents to PDFs (Mac Users)
Scanned-In PDFs
When content creators scan a printed page into a PDF, the resulting file contains no recognizable text—only an image of the text. You can identify an image-only PDF if the text cannot be selected or searched within the document. For screen readers, these PDFs are treated as images rather than text. While a single image can be made accessible with a brief alt-text description, this approach is not feasible for an entire scanned PDF, regardless of its length.
Scanned PDF Example
Introduction to Scanned in PDFs
Fixing Scanned-In PDFs
Unlike PDFs exported from an original document, scanned PDFs require much more remediation to become accessible. The first step is to convert the image to text using optical character recognition (OCR) software, such as Adobe Acrobat Pro. OCR takes the flat image of a scanned-in PDF and makes its text both searchable and editable. You will know that a PDF has been OCR'd if you can select its text; for example, you can copy and paste information from the PDF. While OCR makes the text readable by screen readers, it does not fix the other problems of a scanned-in PDF, such as incorrect or missing tag structure and missing headings.
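The selectable-text test above can be approximated in a script once each page's text has been extracted. The sketch below is a minimal illustration under stated assumptions: some PDF library of your choice has already produced one string of text per page, and the function names `find_image_only_pages` and `needs_ocr` are hypothetical, not part of Acrobat or Blackboard.

```python
def find_image_only_pages(page_texts):
    """Return 1-based page numbers whose extracted text is empty or blank.

    `page_texts` is assumed to hold one string per page, already extracted
    by a PDF library of your choice (the extraction step is not shown).
    """
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if not (text or "").strip()
    ]


def needs_ocr(page_texts):
    """A document likely needs OCR if any page yields no selectable text."""
    return bool(find_image_only_pages(page_texts))


# Illustrative data: page 2 is a scan with no text layer.
pages = ["Chapter 1: Cell Biology", "", "Review questions"]
print(find_image_only_pages(pages))  # [2]
print(needs_ocr(pages))              # True
```

A page with text can still have poor OCR quality or missing tags, so a check like this only flags pages that certainly need OCR.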
OCR Considerations
Here are some things to consider when fixing image-only PDF documents.
- Converting a PDF using OCR requires specialized software, such as Adobe Acrobat Pro. These programs can be costly, so you may not have access to them.
- In addition to money, fixing scanned PDFs takes considerable time. OCR might fix some issues, but scanned PDFs will likely require further remediation, which takes both time and training.
- To avoid that investment of time and money, it may be better (and easier) to reconsider using a scanned document. Look for alternatives instead. Is there a digital version of the text available? Is the text outdated and replaceable by a more current source?
- Consider using a permalink to the UND Library System, which can provide multiple formats of the document.
OCR in Blackboard
To create an OCR'd PDF without paying for a pricey PDF editor such as Adobe Acrobat Pro, use Blackboard's Alternative Formats. Check out our tutorial on how to OCR Documents Using Blackboard.
Adobe Acrobat Pro
For guidance on how to work with PDFs in Adobe Acrobat Pro, review these steps:
All PDFs should have the title and language set in Document Properties. The title should be concise and meaningful, reflecting the document's purpose. The title can be used to quickly identify the intent of a document without opening it. Meanwhile, the document language indicates the spoken language in which the text is written. Correctly set language properties allow screen readers and assistive technologies to accurately read and interpret the content.
Add Metadata
A document's title and language are part of its metadata. Metadata is the descriptive information embedded within a PDF file that helps identify and organize the document. This data is not visible on the page; rather, it is contained within the file's code. Additional metadata includes the author's name, a subject description, and keywords. This metadata helps users and systems quickly locate the document in a library, database, or search engine by matching the keywords to search queries. In Adobe Acrobat, this additional data can be entered in the same dialogue box as the document title.
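The metadata guidance above can be expressed as a simple checklist. The sketch below assumes the metadata has already been read into a plain dictionary; the key names (`title`, `language`, `author`, `subject`, `keywords`) and the function `check_metadata` are hypothetical choices for illustration, not Acrobat's own field names.

```python
# Title and language should always be set; the rest are the additional
# metadata fields described above.
REQUIRED_FIELDS = ("title", "language")
RECOMMENDED_FIELDS = ("author", "subject", "keywords")


def _is_blank(value):
    """Treat missing values and whitespace-only strings as blank."""
    return value is None or not str(value).strip()


def check_metadata(metadata):
    """Return the required and recommended fields that are missing or blank."""
    return {
        "required": [f for f in REQUIRED_FIELDS if _is_blank(metadata.get(f))],
        "recommended": [f for f in RECOMMENDED_FIELDS if _is_blank(metadata.get(f))],
    }


# Illustrative data only: the language was left empty.
report = check_metadata({"title": "Course Syllabus", "language": "", "author": "J. Doe"})
print(report)  # {'required': ['language'], 'recommended': ['subject', 'keywords']}
```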
Some PDF security settings may prevent screen reader users from accessing accessibility features. Check your security permissions in Document Properties to ensure that “Protected View” and “Enhanced Security” are disabled. For most documents, the "No Security" option is preferred.
Tagging PDFs is crucial for document accessibility. Document tags provide an underlying formatting structure that defines the document's layout and enables screen readers to navigate the content more effectively. These tags make it possible for screen readers to identify elements such as headings, lists, tables, etc.
Inspect the Tag Tree
Accessible documents that have been correctly exported as a PDF will contain an existing tag tree. The structure created by applying styles in Microsoft Word, PowerPoint, and InDesign translates to tags in Adobe Acrobat. While the tag tree should reflect the content's structure, always inspect it to ensure proper tagging.
To view the tag tree, navigate to the hamburger menu (Windows) or the View menu (Mac). Hover over the Show/Hide option; a new menu should appear. Hover over the Side Panels option; a list of tools should appear. Select Tags. This will display the document's tags in order from top to bottom. Review all tags to ensure proper semantic structure and reading order.
Delete Empty Tags
Keep tag trees clean by deleting empty tags. Screen readers will read all tags, even those without content. This can cause unnecessary confusion in navigation.
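To make the idea concrete, the sketch below prunes empty tags from a simplified, in-memory model of a tag tree. The `Tag` class and `prune_empty` function are hypothetical stand-ins, not Acrobat's data model; in Acrobat itself, empty tags are deleted by hand in the Tags panel. The model also assumes that any content a tag carries, including an image, is represented by a non-empty `text` value.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Tag:
    """Hypothetical stand-in for one node in a PDF tag tree."""
    name: str                      # e.g. "P", "H1", "Sect"
    text: str = ""                 # content attached to the tag, if any
    children: List["Tag"] = field(default_factory=list)


def prune_empty(tag: Tag) -> Tag:
    """Drop every descendant that has no content and no surviving children."""
    kept = []
    for child in tag.children:
        # Prune bottom-up so a container holding only empty tags goes too.
        prune_empty(child)
        if child.text.strip() or child.children:
            kept.append(child)
    tag.children = kept
    return tag


doc = Tag("Document", children=[
    Tag("H1", "Course Syllabus"),
    Tag("P", ""),                               # empty paragraph
    Tag("Sect", children=[Tag("P", "   ")]),    # section holding only blank text
    Tag("P", "Welcome to the course."),
])
prune_empty(doc)
print([child.name for child in doc.children])  # ['H1', 'P']
```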
Common PDF Tags
Container / Group Tags
Container tags help group other tags. These tags are not required, but they can be useful in organizing page structure and improving document navigation. All container and group tags exist under the <Document> tag.
Tag | Name | Purpose |
---|---|---|
<Document> | Document | Main document tag under which all tags are nested |
<Part> | Part | Divides the document into major sections (e.g., chapter or report) |
<Sect> | Section | Divides parts of a document into groups |
Text Tags
Text tags designate text elements utilized in the body of the document. They define and structure document elements, creating hierarchy and improving content readability and navigation.
Tag | Name | Purpose |
---|---|---|
<H1> | Heading 1 | A document's title |
<H2> | Heading 2 | Main level heading |
<H3> - <H6> | Heading 3 - Heading 6 | Subheadings |
<P> | Paragraph | Body text |
<BlockQuote> | Quote | Quote contained in its own paragraph |
<L> | List | Tag under which all list items are nested |
<LI> | List Item | Contains list item elements <Lbl> and <LBody> |
<Lbl> | Label | The number or bullet character associated with a list item |
<LBody> | List Item Body | Text associated with a list item |
<Link> | Hyperlink | Link to a webpage or document |
<OBJR> | Object Reference | Nested under a <Link> tag; it is the active URL link |
<TOC> | Table of Contents | Tag under which all Table of Contents entries are nested |
<TOCI> | TOC Item | Entry within a table of contents; it houses the <Reference> and <Link> tags |
<Reference> | Reference | Internal link (e.g., footnote or TOC) |
Table Tags
Table tags are structural elements specific to creating tables. While tables exist in the body of a document and contain text elements, they are distinct from the body text. Table tags are used to define table structure, creating the grid-like layout we see here.
Tag | Name | Purpose |
---|---|---|
<Table> | Table | Tag under which all table tags are nested |
<TR> | Table Row | Groups items in a table row |
<TH> | Table Header | Heading cells within a row |
<TD> | Table Data | Data cells within a row |
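The nesting relationships listed in the tag tables above (list items under a list, cells under a row, and so on) can be checked mechanically. The sketch below is a simplified illustration: the tag tree is modelled as nested `(name, children)` tuples, the `EXPECTED_PARENT` map is drawn only from the tables in this guide, and `find_nesting_problems` is a hypothetical helper. The full PDF specification permits more arrangements than this (for example, rows grouped under `<THead>` or `<TBody>`), so treat the rules as a starting point.

```python
# Child tag -> the parent the tables above say it should sit under.
EXPECTED_PARENT = {
    "LI": "L",
    "Lbl": "LI",
    "LBody": "LI",
    "OBJR": "Link",
    "TOCI": "TOC",
    "TR": "Table",
    "TH": "TR",
    "TD": "TR",
}


def find_nesting_problems(tag, parent_name=None, path=""):
    """Return a message for every tag whose parent differs from the map."""
    name, children = tag
    here = f"{path}/{name}"
    problems = []
    expected = EXPECTED_PARENT.get(name)
    if expected is not None and parent_name != expected:
        problems.append(f"{here}: expected parent <{expected}>, found <{parent_name}>")
    for child in children:
        problems.extend(find_nesting_problems(child, name, here))
    return problems


tree = ("Document", [
    ("L", [("LI", [("Lbl", []), ("LBody", [])])]),
    ("Table", [("TD", [])]),          # data cell placed directly under the table
])
for message in find_nesting_problems(tree):
    print(message)
# /Document/Table/TD: expected parent <TR>, found <Table>
```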
Figure Tags
Figure tags contain all image-related content. Therefore, all figure tags should have descriptive alternative text embedded into the tag. Images tagged as <Artifact> will have no place to input alternative text.
Tag | Name | Purpose |
---|---|---|
<Figure> | Figure | Photo or graphic (e.g., logo, illustration, photo, map, chart, etc.) |
<Formula> | Formula | Mathematical formula |
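A check for missing alternative text could look like the sketch below. It walks a simplified, hypothetical model of the tag tree (the `Tag` class and `missing_alt_text` function are illustrative, not Acrobat's API) and reports every figure tag without alt text. Including `<Formula>` alongside `<Figure>` is an assumption based on the table above grouping both as figure tags.

```python
from dataclasses import dataclass, field
from typing import List

# Assumption: both tags in the table above should carry alternative text.
NEEDS_ALT_TEXT = {"Figure", "Formula"}


@dataclass
class Tag:
    """Hypothetical stand-in for one node in a PDF tag tree."""
    name: str
    alt: str = ""                  # alternative text stored on the tag
    children: List["Tag"] = field(default_factory=list)


def missing_alt_text(tag: Tag, path: str = "") -> List[str]:
    """Return the tree path of every figure tag with blank alt text."""
    here = f"{path}/{tag.name}"
    found = [here] if tag.name in NEEDS_ALT_TEXT and not tag.alt.strip() else []
    for child in tag.children:
        found.extend(missing_alt_text(child, here))
    return found


doc = Tag("Document", children=[
    Tag("Figure", alt="Campus map with the library highlighted"),
    Tag("Sect", children=[Tag("Figure"), Tag("P")]),
])
print(missing_alt_text(doc))  # ['/Document/Sect/Figure']
```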
Reading order and tag structure are closely related. A document's reading order is the sequence in which content tags are read. A correct tag tree ensures a logical reading order, which is crucial for screen readers and assistive technology. Ensuring the correct reading order prevents potential confusion, as the visual layout may not always match the intended reading order. Complex layouts, particularly those with tables or multi-column designs, can disrupt the intended reading order. For more information, see Adobe Support's documentation on Reading Order.
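One way to picture this is as a top-to-bottom walk of the tag tree: each tag is read, then its children, in the order they appear. The sketch below uses a simplified, hypothetical `Tag` model (not Acrobat's API) to show how a sidebar tagged ahead of the main heading would be read first, even if it sits off to the side on the page.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Tag:
    """Hypothetical stand-in for one node in a PDF tag tree."""
    name: str
    text: str = ""
    children: List["Tag"] = field(default_factory=list)


def reading_order(tag: Tag) -> Iterator[Tuple[str, str]]:
    """Yield (tag name, text) pairs in depth-first, top-to-bottom order."""
    if tag.text:
        yield (tag.name, tag.text)
    for child in tag.children:
        yield from reading_order(child)


# The sidebar note sits first in the tree, so it is read before the heading.
doc = Tag("Document", children=[
    Tag("Sect", children=[Tag("P", "Sidebar: office hours")]),
    Tag("H1", "Course Syllabus"),
    Tag("P", "Welcome to the course."),
])
for name, text in reading_order(doc):
    print(f"<{name}> {text}")
# <P> Sidebar: office hours
# <H1> Course Syllabus
# <P> Welcome to the course.
```

Moving the `<Sect>` below the paragraph in the tree would change the reading order without touching the visual layout.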
Run the Accessibility Checker to test for accessibility issues. This checker verifies if your document conforms to prevailing accessibility standards, such as PDF/UA and WCAG 2.0. It will prompt you to fix any issues it finds. Running the Accessibility Checker should be both the first and last thing you do in a PDF. For more information, see Adobe Support's documentation on Verifying PDF Accessibility.
Tagging Documents in Adobe Acrobat (Tutorial)
Adobe Acrobat allows you to customize the side panel menu to view your frequently used tools. To add the Tags window to your side panel, navigate to the View menu.
Select (Windows) or hover over (Mac) the Show/Hide option. A sub-menu should appear. Select or hover over Side Panels. A second sub-menu should appear. Select Accessibility Tags.
The Tags window should now show in your side panel menu on the right side of the screen. Tags will be displayed in order from top to bottom. Review all tags to ensure proper semantic structure and reading order.
Open the Tags window in the Side Panel menu. Navigate to the three dots in the corner of your Tags window, indicating menu options. Select Reading Order.
The Reading Order tool dialogue box should open. In this box, you can designate content tags. Correctly designating tags creates a logical reading order.
To change a tag, click the content box you would like to change. Then, select the new tag.
Drag and Drop
Open the Tags window in the Side Panel menu. Select the tag you would like to move. Drag and drop the tag. As you drag the tag, a black line will appear. This line indicates places you may drop the tag.
Cut and Paste
Open the Tags window in the Side Panel menu. Select the tag you would like to move. Navigate to the three dots in the corner of your Tags window, indicating menu options. Select Cut.
Next, select the tag that you want the cut tag to sit above. Navigate to the three dots in the corner of your Tags window, indicating menu options. Select Paste. The cut tag will now be pasted above the tag you selected.
Open the Tags window in the Side Panel menu. Select the tag you would like to delete. Navigate to the three dots in the corner of your Tags window, indicating menu options. Select Delete Tag.
To manually tag a document, navigate to the Tags window in the Side Panel menu. Click on the three dots in the corner of your Tags window, indicating menu options. Select Create Tags Root.
Navigate back to the three dots in the corner of your Tags window. Select Reading Order.
The Reading Order tool dialogue box should open. In this box, you can designate content tags. Use your mouse to drag a box around the content you want to tag. Pink lines should demarcate which text has been selected. Once the text is selected, choose which tag you would like to apply.
Once a tag has been selected, the content will be numbered, and it will show up in the Tags pane.
Open the Tags window in the Side Panel menu. Click on the three dots, indicating menu options. Select Autotag document. This option only appears if there are no document tags available.
While autotagging seems like the simple option, it can mislabel content or produce an illogical reading order. Review all tags to ensure proper semantic structure and reading order.
Resources
- Watch TTaDA's Introduction to PDFs.
- Use the Accessibility Checklist for PDF Documents to gauge how compliant your course resources are with Section 508 of the Rehabilitation Act, Title II regulations under the Americans with Disabilities Act (ADA), and WCAG 2.1 AA guidelines.
- Get step-by-step instructions and best practices for making your PDF documents accessible with Adobe Acrobat's help page on Accessibility Features in PDFs.
- For more help, see Section 508's mini-series How to Test and Remediate PDFs for Accessibility Using Adobe Acrobat. This series explains and demonstrates steps you can take to ensure your PDF document is accessible.