
Create an accessible PDF from scratch

Using the Toolbox add-on, create fully accessible, tagged PDF documents from scratch. This guide walks you through the technical steps to create a valid PDF/UA document.

A properly structured PDF is crucial for users with disabilities who rely on assistive technologies. It also improves the experience for all users by enabling features like content reflow on mobile devices and by improving machine readability.

info

This functionality is part of the Toolbox add-on, a separate SDK that you can use with the same license key as the Pdftools SDK. To use and integrate this add-on, review Getting started with the Toolbox add-on and Toolbox add-on code samples.

Quick start

Download the full sample now in C#, Java, and Python.

For background on PDF accessibility concepts and the importance of logical structure, review A primer on PDF accessibility.

The process involves creating a document, defining its logical structure, and then adding content elements that are linked to that structure. Steps to create a tagged PDF document:

  1. Create the document and structure tree
  2. Add and tag content
  3. Set accessibility metadata

Before you begin

1. Create the document and structure tree

You always start with an empty output Document.

// Create a PDF document
using Stream outStream = new FileStream(outPath, FileMode.Create, FileAccess.ReadWrite);
using Document outDoc = Document.Create(outStream, null, null);

// Create a font
Font font = CreateFontWithFallbacks(outDoc, ARIAL_AND_FALLBACKS);

// Create a page
Size pageSize = new Size { Width = ToPoints(21, "cm"), Height = ToPoints(29.7, "cm") }; // DIN A4
Page outPage = Page.Create(outDoc, pageSize);

// Generate the page's content
CreateAndTagContent(outDoc, outPage, imagePath, font);

outDoc.Pages.Add(outPage);

Helper functions and constants

Helper functions like CreateFontWithFallbacks or ToPoints and constants like ARIAL_AND_FALLBACKS are defined in the full example.
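
To make the unit conversion concrete, here is a minimal sketch of what a helper like ToPoints might do. It is not the implementation from the full example: the name ToPointsSketch and the set of supported unit strings are assumptions made for illustration. The only facts it relies on are that PDF page sizes are expressed in points and that 1 inch equals 72 points or 2.54 cm.

```csharp
using System;

// Hypothetical stand-in for the ToPoints helper; the real one ships with the full example.
// Conversion basis: 1 in = 72 pt, 1 in = 2.54 cm = 25.4 mm.
static double ToPointsSketch(double value, string unit) => unit switch
{
    "pt" => value,
    "in" => value * 72.0,
    "cm" => value * 72.0 / 2.54,
    "mm" => value * 72.0 / 25.4,
    _ => throw new ArgumentException($"Unsupported unit: {unit}", nameof(unit))
};

// DIN A4 is 21 cm x 29.7 cm
double widthPt = ToPointsSketch(21, "cm");
double heightPt = ToPointsSketch(29.7, "cm");
Console.WriteLine($"{widthPt:F2} x {heightPt:F2} pt");
```

With these factors, an A4 page measures roughly 595.28 by 841.89 points.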

The first step is to establish the logical structure tree. This is essential because content can only be tagged if a structure node already exists for it to reference.

The structure tree begins with a Tree object, whose root is the DocumentNode. All other structure elements, like sections (Sect) and paragraphs (P), become children of this root.

private static void CreateAndTagContent(Document outputDoc, Page outPage, string imagePath, Font font)
{
    using (ContentGenerator gen = new ContentGenerator(outPage.Content, false))
    {
        // Create an empty logical structure tree and add a section to the root node (DocumentNode)
        Tree structTree = new Tree(outputDoc);
        Node docNode = structTree.DocumentNode;
        Node sectionNode = new Node("Sect", outputDoc, outPage);
        docNode.Children.Add(sectionNode);

        // Start from the top of the page with margin
        double currentY = outPage.Size.Height - MARGIN;

        // Create heading
        currentY = CreateAndTagText(
            outputDoc, outPage, gen, sectionNode, font, currentY,
            "H1", "This is a properly tagged heading", 24.0);

        // Add padding and create paragraph
        currentY -= PADDING;
        currentY = CreateAndTagText(
            outputDoc, outPage, gen, sectionNode, font, currentY,
            "P", "This is a properly tagged paragraph. Both heading and paragraph belong to a section.", 12.0);

        // Add padding and create image
        currentY -= PADDING;
        CreateAndTagImage(outputDoc, outPage, gen, imagePath, currentY);
    }
}

2. Add and tag content

With the structure tree in place, you can create page content and associate it with a specific structure node. This is done using a ContentGenerator and its TagAs method. The workflow is:

  1. Start tagging by calling gen.TagAs(node).
  2. Add the content (e.g., paint text or an image).
  3. Stop tagging with gen.StopTagging().
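
One way to keep the start and stop calls paired is a small wrapper built on try/finally. The sketch below is a hypothetical helper, not part of the SDK. It takes the three workflow steps as delegates so that it runs without the SDK; in real code the first and last delegates would wrap gen.TagAs(node) and gen.StopTagging(). Whether stopping the tag after a failed paint is the right behavior for your application is an assumption to verify.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the SDK: runs the three workflow steps in order
// and always issues the "stop" step, even if painting throws.
static void PaintTagged(Action startTagging, Action paint, Action stopTagging)
{
    startTagging();
    try
    {
        paint();
    }
    finally
    {
        stopTagging();
    }
}

// Stand-in delegates that only record the call order.
// In real code they would wrap gen.TagAs(node), the paint calls, and gen.StopTagging().
var calls = new List<string>();
PaintTagged(
    () => calls.Add("TagAs"),
    () => calls.Add("PaintText"),
    () => calls.Add("StopTagging"));

Console.WriteLine(string.Join(" -> ", calls));
```

The recorded order is TagAs, then PaintText, then StopTagging, which mirrors the three steps listed above.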

2.1. Tagging text

When tagging text elements like headings or paragraphs, set the ActualText property of the Node. This gives screen readers an explicit text equivalent.

private static double CreateAndTagText(
    Document outputDoc, Page outPage, ContentGenerator gen, Node sectionNode,
    Font font, double topY, string tagName, string textContent, double fontSize)
{
    // Create a node in the logical structure tree
    // that will contain the text and add it to the section
    Node textNode = new Node(tagName, outputDoc, outPage);
    textNode.ActualText = textContent;
    sectionNode.Children.Add(textNode);

    // Calculate text baseline position
    double baselineY = topY - fontSize * font.Ascent;

    // From here on, all painted graphics will be contained within that node
    gen.TagAs(textNode);

    // Paint text onto the page
    Text text = Text.Create(outputDoc);
    using (TextGenerator textGen = new TextGenerator(text, font, fontSize, null))
    {
        Point position = new Point { X = MARGIN, Y = baselineY };
        textGen.MoveTo(position);
        textGen.ShowLine(textNode.ActualText);
    }
    gen.PaintText(text);

    gen.StopTagging();

    // Return bottom coordinate (baseline - descent)
    return baselineY - fontSize * font.Descent;
}

2.2. Tagging an image

For non-text content, providing a text alternative is critical. For images (Figure nodes), you must set the AlternateText property.

private static double CreateAndTagImage(
    Document outputDoc, Page outPage, ContentGenerator gen,
    string imagePath, double topY)
{
    // If a structure tree has already been created for the given document,
    // the existing structure tree reference will be returned.
    Tree tree = new Tree(outputDoc);
    Node docNode = tree.DocumentNode;
    Node figureNode = new Node("Figure", outputDoc, outPage);
    figureNode.AlternateText = "PdfTools AG Logo";
    docNode.Children.Add(figureNode);

    // From here on, all painted graphics will be contained within that node
    gen.TagAs(figureNode);

    // Load an image
    Image image;
    using (Stream inImage = new FileStream(imagePath, FileMode.Open, FileAccess.Read))
    {
        image = Image.Create(outputDoc, inImage);
    }

    // Paint the image onto the page
    double x = MARGIN;
    double width = ToPoints(2.0, "cm");
    double height = width * image.Size.Height / image.Size.Width; // preserve aspect ratio
    Rectangle rect = new Rectangle
    {
        Left = x, // left
        Bottom = topY - height, // bottom (Rectangle coordinates: bottom is lower than top)
        Right = x + width, // right
        Top = topY // top
    };
    gen.PaintImage(image, rect);

    gen.StopTagging();

    // Return bottom coordinate
    return topY - height;
}
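
The rectangle arithmetic in the sample can be checked in isolation. The following sketch is a hypothetical helper (the name FitToWidth and the tuple return type are assumptions, chosen so the code runs without the SDK). It applies the same rule as the sample: scale the image to a target width, keep the aspect ratio, and place the top edge at a given y coordinate, with y growing upward.

```csharp
using System;

// Hypothetical helper: computes the target rectangle for an image scaled to a
// given width in points, with its top-left corner at (left, topY).
// Page coordinates grow upward, so Bottom is smaller than Top.
static (double Left, double Bottom, double Right, double Top) FitToWidth(
    double imageWidth, double imageHeight, double left, double topY, double targetWidth)
{
    if (imageWidth <= 0 || imageHeight <= 0)
        throw new ArgumentOutOfRangeException(nameof(imageWidth), "Image dimensions must be positive.");

    double targetHeight = targetWidth * imageHeight / imageWidth; // preserve aspect ratio
    return (left, topY - targetHeight, left + targetWidth, topY);
}

// A 400 x 200 pixel image placed 100 pt wide, with its top edge at y = 700
var rect = FitToWidth(400, 200, left: 50, topY: 700, targetWidth: 100);
Console.WriteLine($"Bottom = {rect.Bottom}, Right = {rect.Right}");
```

For this input the scaled height is 50 points, so the rectangle spans from Bottom = 650 to Top = 700 and from Left = 50 to Right = 150.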

When writing alternative text, follow these best practices:

  • Be descriptive but concise
  • Convey the purpose of the image, not just its appearance
  • Don’t start with “Image of…” or “Picture of…”
  • For complex diagrams, consider providing a longer description elsewhere
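
Some of these practices can be checked mechanically before the text is assigned to AlternateText. The sketch below is a hypothetical checker, not part of the SDK, and the 150-character limit is an arbitrary assumption. It only catches empty text, the discouraged prefixes, and overly long text; whether the text conveys the purpose of the image still needs human judgment.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical checker, not part of the SDK. The 150-character default is an
// arbitrary assumption; pick a threshold that suits your content.
static List<string> CheckAltText(string altText, int maxLength = 150)
{
    var problems = new List<string>();
    string text = (altText ?? "").Trim();

    if (text.Length == 0)
    {
        problems.Add("Alternative text is empty.");
        return problems;
    }

    foreach (string prefix in new[] { "image of", "picture of" })
    {
        if (text.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
            problems.Add($"Starts with \"{prefix}\".");
    }

    if (text.Length > maxLength)
        problems.Add($"Longer than {maxLength} characters; consider a separate long description.");

    return problems;
}

Console.WriteLine(CheckAltText("PdfTools AG Logo").Count); // no problems reported
Console.WriteLine(CheckAltText("Image of a logo").Count);  // one problem reported
```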

3. Set accessibility metadata

The SDK automatically sets the Tagged flag when you create a PDF’s logical structure. However, you must set the PDF/UA flag explicitly, and only once you’re confident that your document meets all accessibility requirements:

// Set document metadata
outDoc.Metadata.Title = "Accessible Document Example";
outDoc.Metadata.Author = "Your Organization";

// Set language for the entire document
outDoc.Language = "en-US";

// Only set PDF/UA flag after ensuring full accessibility compliance
// This requires human verification beyond technical implementation
// outDoc.SetPdfUaConformant(); // Uncomment only after full review

PDF/UA compliance

Setting the PDF/UA flag indicates that your document fully complies with the PDF/UA standard. This goes beyond technical structure and includes:

  • Proper color contrast ratios
  • Meaningful reading order
  • Appropriate use of headings
  • Complete alternative descriptions
  • Correct language specifications

Always perform human review before claiming PDF/UA compliance.

Full example

Download the full example to see all the pieces working together.

Next steps

After creating accessible PDFs:

  • Validate accessibility: Use PDF accessibility checkers to verify your implementation
  • Test with assistive technology: Try your PDFs with screen readers like NVDA or JAWS
  • Learn about remediation: See our guide on adding structure to existing PDFs
  • Explore structure reading: Learn how to read and analyze logical structure

info

Remember: Creating truly accessible PDFs takes both technical implementation and human understanding. Our SDK provides the tools, but genuine accessibility depends on thoughtful consideration of how people will use your documents.