Split PDF document

The Pdftools SDK lets you split a single input PDF document into multiple output PDF documents and images. During this process, only the resources required by each page are copied to the output document containing that page. This ensures your output PDF files do not contain redundant or potentially sensitive information.

Quick start with a code sample

Download the full sample now in C, C#, Java, Python, and Visual Basic.

Interested in Go, Rust or other programming language samples? Let us know on the contact page and we’ll add it to our samples backlog.

Depending on your requirements, you can adjust the characteristics of the output document by configuring the PageCopyOptions class used in the assembly process.

You can also generate the output documents as images by converting a PDF document to an image.

Steps to split PDF documents:

  1. Opening the input Document
  2. Creating the DocumentAssembler object
  3. Appending to the output document
  4. Running the Assemble method
  5. Full example

Opening the input Document

Read the PDF document you want to split. To do this, load the input document from the file system into a (read-only) PDF Document.

// Open input document
using var inStream = File.OpenRead(inPath);
using var inDoc = PdfTools.Pdf.Document.Open(inStream);

Creating the DocumentAssembler object

Create the DocumentAssembler object that will generate the output PDF document. To do this, instantiate the DocumentAssembler and pass it an output Stream (for example, a file or memory stream) that will contain the output data.

The following example creates one output PDF document for each input document page.

// Repeat for each page in the input document
for (int i = 1; i <= inDoc.PageCount; ++i)
{
    // Create the output stream and pass it to the document assembler
    using var outStream = File.Create(outPathPrefix + i + ".pdf");
    using var docAssembler = new PdfTools.DocumentAssembly.DocumentAssembler(outStream);

Appending to the output document

You can select a page range to copy from the input Document by passing firstPage and lastPage parameters to the Append method of the DocumentAssembler object.

In this example, we only append the current page of the input PDF document to each output document.

// Append the current page of the input PDF document to a single-page output document
docAssembler.Append(inDoc, i, i);
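
As a minimal sketch of how the page-range form of Append could be used to split a document into fixed-size chunks rather than single pages, the helper below computes 1-based, inclusive page ranges. The SplitRanges class, its Chunk method, and the pagesPerFile parameter are hypothetical names introduced for illustration; they are not part of the Pdftools SDK.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the Pdftools SDK.
public static class SplitRanges
{
    // Yields 1-based, inclusive (First, Last) page ranges that cover
    // pages 1..pageCount in chunks of at most pagesPerFile pages.
    // Because this is an iterator, the argument checks run when the
    // result is first enumerated.
    public static IEnumerable<(int First, int Last)> Chunk(int pageCount, int pagesPerFile)
    {
        if (pageCount < 0)
            throw new ArgumentOutOfRangeException(nameof(pageCount));
        if (pagesPerFile < 1)
            throw new ArgumentOutOfRangeException(nameof(pagesPerFile));

        for (int first = 1; first <= pageCount; first += pagesPerFile)
        {
            // The last chunk may be shorter than pagesPerFile
            int last = Math.Min(first + pagesPerFile - 1, pageCount);
            yield return (first, last);
        }
    }
}
```

Each resulting range could then be passed as docAssembler.Append(inDoc, range.First, range.Last), using one DocumentAssembler and one output stream per range, in the same way as the per-page loop on this page.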

Running the Assemble method

After using the Append method to add the required pages to the output PDF document, the final step is to call the Assemble method. This method creates the structure of the output PDF document and writes the document to the output Stream of the DocumentAssembler object.

// Create the final structure of the output PDF document and write it to the output stream
docAssembler.Assemble();

Tip

Some objects, such as the Document object, must be explicitly closed. For these objects, we recommend using the language's mechanism for automatically closing objects (in C#, the using declaration shown in the samples on this page).
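
To illustrate how automatic closing behaves inside the per-page loop, here is a minimal sketch in plain C# that does not use the SDK. The Tracked class and UsingDemo.Run are hypothetical stand-ins for the output stream and the DocumentAssembler. They only record when Dispose is called, which shows that using declarations are disposed at the end of each loop iteration, in reverse order of declaration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a disposable SDK object or stream.
public sealed class Tracked : IDisposable
{
    private readonly string _name;
    private readonly List<string> _log;

    public Tracked(string name, List<string> log)
    {
        _name = name;
        _log = log;
    }

    // Record the moment this object is closed
    public void Dispose() => _log.Add("closed " + _name);
}

public static class UsingDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        for (int i = 1; i <= 2; ++i)
        {
            using var stream = new Tracked("stream " + i, log);
            using var assembler = new Tracked("assembler " + i, log);
            log.Add("work " + i);
        } // end of each iteration: assembler is disposed first, then stream
        return log;
    }
}
```

In this sketch, Run returns the entries "work 1", "closed assembler 1", "closed stream 1", "work 2", "closed assembler 2", "closed stream 2", so each iteration's objects are closed before the next iteration begins.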

Full example

// Open input document
using var inStream = File.OpenRead(inPath);
using var inDoc = PdfTools.Pdf.Document.Open(inStream);

// Repeat for each page in the input document
for (int i = 1; i <= inDoc.PageCount; ++i)
{
    // Create the output stream and pass it to the document assembler
    using var outStream = File.Create(outPathPrefix + i + ".pdf");
    using var docAssembler = new PdfTools.DocumentAssembly.DocumentAssembler(outStream);

    // Append the current page of the input PDF document to the output document
    docAssembler.Append(inDoc, i, i);

    // Create the final structure of the output PDF document and write it to the output stream
    docAssembler.Assemble();
}