Version: 1.0.0

Pdftools OCR Service

The OCR Service enables optical character recognition (OCR) to extract text from images and scanned documents, transforming them into searchable and editable PDF documents.

You can use the Pdftools OCR Service with the Conversion Service on Windows Server.

Key features

Embeds the recognized text in Unicode format into PDF or PDF/A files.
Supports over 180 natural and technical languages.
Provides an OCR service mode for shared use across multiple platforms.
Offers predefined and custom OCR profiles to optimize for accuracy or performance.
Corrects skew and handles rotation and resolution automatically.
Detects tables, barcodes, engineering drawings, and other complex layout elements.

System architecture

The Pdftools OCR Service comprises two .NET Core applications, both running on Kestrel servers:

Manager node: Handles HTTP requests and dispatches jobs to available workers.
Worker node: Performs the OCR processing and is the most resource-intensive component.

This separation allows for flexible deployment and scaling, where a single manager can coordinate multiple workers. For more information, review Scale the Pdftools OCR Service.
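The manager and worker split can be illustrated with a minimal sketch. It is not the Pdftools implementation or API: the `Worker` and `Manager` classes, the `capacity` field, and the least-loaded selection rule are all assumptions chosen only to show how a single manager might hand jobs to several workers.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Worker:
    """Hypothetical stand-in for a worker node (not a Pdftools type)."""
    name: str
    capacity: int  # assumed maximum number of concurrent OCR jobs
    active: List[str] = field(default_factory=list)

    def has_capacity(self) -> bool:
        return len(self.active) < self.capacity


class Manager:
    """Hypothetical manager that gives each job to the least-loaded worker."""

    def __init__(self, workers: List[Worker]) -> None:
        self.workers = workers

    def dispatch(self, job_id: str) -> Optional[str]:
        # Only consider workers that still have a free slot.
        available = [w for w in self.workers if w.has_capacity()]
        if not available:
            return None  # a real manager would queue or retry the job
        # Pick the worker with the fewest active jobs (first one wins ties).
        target = min(available, key=lambda w: len(w.active))
        target.active.append(job_id)
        return target.name


manager = Manager([Worker("worker-1", capacity=2), Worker("worker-2", capacity=1)])
assignments = [manager.dispatch(f"job-{i}") for i in range(4)]
print(assignments)
```

In this sketch, adding another `Worker` to the list raises total capacity without changing the manager, which mirrors the scaling idea described above.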

System requirements

Worker nodes

These nodes run the OCR engine and require more processing power and memory than manager nodes. The recommended hardware setup for a worker node is:

Windows 8 or newer (x64)
8 GB RAM or more
Quad-core CPU or better
SSD with at least 4 GB free space

Manager nodes

These nodes coordinate OCR jobs and handle requests, and have significantly lower system requirements than worker nodes. The recommended hardware setup for a manager node is:

Windows 8 or newer (x64)
4 GB RAM
Quad-core CPU
SSD storage

For standalone installations

You can also install the manager and the worker on the same machine. In that case, a machine that meets the worker node requirements is sufficient.
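As a rough planning aid, the recommended values listed above can be encoded and compared against a candidate machine. This is a sketch under stated assumptions: the `Spec` type, the `shortfalls` helper, and the role names are hypothetical. The manager list gives no free-space figure, so none is enforced here, and the operating system and SSD recommendations are not checked.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Spec:
    ram_gb: float
    cpu_cores: int
    free_disk_gb: float


# Recommended values from the lists above. The manager list does not state
# a free-space figure, so 0 is used as a placeholder (assumption).
RECOMMENDED = {
    "worker": Spec(ram_gb=8, cpu_cores=4, free_disk_gb=4),
    "manager": Spec(ram_gb=4, cpu_cores=4, free_disk_gb=0),
}
# A standalone installation follows the worker node recommendation.
RECOMMENDED["standalone"] = RECOMMENDED["worker"]


def shortfalls(machine: Spec, role: str) -> List[str]:
    """Return the names of the fields where the machine is below the recommendation."""
    target = RECOMMENDED[role]
    return [
        name
        for name in ("ram_gb", "cpu_cores", "free_disk_gb")
        if getattr(machine, name) < getattr(target, name)
    ]


small_box = Spec(ram_gb=4, cpu_cores=4, free_disk_gb=10)
print(shortfalls(small_box, "standalone"))
print(shortfalls(small_box, "manager"))
```

An empty list means the machine meets the numeric recommendations encoded here for that role.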

Licensing

The Pdftools OCR Service comes with two types of license keys:

Trial license key
Full license key
