Scale Pdftools OCR Service
The Pdftools OCR service uses a master-worker architecture. The central master node, called the Pdftools OCR Service Manager, distributes tasks to multiple worker nodes, called Pdftools OCR Service Workers, which perform the actual processing.
In the Pdftools OCR Service installer, you can set up a Pdftools OCR Service Manager that communicates with a Pdftools OCR Service Worker. The following diagram illustrates this configuration:

This guide explains how to scale the Pdftools OCR Service horizontally by configuring the manager to work with multiple worker nodes.
Scaling worker
The manager node communicates with worker nodes through a RESTful API. In the scaled worker setup, we will create an architecture similar to the following:

- Locate the manager configuration file. In a default installation, the file is located at:
C:\Program Files\Pdftools\Pdftools OCR Service\PdftoolsOcrService\appsettings.json
- Point ServiceCommunication to your load balancer:
{
  "ServiceCommunication": {
    "ServiceCommunicationType": "Rest",
    "ConnectionString": "http://localhost:8080/"
  }
}
- Install workers on different host machines. No need to install the manager again.
- Configure a load balancer to distribute requests to your worker nodes.
- Locate the worker configuration file. In a default installation, the file is located at:
C:\Program Files\Pdftools\Pdftools OCR Service\PdftoolsOcrWorker\appsettings.json
- Add a license key to each worker:
{
  "Licensing": {
    "LicenseKey": "<LICENSE_KEY>"
  }
}
- Share a common file storage between all manager and worker nodes:
{
  "FileStorage": {
    "FileStorageType": "HostFileSystem",
    "FilesDirectoryPath": "F:/SharedFolder/ProgramData/Pdftools/OcrService/Files"
  }
}
  - FileStorage:
    - FileStorageType: Storage system type (for example, HostFileSystem).
    - FilesDirectoryPath: Directory path for storing OCR-processed files.
- Make sure every manager and worker node:
  - Uses the same FileStorage settings.
  - Has read and write permission for the shared directory.
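The last two checks in the list above can be scripted before you put the nodes into service. The following Python sketch is one hypothetical way to do it and is not part of the Pdftools product. It assumes that a copy of each node's appsettings.json is reachable from the machine running the script and that the files are strict JSON without comments. The function names are illustrative.

```python
import json
import tempfile


def load_file_storage(config_path):
    """Return the "FileStorage" section of one appsettings.json file."""
    # utf-8-sig accepts files saved with or without a byte order mark.
    with open(config_path, encoding="utf-8-sig") as handle:
        return json.load(handle).get("FileStorage")


def find_mismatched_nodes(config_paths):
    """Return the paths whose FileStorage section differs from the first file."""
    reference = load_file_storage(config_paths[0])
    return [path for path in config_paths[1:]
            if load_file_storage(path) != reference]


def can_read_and_write(directory):
    """Create a probe file in the directory, read it back, then delete it."""
    try:
        with tempfile.NamedTemporaryFile(dir=directory) as probe:
            probe.write(b"probe")
            probe.flush()
            probe.seek(0)
            return probe.read() == b"probe"
    except OSError:
        # Missing directory or insufficient permissions.
        return False
```

Run `can_read_and_write` on each node, under the account that the service uses, because permissions on a shared folder can differ per host and per user.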
Scaling manager
You can scale the manager nodes similarly to the worker nodes.
Switch to PostgreSQL instead of SQLite to support horizontal scaling of manager nodes. The PostgreSQL database lets you share the state of multiple managers. For more information, review Default appsettings.json.

The following steps add to the previous section, where worker nodes were already scaled.
- Locate the manager configuration file. In a default installation, the file is located at:
C:\Program Files\Pdftools\Pdftools OCR Service\PdftoolsOcrService\appsettings.json
- Configure PostgreSQL:
{
  "Database": {
    "DatabaseType": "PostgreSql",
    "ConnectionString": "User ID=myUser;Password=mySecurePassword;Server=my.database.com;Port=5432;Database=ocr-service-db;",
    "DeleteJobsAfterDays": 2
  }
}
- Install manager nodes on separate hosts.
- Configure a load balancer for the manager nodes.
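Because every manager node needs the same Database settings, you might prefer to apply them with a script rather than edit each file by hand. The following Python sketch shows one possible approach under stated assumptions: the helper name `use_postgresql` is hypothetical, the file is assumed to be strict JSON without comments, and the script is assumed to have write access to the file. It updates only the three Database keys shown above and leaves all other settings in place.

```python
import json


def use_postgresql(config_path, connection_string, delete_jobs_after_days=2):
    """Set the PostgreSQL keys in the "Database" section of appsettings.json."""
    # utf-8-sig accepts files saved with or without a byte order mark.
    with open(config_path, encoding="utf-8-sig") as handle:
        settings = json.load(handle)

    # Update only the keys from the guide; keep any other Database keys.
    settings.setdefault("Database", {}).update({
        "DatabaseType": "PostgreSql",
        "ConnectionString": connection_string,
        "DeleteJobsAfterDays": delete_jobs_after_days,
    })

    with open(config_path, "w", encoding="utf-8") as handle:
        json.dump(settings, handle, indent=2)
    return settings
```

Back up the original file before running a script like this, and keep the connection string out of version control because it contains the database password.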