Pdftools OCR Service on Linux
Install Pdftools OCR Service on Linux from an RPM or DEB package, set the license key, and then run it as a systemd service.
Get a license key
To get an evaluation or full license key, follow these steps:
- Fill in the Pdftools contact form and mention that you want to evaluate or use Pdftools OCR Service.
- After you receive confirmation, sign up or log in to the Pdftools Portal.
- Click See product next to Pdftools OCR Service, and then copy your license key.
Prerequisites
- A supported Linux distribution. Refer to Linux in the supported operating systems documentation.
- Pdftools OCR Service package for your distribution. Refer to Download the package.
- A valid Pdftools OCR Service license key. Refer to Get a license key.
Download the package
To download the Pdftools OCR Service Linux package, follow these steps:
- Log in to the Pdftools Portal.
- On the Products page, next to Pdftools OCR Service, click Get started or See product.
- In the Product builds section, find the package for your distribution, and then click Download:
  - RPM (Rocky, AlmaLinux, RHEL, and Oracle Linux 10): pdftools-ocr-service-VERSION_NUMBER-1.x86_64.rpm
  - DEB (Ubuntu, Debian): pdftools-ocr-service_VERSION_NUMBER_amd64.deb
  The exact filename, for example pdftools-ocr-service_1.1.3_amd64.deb, depends on the version you install.
Install Pdftools OCR Service
Pdftools OCR Service consists of two systemd services: the manager and the worker. You need both for a functional deployment. Install the downloaded package with the matching package manager, and replace VERSION_NUMBER with the version you downloaded (for example, 1.1.3).
RPM
Install the package:
sudo dnf install ./pdftools-ocr-service-VERSION_NUMBER-1.x86_64.rpm
The package isn’t GPG-signed in this release. On hosts with gpgcheck=1 enabled (the default on RHEL-family distributions), pass --nogpgcheck:
sudo dnf install --nogpgcheck ./pdftools-ocr-service-VERSION_NUMBER-1.x86_64.rpm
DEB
Install the package:
sudo apt install ./pdftools-ocr-service_VERSION_NUMBER_amd64.deb
The package isn’t signed in this release. apt doesn’t require a signature for local-file installs, so no extra flag is needed.
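The choice between dnf and apt above follows from the package file extension. The following is a minimal sketch of that mapping, not part of the product: install_cmd_for is a hypothetical helper name, and it only prints the command so you can review it before running it.

```shell
#!/bin/sh
# Hypothetical helper: print the install command that matches a package file.
# It prints the command instead of running it.
install_cmd_for() (
  pkg=$1
  # dnf and apt treat an argument as a local file only if it contains a slash.
  case "$pkg" in
    */*) ;;
    *)   pkg="./$pkg" ;;
  esac
  case "$pkg" in
    # --nogpgcheck is included because the package isn't GPG-signed in this release.
    *.rpm) printf 'sudo dnf install --nogpgcheck %s\n' "$pkg" ;;
    *.deb) printf 'sudo apt install %s\n' "$pkg" ;;
    *)     printf 'unsupported package: %s\n' "$pkg" >&2; exit 1 ;;
  esac
)
```

For example, install_cmd_for pdftools-ocr-service_1.1.3_amd64.deb prints the apt command with the ./ prefix added.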
Install on a likely supported distribution
If your distribution is listed under Likely supported, the installation stops with an error message containing "has not been validated on". To proceed, prepend the following override to the install command for your package format.
For RPM, run the following command:
sudo env PDFTOOLS_SKIP_OS_CHECK=1 dnf install --nogpgcheck ./pdftools-ocr-service-VERSION_NUMBER-1.x86_64.rpm
For DEB, run the following command:
sudo env PDFTOOLS_SKIP_OS_CHECK=1 apt install ./pdftools-ocr-service_VERSION_NUMBER_amd64.deb
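Before you use the override, it can help to confirm which distribution and version the host reports. The following is a minimal sketch, assuming the host provides /etc/os-release, which is standard on systemd-based distributions. os_id_version is a hypothetical helper name, and how the installer itself detects the distribution isn't documented here.

```shell
#!/bin/sh
# Hypothetical helper: print "<ID> <VERSION_ID>" from an os-release file.
# The file consists of shell-compatible KEY=value lines, so it is sourced in a
# subshell to keep its variables out of the calling shell.
os_id_version() (
  . "${1:-/etc/os-release}" || exit 1
  printf '%s %s\n' "${ID:-unknown}" "${VERSION_ID:-unknown}"
)
```

Call os_id_version without arguments to read /etc/os-release, and compare the result with the supported operating systems documentation.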
Set the license key
The license key is required. To set it, follow these steps:
- Open the worker’s appsettings.json file:
  sudo $EDITOR /opt/pdftools/ocr-worker/appsettings.json
- Replace the "<LICENSE_KEY>" placeholder with the license key you copied from Pdftools Portal:
  {
    "PortNumber": 7998,
    "Licensing": {
      "LicenseKey": "<LICENSE_KEY>"
    }
  }
- Restore the file’s owner so the pdftools service user can read it:
  sudo chown pdftools:pdftools /opt/pdftools/ocr-worker/appsettings.json
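If you prefer to script the placeholder replacement instead of using an editor, the following sketch shows one way to do it. set_license_key is a hypothetical helper, not a tool shipped with the package. It assumes the key contains no double quote or backslash, because those characters would need JSON escaping.

```shell
#!/bin/sh
# Hypothetical helper: replace the <LICENSE_KEY> placeholder in a settings file.
# $1 = path to appsettings.json, $2 = license key.
# Assumes the key contains no double quote or backslash (no JSON escaping is done).
set_license_key() (
  file=$1
  tmp=$(mktemp) || exit 1
  # index() and substr() perform a literal replacement, so no character in the
  # key is interpreted as a pattern or back-reference.
  KEY=$2 awk '
    {
      p = "<LICENSE_KEY>"
      i = index($0, p)
      if (i > 0)
        $0 = substr($0, 1, i - 1) ENVIRON["KEY"] substr($0, i + length(p))
      print
    }' "$file" > "$tmp" || { rm -f "$tmp"; exit 1; }
  # Write back through the existing file so its owner and mode stay unchanged.
  cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  exit "$status"
)
```

Run it from a shell that has write access to /opt/pdftools/ocr-worker/appsettings.json. Because the sketch writes into the existing file rather than replacing it, the file's owner doesn't change.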
In its default configuration, Pdftools OCR Service requires a network connection to validate the license key. For information about partially offline or fully offline solutions, refer to Pdftools OCR Service licensing in the Pdftools licensing documentation.
For optional configuration, such as ports and file-storage retention, refer to Monitor Pdftools OCR Service.
Start the service and verify
To start Pdftools OCR Service and confirm it’s healthy, follow these steps:
- Start both systemd units:
  sudo systemctl start pdftools-ocr-worker pdftools-ocr-service
  The package’s post-install script enables both units for boot-time start automatically.
- Wait for the worker to load the OCR engine, and then check the readiness endpoint. The first start typically takes 30 to 60 seconds:
  curl -fsS http://localhost:7982/healthz/ready
  A response body of Healthy confirms the service is ready. A response body of Degraded (also HTTP 200) means the service is up but the worker engine probe hasn’t completed yet. Wait longer or inspect the worker logs.
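Because the first start can take a while, the readiness check can be wrapped in a polling loop. The following is a minimal sketch. wait_until_healthy and probe_ready are hypothetical helper names, the retry defaults are arbitrary, and the sketch assumes the response body is exactly Healthy when the service is ready.

```shell
#!/bin/sh
# Probe the readiness endpoint from the step above and print the response body.
probe_ready() {
  curl -fsS http://localhost:7982/healthz/ready
}

# Hypothetical helper: poll until the response body is exactly "Healthy".
# $1 = maximum attempts (default 30), $2 = seconds between attempts (default 5).
wait_until_healthy() (
  attempts=${1:-30}
  delay=${2:-5}
  n=0
  while [ "$n" -lt "$attempts" ]; do
    body=$(probe_ready 2>/dev/null) || body=unreachable
    case "$body" in
      Healthy) echo "ready"; exit 0 ;;
    esac
    # Degraded or unreachable: report the attempt on stderr and retry.
    echo "attempt $((n + 1)): ${body:-empty}" >&2
    n=$((n + 1))
    sleep "$delay"
  done
  exit 1
)
```

Calling wait_until_healthy with no arguments retries for up to about 150 seconds and returns a non-zero status if the service never reports Healthy.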
Send a test OCR job
The following example uses the DocumentConversion_Accuracy profile, which Conversion Service applies by default when it delegates OCR. For the full list of profiles, languages, and request options, refer to OCR Service parameters.
To send a test OCR job, follow these steps:
- Send a TIFF file from your file system to OCR Service:
  curl -X POST "http://localhost:7982/?version=4&params=PredefinedProfile%3DDocumentConversion_Accuracy&languages=English&block=true&priority=Normal" \
    -H "Content-Type: image/tiff" \
    --data-binary @your-test-file.tif \
    -o result.xml
- Check the response. A successful response writes the recognized text blocks to result.xml. A 4xx or 5xx response indicates a configuration problem. Refer to Troubleshooting on Linux.
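The plain curl call in the first step doesn't print the HTTP status, so a failed request can go unnoticed while the error body is written to the output file. The following sketch builds the same request URL from its parts and prints only the status code. ocr_url and send_ocr_job are hypothetical helper names, and the sketch assumes the profile and language values need no percent-encoding.

```shell
#!/bin/sh
# Hypothetical helper: build the request URL used in the example above.
# $1 = base URL, $2 = profile name, $3 = language.
# Assumes the arguments contain no characters that need percent-encoding.
ocr_url() {
  # %%3D prints the literal "%3D", the percent-encoded "=" inside params.
  printf '%s/?version=4&params=PredefinedProfile%%3D%s&languages=%s&block=true&priority=Normal\n' \
    "${1%/}" "$2" "$3"
}

# Hypothetical helper: send a TIFF and print only the HTTP status code.
# $1 = input TIFF file, $2 = output file for the response body.
send_ocr_job() {
  curl -sS -o "$2" -w '%{http_code}\n' -X POST \
    -H 'Content-Type: image/tiff' --data-binary "@$1" \
    "$(ocr_url http://localhost:7982 DocumentConversion_Accuracy English)"
}
```

For example, send_ocr_job your-test-file.tif result.xml prints the status code, which you can compare against the 4xx and 5xx cases described above.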
Uninstall
To remove Pdftools OCR Service, follow the steps for your package format.
RPM
Remove the package with dnf:
sudo dnf remove pdftools-ocr-service
The uninstall removes files under /opt/pdftools/ocr-service/ and /opt/pdftools/ocr-worker/, together with the systemd unit files. By Linux packaging convention, the pdftools system user isn’t removed.
The following items persist after uninstall:
- /var/lib/pdftools/files/ (job data)
- /var/log/pdftools/ (logs)
- /opt/pdftools/ocr-worker/appsettings.json.rpmsave (your edited configuration, preserved by RPM convention)
DEB
Remove the package with apt:
sudo apt remove pdftools-ocr-service
To also remove the configuration file, run sudo apt purge pdftools-ocr-service instead. Job data and logs persist in both cases.
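To see which of the persisting items are still present after an uninstall, a short check such as the following can help. It's a sketch only: list_leftovers is a hypothetical helper, and the optional root prefix exists so that the check can be tried against a scratch directory.

```shell
#!/bin/sh
# Hypothetical helper: print which of the documented leftover paths still exist.
# $1 = optional root prefix (default: empty, meaning the real file system).
list_leftovers() (
  for p in /var/lib/pdftools/files /var/log/pdftools \
           /opt/pdftools/ocr-worker/appsettings.json.rpmsave; do
    if [ -e "${1:-}$p" ]; then
      printf '%s\n' "$p"
    fi
  done
)
```

The sketch only lists paths. Whether to delete job data and logs is left to you.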
Next steps
You installed Pdftools OCR Service on Linux, set the license key, and verified that it’s running. Continue with the following:
- For configuration, service control, logs, and troubleshooting, refer to Monitor Pdftools OCR Service.
- For the full list of profiles, languages, and request options, refer to Pdftools OCR Service parameters.
- To run multiple workers for higher throughput, refer to Scale Pdftools OCR Service.