Version: 1.0.0

Monitor the Pdftools OCR Service

You can monitor the OCR service programmatically through the health check endpoints that the Manager and Worker components expose. To check the readiness status, send a request to the following endpoints:

  • Manager:
    http://localhost:7982/healthz/ready
  • Worker:
    http://localhost:7998/healthz/ready
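
The readiness check above can be automated with a short script. The following is a minimal sketch in Python that uses only the standard library. It assumes that any HTTP 2xx response means the component is ready and does not inspect the response body; the names `ENDPOINTS`, `is_ready`, and `check_all` are illustrative and not part of the product.

```python
import http.client
import urllib.error
import urllib.request

# Default readiness URLs from this page; adjust host and port to your deployment.
ENDPOINTS = {
    "Manager": "http://localhost:7982/healthz/ready",
    "Worker": "http://localhost:7998/healthz/ready",
}


def is_ready(url, timeout=5.0):
    """Return True if the endpoint answers with an HTTP 2xx status.

    Assumption: a 2xx status means "ready"; the body is ignored.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, non-2xx status, or malformed URL.
        return False


def check_all(endpoints=ENDPOINTS, timeout=5.0):
    """Map each component name to its readiness result."""
    return {name: is_ready(url, timeout) for name, url in endpoints.items()}


if __name__ == "__main__":
    for name, ready in check_all().items():
        print(f"{name}: {'ready' if ready else 'NOT ready'}")
```

Run the script from a scheduler or a monitoring agent and alert when a component reports as not ready.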

To verify that the services operate correctly, check their log files. The path to the log files is configurable. Review Default appsettings.json for an example of a configured log file path.
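
To inspect a log without hard-coding its location, a script can read the `LogFilePath` key from appsettings.json and print the most recent lines. The following Python sketch illustrates the idea. The helper names are hypothetical, and the glob in `newest_log_file` covers the case where the service inserts a suffix (for example a date) before the file extension, which is an assumption that this page does not confirm.

```python
import json
from collections import deque
from pathlib import Path


def load_settings(config_path):
    """Parse an appsettings.json file into a dict (tolerates a UTF-8 BOM)."""
    with open(config_path, encoding="utf-8-sig") as handle:
        return json.load(handle)


def newest_log_file(log_file_path):
    """Return the most recently modified file matching the configured path.

    Matches the exact file name and variants with text inserted before the
    extension. Whether the service adds such a suffix is an assumption.
    Returns None if nothing matches.
    """
    configured = Path(log_file_path)
    pattern = f"{configured.stem}*{configured.suffix}"
    candidates = [p for p in configured.parent.glob(pattern) if p.is_file()]
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)


def tail(path, line_count=20):
    """Return the last line_count lines of a text file."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\n") for line in deque(handle, maxlen=line_count)]
```

For example, `tail(newest_log_file(load_settings(path)["LogFilePath"]))` returns the last 20 lines of the newest matching log file, where `path` points to the Manager or Worker appsettings.json.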

Manager configuration overview

The OCR Service Manager reads its configuration from an appsettings.json file. The following example shows a configuration for the Manager component, including how it connects to Workers.

The following is the default path of the manager configuration file:

C:\Program Files\Pdftools\Pdftools OCR Service\PdftoolsOcrService\appsettings.json

Default appsettings.json

{
  "Database": {
    "DatabaseType": "SqlLite",
    "DeleteJobsAfterDays": 2
  },
  "FileStorage": {
    "FileStorageType": "HostFileSystem",
    "FilesDirectoryPath": "C:/ProgramData/Pdftools/OcrService/Files",
    "DeleteFilesAfterDays": 2
  },
  "ServiceCommunication": {
    "ServiceCommunicationType": "Rest",
    "ConnectionString": "http://localhost:7998/"
  },
  "PortNumber": 7982,
  "LogFilePath": "C:/ProgramData/Pdftools/OcrService/logs/manager-log.txt",
  "LogRetentionDays": 7
}

  • Database
    • DatabaseType: Database backend (for example SqlLite or PostgreSql). If you use PostgreSql, also add ConnectionString.
    • ConnectionString: Connection string for PostgreSql; omitted in the default settings because SqlLite doesn’t require a password.
    • DeleteJobsAfterDays: Number of days after which completed job records are removed.
  • FileStorage
    • FileStorageType: Storage backend (for example HostFileSystem).
    • FilesDirectoryPath: Directory that stores OCR-processed files.
    • DeleteFilesAfterDays: Number of days after which stored files are deleted.
  • ServiceCommunication
    • ServiceCommunicationType: Method the manager uses to reach worker nodes (currently Rest).
    • ConnectionString: Endpoint URL of the worker node or load balancer.
  • PortNumber: Port that the manager listens on; the default is 7982.
  • LogFilePath: Path to the log file.
  • LogRetentionDays: Number of days to retain log files.
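
The rules described in the list above can be checked before you restart the Manager with a changed configuration. The following Python sketch is illustrative only: `validate_manager_settings` is a hypothetical helper, not a product feature, and it checks only what this page states (PostgreSql needs a `ConnectionString`, the worker endpoint is a URL) plus the general TCP port range.

```python
from urllib.parse import urlparse


def validate_manager_settings(settings):
    """Return a list of problems found in a parsed Manager appsettings.json."""
    problems = []

    # PostgreSql requires a connection string; SqlLite does not.
    database = settings.get("Database", {})
    if database.get("DatabaseType") == "PostgreSql" and not database.get("ConnectionString"):
        problems.append("Database.ConnectionString is required for PostgreSql")

    # The worker (or load balancer) endpoint must be an http(s) URL with a host.
    communication = settings.get("ServiceCommunication", {})
    endpoint = urlparse(str(communication.get("ConnectionString") or ""))
    if endpoint.scheme not in ("http", "https") or not endpoint.hostname:
        problems.append("ServiceCommunication.ConnectionString must be an http(s) URL")

    # Valid TCP ports are 1 to 65535 (bool is excluded because it is an int subtype).
    port = settings.get("PortNumber")
    if not isinstance(port, int) or isinstance(port, bool) or not 1 <= port <= 65535:
        problems.append("PortNumber must be an integer between 1 and 65535")

    return problems
```

Pass the function the dictionary that `json.load` produces from appsettings.json. An empty list means that none of these checks failed.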

Worker configuration overview

The OCR Service Worker reads its configuration from an appsettings.json file. The following example shows a configuration for a Worker node.

The following is the default path of the worker configuration file:

C:\Program Files\Pdftools\Pdftools OCR Service\PdftoolsOcrWorker\appsettings.json

Default appsettings.json

{
  "PortNumber": 7998,
  "LogFilePath": "C:/ProgramData/Pdftools/OcrService/logs/worker-log.txt",
  "LogRetentionDays": 7,
  "FileStorage": {
    "FileStorageType": "HostFileSystem",
    "FilesDirectoryPath": "C:/ProgramData/Pdftools/OcrService/Files"
  },
  "ServiceCommunication": {
    "ServiceCommunicationType": "Rest"
  },
  "Licensing": {
    "LicenseKey": "<LICENSE_KEY>"
  }
}

  • PortNumber: Port that the worker node listens on for incoming requests; the default is 7998.
  • LogFilePath: Path to the log file.
  • LogRetentionDays: Number of days to retain log files.
  • FileStorage
    • FileStorageType: Storage system type (for example HostFileSystem).
    • FilesDirectoryPath: Directory path for storing OCR-processed files.
  • ServiceCommunication
    • ServiceCommunicationType: Communication method used to reach the worker (currently Rest).
  • Licensing
    • LicenseKey: Your Pdftools product key. Provide the license key in every worker that you set up.
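
Two common setup mistakes can be caught with a quick script: a Worker that still contains the `<LICENSE_KEY>` placeholder, and a Manager whose `ConnectionString` points to a different port than the Worker's `PortNumber`. The following Python sketch uses a hypothetical helper, `check_worker_settings`. The port comparison assumes that the Manager targets the Worker directly; skip it (omit the `manager` argument) if the `ConnectionString` points to a load balancer.

```python
from urllib.parse import urlparse

# Placeholder value shown in the default Worker appsettings.json.
PLACEHOLDER_KEY = "<LICENSE_KEY>"


def check_worker_settings(worker, manager=None):
    """Return warnings for a parsed Worker appsettings.json.

    If manager settings are given, also compare the port in the Manager's
    ServiceCommunication.ConnectionString with the Worker's PortNumber
    (assumes a direct Manager-to-Worker connection).
    """
    warnings = []

    key = worker.get("Licensing", {}).get("LicenseKey", "")
    if not key or key == PLACEHOLDER_KEY:
        warnings.append("Licensing.LicenseKey is missing or still the placeholder")

    if manager is not None:
        communication = manager.get("ServiceCommunication", {})
        target = urlparse(str(communication.get("ConnectionString") or ""))
        try:
            target_port = target.port  # raises ValueError for an invalid port
        except ValueError:
            target_port = None
            warnings.append("Manager ConnectionString contains an invalid port")
        if target_port is not None and target_port != worker.get("PortNumber"):
            warnings.append(
                f"Manager targets port {target_port}, "
                f"but the Worker listens on {worker.get('PortNumber')}"
            )

    return warnings
```

With the default files shown on this page, the only warning is the placeholder license key, because the Manager's `http://localhost:7998/` matches the Worker's `PortNumber` of 7998.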