Worker
The Worker performs detection and redaction. Only the Manager calls it; there is no externally exposed API. Configure it by setting environment variables on the Worker container; the default port is 4885. The configuration applies per Worker instance, so run multiple Workers to scale throughput. For the naming convention and shared notes, refer to Configuration reference.
The Worker shares two configuration sections with the Manager: FileStorage and Encryption. Both services must be configured with the same values.
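As a sketch, a Worker container might be started as follows. The image name, container name, network, and host volume path are assumptions for illustration; substitute the values from your deployment.

```shell
# Start one Worker instance and override selected settings via
# environment variables. Image name, network, and volume path are
# illustrative assumptions, not documented values.
docker run -d \
  --name smart-redact-worker \
  --network redact-net \
  -v /srv/redact/storage:/app/storage_folder \
  -e Licensing__LicenseKey=<LICENSE_KEY> \
  -e Encryption__EncryptionKey=<ENCRYPTION_KEY> \
  -e WebServer__PortNumber=4885 \
  pdftools/smart-redact-worker:latest
```

Each additional Worker is started the same way; because configuration is per instance, every container carries its own full set of variables.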
Default appsettings.json
{
  "WebServer": {
    "PortNumber": 4885,
    "MaxFileSizeBytes": null,
    "MaxConcurrentConnections": 1000,
    "RequestHeadersTimeout": null,
    "KeepAliveTimeout": null,
    "MinRequestBodyDataRateBytesPerSecond": null,
    "MinRequestBodyDataRateGracePeriod": null
  },
  "LogFilePath": "./logs/smart-redact-worker-log.txt",
  "LogRetentionDays": 7,
  "FileStorage": {
    "FileStorageType": "HostFileSystem",
    "FilesDirectoryPath": "/app/storage_folder"
  },
  "Encryption": {
    "EncryptionKey": "<ENCRYPTION_KEY>",
    "DekTokenTtlMinutes": 1440
  },
  "ServiceCommunication": {
    "ServiceCommunicationType": "Rest"
  },
  "Inference": {
    "ExecutionProvider": "Auto",
    "GpuDeviceId": 0,
    "CpuUtilizationPercentage": 80,
    "GraphOptimizationLevel": "All",
    "ExecutionMode": "Parallel",
    "MaxChunkSize": 256,
    "MaxLength": 512,
    "MaxWidth": 12,
    "BatchSize": 1
  },
  "Licensing": {
    "LicenseKey": "<LICENSE_KEY>",
    "LgsURL": ""
  }
}
Each section is described below.
Licensing
The Worker validates the license at startup and exits if the key is missing or invalid.
Licensing__LicenseKey=<LICENSE_KEY>
| Setting | Default | Description |
|---|---|---|
| LicenseKey | required | The AI Smart Redact license key issued by Pdftools. Must match the Manager. |
| LgsURL | — | Optional URL of an on-premise License Gateway Service for air-gapped deployments. |
File storage
The Worker reads input files from and writes output files to the same store as the Manager. The fields and accepted values are the same. Refer to File storage on the Manager configuration page.
Encryption
The Worker uses the same encryption key as the Manager to unwrap DEK tokens received with each job. The fields and accepted values are the same. Refer to Encryption on the Manager configuration page.
Service communication
The transport configured here must match the Manager’s.
Transport
For RabbitMQ, the Worker connects to the same broker as the Manager.
ServiceCommunication__ServiceCommunicationType=RabbitMQ
ServiceCommunication__Host=<RABBITMQ_HOST>
ServiceCommunication__Username=<USERNAME>
ServiceCommunication__Password=<PASSWORD>
For REST transport, only the type is set on the Worker; the Manager initiates all calls to the Worker’s HTTP endpoints on the configured WebServer port.
ServiceCommunication__ServiceCommunicationType=Rest
| Setting | Default | Description |
|---|---|---|
| ServiceCommunicationType | required | Rest or RabbitMQ. Must match the Manager. |
| Host | required for RabbitMQ | Broker host name. |
| Username | required for RabbitMQ | Broker username. |
| Password | required for RabbitMQ | Broker password. |
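For example, a Worker joining a RabbitMQ deployment sets the same broker values as the Manager. The container and image names below are illustrative assumptions; the placeholders follow the snippet above.

```shell
# Point the Worker at the same broker as the Manager.
# Container/image names are assumptions for illustration.
docker run -d \
  --name smart-redact-worker \
  -e ServiceCommunication__ServiceCommunicationType=RabbitMQ \
  -e ServiceCommunication__Host=<RABBITMQ_HOST> \
  -e ServiceCommunication__Username=<USERNAME> \
  -e ServiceCommunication__Password=<PASSWORD> \
  pdftools/smart-redact-worker:latest
```

If the Manager is configured with a different transport, the Worker never receives jobs, so verify both sides after any change.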
Concurrency
Caps on how many jobs each Worker instance processes in parallel. Detection holds an inference slot for the duration of the job, so running multiple detections in parallel adds memory pressure without improving throughput; the default is 1. Redaction is lighter and runs up to four jobs in parallel by default.
ServiceCommunication__DetectionConcurrencyLimit=1
ServiceCommunication__RedactionConcurrencyLimit=4
| Setting | Default | Description |
|---|---|---|
| DetectionConcurrencyLimit | 1 | Maximum detection jobs processed concurrently by this Worker instance. |
| RedactionConcurrencyLimit | 4 | Maximum redaction jobs processed concurrently by this Worker instance. |
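Because the limits apply per instance, overall throughput is scaled by adding Workers rather than raising the limits. A sketch with Docker Compose, assuming a hypothetical compose file with a service named `worker`:

```shell
# Run three Worker replicas. With the default limits, each replica
# handles at most 1 detection and 4 redactions in parallel, so the
# deployment as a whole handles up to 3 detections and 12 redactions.
# The service name "worker" is an assumption.
docker compose up -d --scale worker=3
```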
Inference
The Worker runs a semantic detection model for context-aware entity recognition. The Inference section tunes the inference runtime and the chunking behavior.
Inference__ExecutionProvider=Auto
Inference__GpuDeviceId=0
Inference__CpuUtilizationPercentage=80
Inference__GraphOptimizationLevel=All
Inference__ExecutionMode=Parallel
Inference__BatchSize=1
Inference__MaxChunkSize=256
Hardware
| Setting | Default | Description |
|---|---|---|
| ExecutionProvider | Auto | Auto uses GPU when running the -cuda Worker image, otherwise CPU. Cpu forces CPU inference. |
| GpuDeviceId | 0 | Index of the GPU to use when ExecutionProvider resolves to a GPU. |
| CpuUtilizationPercentage | 80 | Percentage of available CPU cores the runtime uses for inference. Range 1–100. |
Runtime
| Setting | Default | Description |
|---|---|---|
| GraphOptimizationLevel | All | Graph optimization level: DisableAll, Basic, Extended, or All. |
| ExecutionMode | Parallel | Sequential or Parallel. |
| BatchSize | 1 | Number of text chunks sent to the model per inference call. Range 1–100; values outside the range are clamped. Higher values increase throughput at the cost of memory. |
Chunking
Long text is split into chunks before inference.
| Setting | Default | Description |
|---|---|---|
| MaxChunkSize | 256 | Maximum tokens per chunk. Higher values give the model more context; lower values reduce per-chunk latency. Clamped to MaxLength if set higher. |
| MaxLength | 512 | Hard upper bound on input length in tokens supported by the model. Don’t increase beyond what the configured model accepts. |
| MaxWidth | 12 | Maximum span width in words for a single detected entity. |
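As an illustration, a CPU-only Worker on a shared host might force the CPU provider, reduce its core budget, and shrink the chunks. The values below are examples for this scenario, not recommendations:

```shell
# Force CPU inference and cap the runtime at half the available cores.
Inference__ExecutionProvider=Cpu
Inference__CpuUtilizationPercentage=50
# Smaller chunks trade context for lower per-chunk latency.
# MaxChunkSize values above MaxLength (512 by default) are clamped.
Inference__MaxChunkSize=128
```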
Web server
WebServer__PortNumber=4885
| Setting | Default | Description |
|---|---|---|
| PortNumber | 4885 | TCP port the Worker listens on. The Manager calls this port only when ServiceCommunicationType is Rest. |
| MaxFileSizeBytes | null (no limit) | Maximum allowed body size on Worker endpoints. The Worker doesn’t accept user uploads, so the limit is normally left unset. |
| MaxConcurrentConnections | 1000 | Maximum concurrent connections accepted by Kestrel. |
The remaining Kestrel limits (RequestHeadersTimeout, KeepAliveTimeout, MinRequestBodyDataRateBytesPerSecond, MinRequestBodyDataRateGracePeriod) accept the same values as on the Manager. Refer to Web server on the Manager configuration page.
Logging
Application logs are written to the console and, optionally, to a file. The fields are top-level (no section prefix).
LogFilePath=./logs/smart-redact-worker-log.txt
LogRetentionDays=7
| Setting | Default | Description |
|---|---|---|
| LogFilePath | — | Path of the rolling-daily log file inside the container. Leave empty to disable file logging. |
| LogRetentionDays | 7 | Number of days log files are retained on disk. |
The minimum log level isn’t a separate setting. It’s derived from the standard ASPNETCORE_ENVIRONMENT environment variable: when set to Development, the service emits Debug-level logs in a developer-friendly console format; any other value (the default) emits Information-level logs in JSON. Use Development only for local diagnostics.
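For example, to switch a Worker to Debug-level, developer-formatted console output during local troubleshooting (the image name is an illustrative assumption):

```shell
# Development switches the Worker to Debug-level, human-readable logs.
# Any other value, or leaving the variable unset, keeps the default
# Information-level JSON logs.
docker run -d \
  -e ASPNETCORE_ENVIRONMENT=Development \
  pdftools/smart-redact-worker:latest
```

Revert to the default once diagnostics are done; Debug output is verbose and not intended for production log pipelines.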