
Detection

The Detection endpoints run sensitive-entity detection against a previously uploaded PDF. The response contains the detected entities (text, label, page position, and confidence score) inline in the result object, and a reference to an FDF file produced by the job and stored on the Manager. Both are needed to start a redaction job: the entity list is passed inline as redactionInput, and the FDF reference is passed as fdfFileId and fdfDekToken. Download the FDF itself only when the entities must be reviewed or modified visually in a PDF viewer before redaction.

The cURL examples use the Manager’s default address (http://localhost:9982); substitute the host and port of your deployment as needed.
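The hand-off from detection to redaction described above can be sketched in Python as follows. This is a minimal illustration, not the product's client: the helper name redaction_inputs_from_detection is hypothetical, and it assumes that redactionInput accepts the inline result object unchanged, which should be checked against the redaction endpoint's reference.

```python
from typing import Any


def redaction_inputs_from_detection(detection_response: dict[str, Any]) -> dict[str, Any]:
    """Collect the values a redaction request needs from a finished detection response."""
    if detection_response.get("jobStatus") != "finished":
        raise ValueError("detection job has not finished")

    output_files = detection_response.get("outputFiles") or []
    if not output_files:
        raise ValueError("no FDF file reference in outputFiles")

    # For detection, outputFiles contains one entry pointing at the FDF file.
    fdf = output_files[0]
    return {
        # Assumption: redactionInput takes the inline result object as-is.
        "redactionInput": detection_response["result"],
        "fdfFileId": fdf["fileId"],
        "fdfDekToken": fdf["dekToken"],
    }
```

The returned dictionary only covers the three fields named in this section; any other fields a redaction request requires are outside the scope of this sketch.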

Sync and async processing

Each detection request specifies a processingMode:

  • sync — the request blocks until detection completes. The response is 200 with the full result.
  • async — the request returns 202 immediately with a jobId. Use the result endpoint to poll until jobStatus is finished or error.

Use sync for small, interactive flows. Use async for large documents that would exceed HTTP timeouts in sync mode, or when starting many jobs in parallel.

Detection configuration

The optional detectionConfiguration field on the request overrides the built-in detection defaults for that single request. Use it to add custom recognizers, change the score threshold, or supply keyword exclusions. When omitted, the built-in defaults are used.

For the full schema and the available options, refer to Detection configuration.

Endpoints

Start a detection job

POST /v1/jobs/detection

Detects sensitive entities in a previously uploaded PDF. The behavior depends on processingMode: sync blocks until the result is ready; async returns immediately with a jobId for polling.

Request body: DetectionRequest.

Response: DetectionResponse.

Status codes:

| Code | Meaning |
|------|---------|
| 200 | Sync only. The job completed and the response contains the full result. |
| 202 | Async only. The job was accepted and is running. Poll the result endpoint with the returned jobId. |
| 400 | The request was malformed, or pdfFileId and dekToken don’t match an existing file. |
| 404 | No file with the given pdfFileId exists. |
| 429 | Admission control rejected the request because too many jobs are pending. |
| 503 | The job couldn’t be dispatched to a worker. |
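A client might translate these status codes into a next step along the following lines. The function name and the action labels are hypothetical, and treating 429 and 503 as retryable is an assumption based on their descriptions, not documented behavior.

```python
def next_action(status_code: int) -> str:
    """Map a start-job status code to a coarse client action (illustrative only)."""
    if status_code == 200:
        return "use_result"    # sync: the full result is in the response body
    if status_code == 202:
        return "poll"          # async: poll the result endpoint with the jobId
    if status_code in (429, 503):
        # Assumption: both conditions are transient, so retrying later is reasonable.
        return "retry_later"
    if status_code in (400, 404):
        return "fix_request"   # malformed body, wrong dekToken, or unknown pdfFileId
    return "unexpected"
```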

Sync example:

curl -X POST "http://localhost:9982/v1/jobs/detection" \
-H "Content-Type: application/json" \
-d "{\"processingMode\": \"sync\", \"pdfFileId\": \"$FILE_ID\", \"dekToken\": \"$DEK_TOKEN\"}"

Async example:

curl -X POST "http://localhost:9982/v1/jobs/detection" \
-H "Content-Type: application/json" \
-d "{\"processingMode\": \"async\", \"pdfFileId\": \"$FILE_ID\", \"dekToken\": \"$DEK_TOKEN\"}"
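The same request can be prepared from Python using only the standard library. The sketch below mirrors the cURL examples; the function name build_detection_request and its parameter names are illustrative, and it only builds the request object without sending it.

```python
import json
import urllib.request
from typing import Any, Optional


def build_detection_request(
    base_url: str,
    file_id: str,
    dek_token: str,
    processing_mode: str = "sync",
    detection_configuration: Optional[dict[str, Any]] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/jobs/detection request."""
    if processing_mode not in ("sync", "async"):
        raise ValueError("processingMode must be 'sync' or 'async'")

    body: dict[str, Any] = {
        "processingMode": processing_mode,
        "pdfFileId": file_id,
        "dekToken": dek_token,
    }
    # detectionConfiguration is optional; omit it to use the built-in defaults.
    if detection_configuration is not None:
        body["detectionConfiguration"] = detection_configuration

    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/jobs/detection",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen sends it. Note that urlopen raises urllib.error.HTTPError for 4xx and 5xx responses, so the error codes in the table above arrive as exceptions rather than as return values.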

Get the result of a detection job

GET /v1/jobs/detection/{jobId}/result

Returns the current state of an async detection job. Poll this endpoint until jobStatus is finished or error. The response uses the same shape whether the job is still in progress or has completed.

Path parameters:

| Name | Type | Description |
|------|------|-------------|
| jobId | string (UUID) | The jobId returned by the start-job request. |

Response: DetectionResponse.

Status codes:

| Code | Meaning |
|------|---------|
| 200 | The job has finished. The response contains the full result. |
| 202 | The job is still running. Poll again. |
| 400 | The request was malformed. |
| 404 | No job with the given jobId exists. |

Example:

curl "http://localhost:9982/v1/jobs/detection/$JOB_ID/result"
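A polling loop around this endpoint could look like the sketch below. The function wait_for_detection and its parameters are hypothetical; the HTTP call is injected as fetch_result so the loop stays independent of any particular HTTP client, and the interval and timeout defaults are arbitrary.

```python
import time
from typing import Any, Callable


def wait_for_detection(
    fetch_result: Callable[[], dict[str, Any]],
    interval_s: float = 1.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict[str, Any]:
    """Call fetch_result until jobStatus is finished or error, then return the response."""
    deadline = clock() + timeout_s
    while True:
        response = fetch_result()
        # finished and error are the two terminal states of JobStatus.
        if response.get("jobStatus") in ("finished", "error"):
            return response
        if clock() >= deadline:
            raise TimeoutError("detection job did not reach a terminal state in time")
        sleep(interval_s)
```

Here fetch_result would wrap a GET on /v1/jobs/detection/{jobId}/result and return the parsed JSON body. The caller still has to inspect jobStatus on the returned response, because a job that ended in error is returned rather than raised.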

Schemas

DetectionRequest schema

| Field | Type | Description |
|-------|------|-------------|
| processingMode | ProcessingMode | Whether the request blocks until the job completes (sync) or returns immediately for polling (async). |
| pdfFileId | string (UUID) | Identifier of the previously uploaded PDF, returned by an upload endpoint. |
| dekToken | string | DEK token returned alongside the pdfFileId at upload time. |
| detectionConfiguration | DetectionConfiguration | Optional. Overrides the built-in detection defaults for this request. When omitted, the built-in defaults are used. |

DetectionResponse schema

| Field | Type | Description |
|-------|------|-------------|
| jobId | string (UUID) | Identifier of the job. Use it to poll the result endpoint. |
| jobType | JobType | Always detection for this endpoint. |
| jobStatus | JobStatus | Current state of the job. |
| error | ApiErrorResponse | Set when jobStatus is error. Otherwise omitted. |
| outputFiles | array of FileResult | References to files produced by the job. For detection, contains one entry pointing at the FDF file with the detected entities as visual annotations. Pass fileId and dekToken as fdfFileId and fdfDekToken on a redaction request. Empty until the job is finished. |
| result | DetectionResult | Detected entities returned inline. The same set is encoded in the FDF file referenced by outputFiles. Empty until the job is finished. |

DetectionResult schema

| Field | Type | Description |
|-------|------|-------------|
| redactions | array of RedactionEntityDto | Detected entities. The same set is encoded in the FDF file in outputFiles. |

RedactionEntityDto schema

| Field | Type | Description |
|-------|------|-------------|
| pageIndex | integer | Zero-based page index where the entity appears. |
| text | string | The matched text. Required. |
| label | string | The entity type, for example EMAIL_ADDRESS or PERSON. Required. |
| score | number | Confidence score between 0 and 1. |
| quadrilaterals | array of QuadrilateralDto | On-page bounding regions for the entity. |
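As an illustration of working with this shape on the client side, the sketch below filters a redactions array by score and counts the remaining entities per label. Both helper names are hypothetical, and keeping entities that carry no score is an assumption made here so that nothing is dropped silently.

```python
from collections import Counter
from typing import Any


def filter_by_score(redactions: list[dict[str, Any]], min_score: float) -> list[dict[str, Any]]:
    """Keep entities whose score is at least min_score."""
    kept = []
    for entity in redactions:
        score = entity.get("score")
        # Assumption: an entity without a score is kept rather than discarded.
        if score is None or score >= min_score:
            kept.append(entity)
    return kept


def count_by_label(redactions: list[dict[str, Any]]) -> dict[str, int]:
    """Count entities per label; label is a required field."""
    return dict(Counter(entity["label"] for entity in redactions))
```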

QuadrilateralDto schema

All four corners are required. Quadrilaterals can describe rotated or skewed text regions, not only axis-aligned rectangles.

| Field | Type | Description |
|-------|------|-------------|
| bottomLeft | PointDto | Bottom-left corner. |
| bottomRight | PointDto | Bottom-right corner. |
| topRight | PointDto | Top-right corner. |
| topLeft | PointDto | Top-left corner. |

PointDto schema

| Field | Type | Description |
|-------|------|-------------|
| x | number | X coordinate in PDF user-space points. |
| y | number | Y coordinate in PDF user-space points. |
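Because a quadrilateral may be rotated or skewed, a consumer that needs a plain rectangle (for example, to highlight the region in a viewer) has to derive it from all four corners. A minimal sketch follows; the helper name bounding_box is hypothetical, and the result is expressed in the same PDF user-space points as the input.

```python
from typing import Any


def bounding_box(quad: dict[str, Any]) -> tuple[float, float, float, float]:
    """Return (min_x, min_y, max_x, max_y) of the axis-aligned box enclosing a quadrilateral."""
    corners = [quad[name] for name in ("bottomLeft", "bottomRight", "topRight", "topLeft")]
    xs = [corner["x"] for corner in corners]
    ys = [corner["y"] for corner in corners]
    return (min(xs), min(ys), max(xs), max(ys))
```

For rotated text, the enclosing box is larger than the quadrilateral itself, so it is suitable for display but may cover more than the detected text.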

FileResult schema

| Field | Type | Description |
|-------|------|-------------|
| fileId | string (UUID) | Identifier of the result file. |
| fileCode | string | Role of the file in the job result. |
| size | integer | File size in bytes. |
| dekToken | string | DEK token required to download or use the result file. |
| dekTokenExpiresAt | string (UTC date-time) | Expiration of the DEK token. |
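Since the DEK token expires, a client may want to check dekTokenExpiresAt before reusing a FileResult in a later request. The sketch below assumes the value is an ISO 8601 timestamp such as 2025-01-01T12:00:00Z; the exact serialization is not stated in this section, and the helper name dek_token_expired is hypothetical.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def dek_token_expired(file_result: dict[str, Any], now: Optional[datetime] = None) -> bool:
    """Return True if the file's DEK token has expired (assumes an ISO 8601 timestamp)."""
    raw = file_result["dekTokenExpiresAt"]
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    expires_at = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    if expires_at.tzinfo is None:
        # The schema describes the value as UTC, so treat naive timestamps as UTC.
        expires_at = expires_at.replace(tzinfo=timezone.utc)
    current = now if now is not None else datetime.now(timezone.utc)
    return current >= expires_at
```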

JobStatus enum

| Value | Description |
|-------|-------------|
| inProgress | The job is running. |
| finished | The job completed successfully. The result is available in outputFiles and result. |
| error | The job failed. The cause is available in error. |

JobType enum

| Value | Description |
|-------|-------------|
| detection | A detection job. |
| redaction | A redaction job. |

ProcessingMode enum

| Value | Description |
|-------|-------------|
| sync | The request blocks until the job completes. |
| async | The request returns immediately with a jobId. Poll the result endpoint until jobStatus is finished or error. |