[ObjectsExtractionParams] INI file section
The [ObjectsExtractionParams] INI file section controls how the Pdftools OCR Service extracts, filters, and detects visual objects and text elements in scanned images.
Common settings
FastObjectsExtraction
Key | Type | Default |
---|---|---|
FastObjectsExtraction | Boolean | false |
If set to true, object extraction is faster, but quality may deteriorate.
ProhibitColorImage
Key | Type | Default |
---|---|---|
ProhibitColorImage | Boolean | false |
If set to true, the Pdftools OCR Service uses only a black-and-white plane during object extraction.
Detection quality for colored tables and images may be reduced.
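As an illustration of the common settings, the fragment below sketches a configuration that favors speed over quality. It assumes that keys are written as `Key = value` pairs under the section header, that Boolean values are written as lowercase `true` and `false`, and that `;` starts a comment; none of these details is confirmed by this page, so check them against your own INI file.

```ini
; Hypothetical speed-oriented profile (syntax details are assumptions)
[ObjectsExtractionParams]
; Faster extraction at the cost of quality
FastObjectsExtraction = true
; Use only the black-and-white plane; colored tables and images
; may be detected less reliably
ProhibitColorImage = true
```

Both keys trade detection quality for processing effort, so a profile like this is best suited to clean, mostly monochrome scans.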
Removing objects
RemoveGarbage
Key | Type | Default |
---|---|---|
RemoveGarbage | Boolean | false |
Specifies whether to remove “garbage” (for example, small dots below a certain size) from the image during object extraction.
RemoveTexture
Key | Type | Default |
---|---|---|
RemoveTexture | Boolean | true |
If set to true, the Pdftools OCR Service removes background texture noise from a temporary image used for recognition.
The source image itself remains unchanged.
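The two removal settings can be combined. The sketch below enables garbage removal and states the texture default explicitly. The `Key = value` layout, the lowercase Boolean literals, and the `;` comment marker are assumptions, not confirmed syntax.

```ini
; Hypothetical cleanup profile for noisy scans (syntax details are assumptions)
[ObjectsExtractionParams]
; Drop small dots and similar specks during object extraction
RemoveGarbage = true
; Already the default; listed here only to make the intent explicit
RemoveTexture = true
```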
Detecting objects
DetectMatrixPrinter
Key | Type | Default |
---|---|---|
DetectMatrixPrinter | Boolean | true |
If set to true, text printed with a dot-matrix printer is detected during object extraction.
DetectPorousText
Key | Type | Default |
---|---|---|
DetectPorousText | Boolean | true |
If set to true, regions with porous text are detected during object extraction.
DetectTextOnPictures
Key | Type | Default |
---|---|---|
DetectTextOnPictures | Boolean | false |
If set to true, the Pdftools OCR Service detects all text on an image, including text embedded in pictures. The reading order is not changed, which keeps the result suitable for later full-text search.
EnableAggressiveTextExtraction
Key | Type | Default |
---|---|---|
EnableAggressiveTextExtraction | Boolean | false |
If set to true, the Pdftools OCR Service attempts to extract as much text as possible, even from low-quality images.
This setting is recommended when the input contains degraded or faint text.
The EnableAggressiveTextExtraction mode may lead to pictures being misinterpreted as text or to horizontal text being ordered vertically.
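For degraded input, a configuration along the following lines might be a starting point. It pairs aggressive extraction with detection of text inside pictures; whether that pairing helps depends on the documents, and the syntax shown (`Key = value`, lowercase Booleans, `;` comments) is an assumption.

```ini
; Hypothetical profile for faint or degraded scans (syntax details are assumptions)
[ObjectsExtractionParams]
; Extract as much text as possible; pictures may be misread as text
EnableAggressiveTextExtraction = true
; Also pick up text embedded in pictures, for full-text search
DetectTextOnPictures = true
```

Because aggressive extraction can produce false text regions, it is worth comparing results with and without it on a sample of the actual input.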
ProhibitDottedSeparators
Key | Type | Default |
---|---|---|
ProhibitDottedSeparators | Boolean | false |
If set to true, the Pdftools OCR Service assumes that the document does not contain dotted separators.
This can be useful if you’re certain the document lacks dotted separators or if some content is mistakenly identified as a dotted separator.
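If dotted lines in a document are being mistaken for separators, the setting could be switched on as in this sketch. The syntax is assumed, as is the use of `;` for comments.

```ini
; Hypothetical fix for false dotted-separator detection (syntax details are assumptions)
[ObjectsExtractionParams]
; Tell the service to treat the document as having no dotted separators
ProhibitDottedSeparators = true
```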