Version: 1.0.0

[ObjectsExtractionParams] INI file section

The [ObjectsExtractionParams] INI file section controls how the Pdftools OCR Service extracts, filters, and detects various visual objects and text elements from scanned images.


Common settings

FastObjectsExtraction

| Key | Type | Default |
| --- | --- | --- |
| FastObjectsExtraction | Boolean | false |

If set to true, object extraction runs faster, but extraction quality may deteriorate.


ProhibitColorImage

| Key | Type | Default |
| --- | --- | --- |
| ProhibitColorImage | Boolean | false |

If set to true, the Pdftools OCR Service will use only a black-and-white plane during object extraction.
Detection quality for colored tables and images may be reduced.
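As a rough illustration of the common settings, the fragment below trades quality for speed on a batch of plain black-and-white scans. It is only a sketch: the `Key=Value` syntax and the `;` comment lines are assumed from general INI conventions, while the section name, key names, and values are taken from the descriptions above.

```ini
; Sketch only: Key=Value syntax and ";" comments are assumptions.
[ObjectsExtractionParams]
; Faster object extraction; quality may deteriorate.
FastObjectsExtraction=true
; Use only the black-and-white plane; colored tables and images
; may be detected less reliably.
ProhibitColorImage=true
```

Both keys default to false, so omitting them keeps the slower, color-aware extraction.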


Removing objects

RemoveGarbage

| Key | Type | Default |
| --- | --- | --- |
| RemoveGarbage | Boolean | false |

Specifies whether to remove “garbage” (for example, small dots below a certain size) from the image during object extraction.


RemoveTexture

| Key | Type | Default |
| --- | --- | --- |
| RemoveTexture | Boolean | true |

If set to true, the Pdftools OCR Service removes background texture noise from a temporary image used for recognition.
The source image itself remains unchanged.
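For noisy scans, the two removal settings might be combined as in the following sketch. The `Key=Value` syntax and `;` comments are assumed from general INI conventions; only the key names and their defaults come from this page.

```ini
; Sketch only: Key=Value syntax and ";" comments are assumptions.
[ObjectsExtractionParams]
; Remove small specks (off by default).
RemoveGarbage=true
; Remove background texture from the temporary recognition image
; (on by default; listed here only to make the choice explicit).
RemoveTexture=true
```

Because RemoveTexture works on a temporary image, neither setting is described as modifying the source image.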


Detecting objects

DetectMatrixPrinter

| Key | Type | Default |
| --- | --- | --- |
| DetectMatrixPrinter | Boolean | true |

If set to true, text printed on a dot matrix printer is detected during object extraction.


DetectPorousText

| Key | Type | Default |
| --- | --- | --- |
| DetectPorousText | Boolean | true |

If set to true, regions with porous text are detected during object extraction.


DetectTextOnPictures

| Key | Type | Default |
| --- | --- | --- |
| DetectTextOnPictures | Boolean | false |

If set to true, the Pdftools OCR Service detects all text on an image, including text embedded in pictures. The reading order is not changed, so the additional text is available for full-text search later.
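When the goal is a searchable archive in which text inside photos and diagrams should also be found, the setting could be enabled as sketched below. The `Key=Value` syntax and `;` comment are assumed from general INI conventions.

```ini
; Sketch only: Key=Value syntax and ";" comments are assumptions.
[ObjectsExtractionParams]
; Also detect text embedded in pictures (off by default).
DetectTextOnPictures=true
```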


EnableAggressiveTextExtraction

| Key | Type | Default |
| --- | --- | --- |
| EnableAggressiveTextExtraction | Boolean | false |

If set to true, the Pdftools OCR Service will attempt to extract as much text as possible, even from low-quality images.
Recommended when the input contains degraded or faint text.

Warning: The EnableAggressiveTextExtraction mode may lead to misinterpreting pictures as text or misordering horizontal text vertically.
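A configuration for degraded or faint input might look like the sketch below. The `Key=Value` syntax and `;` comments are assumed from general INI conventions; the key name and its trade-off are taken from the description above.

```ini
; Sketch only: Key=Value syntax and ";" comments are assumptions.
[ObjectsExtractionParams]
; Extract as much text as possible from low-quality images.
; Side effect: pictures may be misinterpreted as text.
EnableAggressiveTextExtraction=true
```

Given the warning, it seems prudent to enable this mode only for input that actually contains degraded text and to leave it at its default of false otherwise.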


ProhibitDottedSeparators

| Key | Type | Default |
| --- | --- | --- |
| ProhibitDottedSeparators | Boolean | false |

If set to true, the Pdftools OCR Service assumes that the document does not contain dotted separators.
This can be useful if you’re certain the document lacks dotted separators or if some content is mistakenly identified as one.
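If dotted or dashed content is being mistaken for separators, the setting could be switched on as in this sketch. The `Key=Value` syntax and `;` comments are assumed from general INI conventions.

```ini
; Sketch only: Key=Value syntax and ";" comments are assumptions.
[ObjectsExtractionParams]
; Treat the document as containing no dotted separators
; (off by default).
ProhibitDottedSeparators=true
```

If the document does contain genuine dotted separators, leaving the key at its default of false is the safer choice.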