[PageAnalysisParams] INI file section
The [PageAnalysisParams] INI file section defines parameters that control how the Pdftools OCR Service analyzes page content during layout analysis, including the detection of text, tables, images, barcodes, and layout structures.
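As an orientation, the following sketch shows the section with every key set to its documented default. The `Key=value` spelling, the lowercase `true`/`false` literals, and the `;` comment line are assumptions about the INI dialect. Only the key names and default values are taken from this page.

```ini
; Sketch only: each value is the documented default; the syntax details are assumed.
[PageAnalysisParams]
DetectText=true
EnableTextExtractionMode=false
DetectTables=true
AggressiveTableDetection=false
DetectBarcodes=false
DetectSeparators=true
DetectPictures=true
DetectVectorGraphics=true
DetectMultipleBusinessCards=false
NoShadowsMode=false
DetectVerticalEuropeanText=false
ProhibitCJKColumns=false
ProhibitDoublePageMode=false
ProhibitModelAnalysis=false
```

Because these are the defaults, this section should behave the same as one that sets no keys. In practice, list only the keys you want to change.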
DetectText
Key | Type | Default |
---|---|---|
DetectText | Boolean | true |
If this property is true, text areas are detected during layout analysis.
EnableTextExtractionMode
Key | Type | Default |
---|---|---|
EnableTextExtractionMode | Boolean | false |
When set to true, the Pdftools OCR Service assumes that text blocks can be located anywhere on the page.
Isolated text blocks are detected during layout analysis. Tables are not detected.
Model analysis is not performed, as if the ProhibitModelAnalysis property were set to true.
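A minimal sketch of a configuration for pages with scattered text fragments, where finding every isolated text block matters more than reconstructing tables. The `Key=value` syntax and the `;` comment lines are assumptions about the INI dialect.

```ini
; Sketch only: the syntax details are assumed, the key name is from this page.
[PageAnalysisParams]
; Text blocks may appear anywhere; tables are not detected in this mode.
EnableTextExtractionMode=true
```

Setting ProhibitModelAnalysis alongside this key should be redundant, because model analysis is skipped in this mode.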
DetectTables
Key | Type | Default |
---|---|---|
DetectTables | Boolean | true |
If this property is true, tables are detected during layout analysis.
AggressiveTableDetection
Key | Type | Default |
---|---|---|
AggressiveTableDetection | Boolean | false |
This property controls the table detection mode. If you set it to true, the Pdftools OCR Service tries to find as many tables as possible on the page.
This setting is recommended only for documents that contain many tables.
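For table-heavy input, the setting described above might be written as in the following sketch. DetectTables is spelled out at its default value for readability only. This page does not state how the two keys interact, and the `Key=value` syntax and `;` comments are assumptions about the INI dialect.

```ini
; Sketch only: the syntax details are assumed, the key names are from this page.
[PageAnalysisParams]
; Default value, listed explicitly for readability.
DetectTables=true
; Try to find as many tables as possible on each page.
AggressiveTableDetection=true
```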
DetectBarcodes
Key | Type | Default |
---|---|---|
DetectBarcodes | Boolean | false |
Specifies whether barcodes are detected and barcode blocks are created during layout analysis.
If this property is false, barcodes may be detected as blocks of another type (for example, pictures).
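Because barcode detection is off by default, documents that carry barcodes need it enabled explicitly. A minimal sketch, assuming `Key=value` syntax and `;` comments are accepted:

```ini
; Sketch only: the syntax details are assumed, the key name is from this page.
[PageAnalysisParams]
; Create barcode blocks instead of letting barcodes fall into other block types.
DetectBarcodes=true
```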
DetectSeparators
Key | Type | Default |
---|---|---|
DetectSeparators | Boolean | true |
If this property is true, separators are detected during layout analysis.
DetectPictures
Key | Type | Default |
---|---|---|
DetectPictures | Boolean | true |
If this property is true, pictures are detected during layout analysis.
DetectVectorGraphics
Key | Type | Default |
---|---|---|
DetectVectorGraphics | Boolean | true |
If this property is true, vector pictures are detected during layout analysis.
Vector picture blocks may appear in the layout only if this property was set to true during layout analysis.
Additional settings
DetectMultipleBusinessCards
Key | Type | Default |
---|---|---|
DetectMultipleBusinessCards | Boolean | false |
Specifies whether a page being processed can contain several business cards.
NoShadowsMode
Key | Type | Default |
---|---|---|
NoShadowsMode | Boolean | false |
When set to true, the Pdftools OCR Service presumes that an image has no shadows from scanning.
DetectVerticalEuropeanText
Key | Type | Default |
---|---|---|
DetectVerticalEuropeanText | Boolean | false |
When set to true, the Pdftools OCR Service looks for vertically oriented text.
This property applies to all languages other than CJK languages.
For CJK languages, vertical text detection is managed by the ProhibitCJKColumns property.
ProhibitCJKColumns
Key | Type | Default |
---|---|---|
ProhibitCJKColumns | Boolean | false |
Text in CJK languages can be written vertically as well as horizontally.
Setting this property to true makes the Pdftools OCR Service ignore the possibility of vertical text and recognize the image on the assumption that all text is arranged horizontally.
This property applies only to CJK languages.
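The two vertical-text keys work in opposite directions: one enables vertical text detection for non-CJK languages, the other disables it for CJK languages. The sketch below sets both, which might suit a mixed batch where European documents contain vertical text and CJK documents are known to be horizontal only. That scenario, the `Key=value` syntax, and the `;` comments are assumptions for illustration.

```ini
; Sketch only: the syntax details and the scenario are assumed.
[PageAnalysisParams]
; Non-CJK languages: look for vertically oriented text.
DetectVerticalEuropeanText=true
; CJK languages: assume all text is arranged horizontally.
ProhibitCJKColumns=true
```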
ProhibitDoublePageMode
Key | Type | Default |
---|---|---|
ProhibitDoublePageMode | Boolean | false |
When set to true, the Pdftools OCR Service presumes that an image is not a double-page book.
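NoShadowsMode and ProhibitDoublePageMode both tell the service what it may presume about the input image. For input known to be single pages without scanning shadows, they could be combined as in this sketch. The scenario is an assumption for illustration, as are the `Key=value` syntax and the `;` comments.

```ini
; Sketch only: the syntax details and the scenario are assumed.
[PageAnalysisParams]
; The image has no shadows from scanning.
NoShadowsMode=true
; The image is not a double-page book.
ProhibitDoublePageMode=true
```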
ProhibitModelAnalysis
Key | Type | Default |
---|---|---|
ProhibitModelAnalysis | Boolean | false |
If this property is false, typical page layout variants are evaluated during page analysis, and the best variant is selected to improve recognition quality.
If the best variant cannot be selected, standard page layout analysis is performed.
If EnableTextExtractionMode is set to true, this property is ignored and model analysis is not performed.
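To skip model analysis while leaving the other detection settings at their defaults, a section along these lines should be enough. The `Key=value` syntax and the `;` comment are assumptions about the INI dialect.

```ini
; Sketch only: the syntax details are assumed, the key name is from this page.
[PageAnalysisParams]
; Do not evaluate typical layout variants during page analysis.
ProhibitModelAnalysis=true
```

Unlike EnableTextExtractionMode, this key alone is not described as disabling table detection.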