[PageAnalysisParams] INI file section
The [PageAnalysisParams] INI file section defines parameters controlling how the Pdftools OCR Service analyzes page content during layout analysis, including detecting text, tables, images, barcodes, and layout structures.
DetectText
| Key | Type | Default |
|---|---|---|
| DetectText | Boolean | true |
If this property is true, text areas are detected during layout analysis.
EnableTextExtractionMode
| Key | Type | Default |
|---|---|---|
| EnableTextExtractionMode | Boolean | false |
When set to true, the Pdftools OCR Service assumes that text blocks can be located anywhere on the page.
Isolated text blocks are detected during layout analysis. Tables are not detected.
Model analysis is not performed, as if the ProhibitModelAnalysis property were set to true.
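As a minimal sketch, a configuration for pages with scattered text blocks might enable this mode as shown below. The `key = value` syntax and the `;` comment lines are assumed INI conventions and are not confirmed by this reference.

```ini
[PageAnalysisParams]
; Assumed syntax: key = value, with ";" starting a comment line.
; Text blocks may be anywhere on the page. Tables are not detected
; and model analysis is skipped.
EnableTextExtractionMode = true
```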
DetectTables
| Key | Type | Default |
|---|---|---|
| DetectTables | Boolean | true |
If this property is true, tables are detected during layout analysis.
AggressiveTableDetection
| Key | Type | Default |
|---|---|---|
| AggressiveTableDetection | Boolean | false |
This property controls the table detection mode. If you set it to true, the Pdftools OCR Service tries to find as many tables as possible on the page.
This setting is recommended only for documents that contain many tables.
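For a table-heavy batch, the section could look like the following sketch. `DetectTables` is left at its default of true and written out only for clarity. The `key = value` syntax and `;` comments are assumptions about the INI dialect.

```ini
[PageAnalysisParams]
; Assumed syntax: key = value, with ";" starting a comment line.
; Default value, stated explicitly.
DetectTables = true
; Try to find as many tables as possible on each page.
AggressiveTableDetection = true
```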
DetectBarcodes
| Key | Type | Default |
|---|---|---|
| DetectBarcodes | Boolean | false |
Specifies whether barcodes are detected and barcode blocks are created during layout analysis.
If this property is false, barcodes may be detected as blocks of some other type (for example, pictures).
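Because barcode detection is off by default, documents that carry barcodes need it enabled explicitly. A sketch, assuming the usual `key = value` INI syntax and `;` comments:

```ini
[PageAnalysisParams]
; Assumed syntax: key = value, with ";" starting a comment line.
; Create barcode blocks instead of letting barcodes be classified
; as another block type such as pictures.
DetectBarcodes = true
```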
DetectSeparators
| Key | Type | Default |
|---|---|---|
| DetectSeparators | Boolean | true |
If this property is true, separators are detected during layout analysis.
DetectPictures
| Key | Type | Default |
|---|---|---|
| DetectPictures | Boolean | true |
If this property is true, pictures are detected during layout analysis.
DetectVectorGraphics
| Key | Type | Default |
|---|---|---|
| DetectVectorGraphics | Boolean | true |
If this property is true, vector pictures are detected during layout analysis.
Vector picture blocks may appear in the layout only if this property was set to true during layout analysis.
Additional settings
DetectMultipleBusinessCards
| Key | Type | Default |
|---|---|---|
| DetectMultipleBusinessCards | Boolean | false |
Specifies whether a page being processed can contain several business cards.
NoShadowsMode
| Key | Type | Default |
|---|---|---|
| NoShadowsMode | Boolean | false |
When set to true, the Pdftools OCR Service presumes that an image has no shadows from scanning.
DetectVerticalEuropeanText
| Key | Type | Default |
|---|---|---|
| DetectVerticalEuropeanText | Boolean | false |
When set to true, the Pdftools OCR Service looks for vertically oriented text.
This property applies to all languages except CJK languages.
For CJK languages, vertical text detection is managed by the ProhibitCJKColumns property.
ProhibitCJKColumns
| Key | Type | Default |
|---|---|---|
| ProhibitCJKColumns | Boolean | false |
Text in CJK languages can be written vertically as well as horizontally.
Setting this property to true makes the Pdftools OCR Service ignore the possibility of vertical text and recognize the image on the assumption that all text is arranged horizontally.
This property is valid only for CJK languages.
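The two vertical-text keys affect different language groups, so they can be set independently. The sketch below enables vertical text detection for non-CJK languages and forces horizontal-only recognition for CJK languages. The `key = value` syntax and `;` comments are assumed.

```ini
[PageAnalysisParams]
; Assumed syntax: key = value, with ";" starting a comment line.
; Non-CJK languages: also look for vertically oriented text.
DetectVerticalEuropeanText = true
; CJK languages: treat all text as horizontal.
ProhibitCJKColumns = true
```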
ProhibitDoublePageMode
| Key | Type | Default |
|---|---|---|
| ProhibitDoublePageMode | Boolean | false |
When set to true, the Pdftools OCR Service presumes that an image does not show a double-page book spread.
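For input known to consist of clean single-page images, the two presumptions can be combined as in this sketch. It assumes the usual `key = value` INI syntax and `;` comments, and whether it helps depends on the actual input.

```ini
[PageAnalysisParams]
; Assumed syntax: key = value, with ";" starting a comment line.
; Images are presumed to have no scanning shadows.
NoShadowsMode = true
; Images are presumed not to be double-page book scans.
ProhibitDoublePageMode = true
```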
ProhibitModelAnalysis
| Key | Type | Default |
|---|---|---|
| ProhibitModelAnalysis | Boolean | false |
If this property is false, typical variants of page layout will be evaluated during page analysis, and the best variant will be selected to improve recognition quality.
If the best variant cannot be selected, standard page layout analysis will be performed.
If EnableTextExtractionMode is set to true, this property is ignored and model analysis is not performed.
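To turn off the evaluation of typical layout variants without enabling text extraction mode, the key can be set on its own, as in this sketch (assumed `key = value` syntax and `;` comments). Presumably only the standard page layout analysis then runs, although this reference does not state that explicitly.

```ini
[PageAnalysisParams]
; Assumed syntax: key = value, with ";" starting a comment line.
; Do not evaluate typical page layout variants.
ProhibitModelAnalysis = true
; Default value, stated explicitly: text extraction mode stays off.
EnableTextExtractionMode = false
```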