Version: 1.0.0

[SynthesisParamsForPage] INI file section

The [SynthesisParamsForPage] INI file section defines page-level synthesis settings for paragraph structure, font formatting, and color detection.


Main settings

ParagraphExtractionMode

Key | Type | Default
DetectDocumentStructure | ParagraphExtractionModeEnum | PEM_NormalExtraction

Specifies the mode of paragraph extraction.

ParagraphExtractionModeEnum

  • PEM_NormalExtraction: Normal paragraph extraction.
  • PEM_RoughExtraction: Extracts the minimal number of paragraphs (either one paragraph per block or only paragraphs that start with a dropped capital).
  • PEM_SingleLineParagraphsWithSpaceFormatting: Each line is extracted to a separate paragraph formatted with spaces. This constant is deprecated and will be removed in a future version.
  • PEM_SingleLineParagraphsWithWordSeparationOnly: Each line is extracted to a separate paragraph without space formatting, and blank spaces are used to separate words only.
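
As an illustration, a configuration that switches to rough extraction might look like the fragment below. This is a sketch, not a verified configuration. It assumes that the key is written under the property name ParagraphExtractionMode (the Key column for this property lists DetectDocumentStructure, so check which name your installation accepts), that values use Key=Value syntax, and that lines starting with a semicolon are treated as comments.

```ini
; Sketch only: the key name, Key=Value syntax, and ';' comments are assumptions.
[SynthesisParamsForPage]
; One paragraph per block, or only paragraphs that start with a dropped capital.
ParagraphExtractionMode=PEM_RoughExtraction
```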

DetectFontFormattingAtPageLevel

Key | Type | Default
DetectDocumentStructure | Boolean | false

If this property is set to true, font parameters are detected at the page synthesis stage. This enables detection of subscripts, superscripts, italic type, and small capital letters during page synthesis, and allows you to set additional parameters in the [FontFormattingDetectionParams] section. If this property is set to false, the [FontFormattingDetectionParams] section is ignored.

Note: With the default settings, Pdftools OCR Service detects font parameters at the document synthesis stage. If you set this property to true, you must turn off font parameter detection during document synthesis by setting the DetectFontFormatting property to false.

Detecting font parameters during page synthesis speeds up the subsequent document synthesis and reduces memory usage. However, the quality of font detection may deteriorate.
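
The fragment below sketches how page-level font detection might be switched on. It assumes that the key matches the property name DetectFontFormattingAtPageLevel, that values use Key=Value syntax, and that lines starting with a semicolon are comments. None of these are confirmed here. No keys are shown under [FontFormattingDetectionParams] because this section does not document them. The DetectFontFormatting property also has to be set to false in the section that defines it, which is not covered here.

```ini
; Sketch only: the key name, Key=Value syntax, and ';' comments are assumptions.
[SynthesisParamsForPage]
DetectFontFormattingAtPageLevel=true

; Read only when DetectFontFormattingAtPageLevel is true.
[FontFormattingDetectionParams]
; Add font formatting detection keys here (not documented in this section).

; Remember to set DetectFontFormatting to false in the section that defines it.
```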


Color settings

DetectBackgroundColor

Key | Type | Default
DetectBackgroundColor | ThreeStatePropertyValueEnum | TSPV_Auto

If this property is set to TSPV_Yes, the background color is detected during page synthesis.

ThreeStatePropertyValueEnum

  • TSPV_Auto: Automatically determine if this processing mode should be used, depending on the situation (image characteristics, etc.).
  • TSPV_No: The processing mode in question will not be used.
  • TSPV_Yes: The processing mode in question will be used.

AllowGrayBackgroundColor

Key | Type | Default
AllowGrayBackgroundColor | ThreeStatePropertyValueEnum | TSPV_Auto

If this property is set to TSPV_Yes, gray can be detected as the background color. Otherwise, the background is detected as either black or white. This property is taken into account only if the DetectBackgroundColor property is set to TSPV_Yes or TSPV_Auto.


DetectTextColor

Key | Type | Default
DetectTextColor | ThreeStatePropertyValueEnum | TSPV_Auto

If this property is set to TSPV_Yes, the text color is detected during page synthesis.


CorrectDynamicRange

Key | Type | Default
CorrectDynamicRange | ThreeStatePropertyValueEnum | TSPV_Auto

If this property is set to TSPV_Yes, image colors are corrected so that the background is white and the text is black, or vice versa. This improves image quality but slows down recognition.

We recommend using this property only if the DetectBackgroundColor and DetectTextColor properties are set to TSPV_Yes or TSPV_Auto.
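
The color settings can be combined as in the following sketch, which turns on background and text color detection explicitly and then enables dynamic range correction alongside them. The key names come from the Key column of each setting. The Key=Value syntax and the semicolon comment lines are assumptions and may need adjusting for your installation.

```ini
; Sketch only: Key=Value syntax and ';' comments are assumptions.
[SynthesisParamsForPage]
DetectBackgroundColor=TSPV_Yes
; Taken into account only because DetectBackgroundColor is TSPV_Yes (or TSPV_Auto).
AllowGrayBackgroundColor=TSPV_Yes
DetectTextColor=TSPV_Yes
; Recommended only together with the two detection settings above; slows down recognition.
CorrectDynamicRange=TSPV_Yes
```

Leaving all four keys at TSPV_Auto, the documented default, lets the service decide per image instead.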