3-Heights™ OCR Enterprise Add-On – plug optical character recognition into a PDF tool

The 3-Heights™ OCR Enterprise Add-On complements several PDF Tools AG products with a high-performance optical character recognition (OCR) function. It lets you convert images such as TIFF or JPEG to PDF or PDF/A, or convert PDF to PDF/A, and apply OCR in the same step.

Customers are free to choose the OCR engine they want to use. At this time, ABBYY FineReader is available under several licensing models, so a suitable model can be selected according to the requirements for recognition rate, throughput and cost.

PDF OCR Enterprise - functions

  • General functions
    • Add OCR text information to PDF documents
    • Set the language of the OCR text to increase the recognition rate
    • Direct access to the OCR engine or synchronized use via a service
    • Recognition of multi-language documents
  • ABBYY-related functions
    • Recognition of almost 200 languages with machine-generated content
    • Extended support of almost 50 languages with dictionaries and morphological tools
    • Recognition of typewriter scripts
    • Recognition and decoding of barcodes (1D)
    • Recognition of type of content (images vs. texts)
    • Modules to support additional languages
      • Chinese, Japanese, Korean
      • Old European languages
    • 2D Barcode
    • Selectable recognition modes: normal, fast and balanced
    • De-Skewing: Automatic image alignment
    • Image clean-up: Unwanted artifacts are recognized and eliminated
    • Filtering of non-relevant backgrounds
    • Recognition and correction of page orientation
    • Creation and use of profiles that summarize the above features

Supported formats

Input formats

Defined by base product:

  • 3-Heights™ PDF to PDF/A Converter
    • PDF
  • 3-Heights™ Image to PDF Converter
    • TIFF (Tagged Image File Format)
    • JPEG (Joint Photographic Experts Group)
    • PNG (Portable Network Graphics)
    • GIF (Graphics Interchange Format)
    • BMP (Windows Bitmap)
    • EPS (Encapsulated PostScript)
    • JB2 (JBIG2, Joint Bi-level Image Experts Group)
    • JP2 (JPEG2000)
    • JPX (Extended JPEG2000)
    • PBM (Portable Bitmap File Format)
    • JIF (GIF Flate)
  • 3-Heights™ Document Converter
    • Microsoft Office 2003 and 2007 documents
    • Documents of older Microsoft Office versions
    • Simple Text
    • WordPerfect
    • HTML
    • Outlook (MSG)
    • PDF
    • Internet Mail Message Format
    • Image formats (TIFF, JPEG, PNG, JBIG2, JPX, GIF, BMP, etc.)
    • ZIP and TAR archives
    • Add-ins for customer specific formats

Output formats

  • PDF, PDF/A

Required base products

The 3-Heights™ OCR Enterprise Add-On can be used in conjunction with the following products:

  • 3-Heights™ PDF to PDF/A Converter
  • 3-Heights™ Image to PDF Converter
  • 3-Heights™ Document Converter

Area of use

Inbox

Recognize text while scanning incoming mail. Use the recognized text in the metadata of incoming documents and in downstream business processes such as ERP and workflow systems. Archive incoming documents directly with text recognition. Recognize text in scanned email attachments for easier processing.

Archiving

Apply text recognition when converting archives from TIFF or PDF to PDF/A. Convert proprietary formats to PDF/A and embed the recognized text. Recognize information on index pages and transfer it to the metadata of the document or dossier.

Further areas of use

  • Unpacking scanned email attachments
  • Preparing for archiving
  • Archive migration

Bayer CropScience relies on the ISO long-term archiving format PDF/A

The change to PDF/A has enabled the customer to profit from various advantages: the PDF documents, in contrast to TIFF pages, are searchable and the text can be reused through copy & paste. In addition, the tables of contents have retained their links, allowing users to quickly navigate through documents.

PDF/A conversion with OCR recognition for Volkswagen Foundation’s document management

By integrating the 3-Heights™ components, the Volkswagen Foundation standardized the different PDF variants in its DMS. In addition, different image formats can now be converted into full-text indexed PDF documents.

ABBYY

The ABBYY FineReader Engine is available for Windows in the following versions (version 8 is no longer sold):

  • ABBYY FineReader Engine 10
  • ABBYY FineReader Engine 11

The engine can be downloaded from the Resources page or from your personal account. The license key, which is required to activate the engine in evaluation or production mode, can only be downloaded from your personal download account. If you require a license key, please contact us.

Other platforms are available on request.