3-Heights® Scan to PDF Server - convert scanned documents into PDF/A

Scanning paper documents has become a daily ritual in the mail receiving room of many businesses. This task is often performed by a third-party scanning service provider. In most cases the scanned images are saved as black & white TIFF files, the format synonymous with faxes. In special cases, for example checks or identification papers with photos, the documents are scanned in color.

Caution is needed, however, because color TIFF files can quickly become extremely large. The PDF/A standard has now also established itself in incoming mail applications, especially for color scans. However, individual processing steps such as text recognition, compression and digital signing are generally not coordinated with one another or integrated into a single solution. Some scanners, for example, can create PDF/A files and also sign them, but compressing the file afterwards invalidates the digital signature and makes it worthless.
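
To give a sense of scale for the file-size concern, the following Python sketch estimates the uncompressed size of a single scanned page. The page format (A4) and the resolution (300 dpi) are assumptions chosen for illustration and do not come from the product documentation; real TIFF files are usually compressed and will be smaller.

```python
# Uncompressed size of one scanned page.
# Assumed for illustration: A4 paper (210 x 297 mm) scanned at 300 dpi.
MM_PER_INCH = 25.4


def raw_page_bytes(width_mm: float, height_mm: float, dpi: int, bits_per_pixel: int) -> int:
    """Return the uncompressed size in bytes of one scanned page."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px * height_px * bits_per_pixel // 8


bitonal = raw_page_bytes(210, 297, 300, 1)   # black & white, 1 bit per pixel
color = raw_page_bytes(210, 297, 300, 24)    # RGB, 8 bits per channel

print(f"bitonal: {bitonal / 1e6:.1f} MB, color: {color / 1e6:.1f} MB")
# prints "bitonal: 1.1 MB, color: 26.1 MB"
```

Before compression, the color page is 24 times larger than the black & white page, which is why the choice of compression matters most for color scans.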

PDF Tools AG offers a solution for creating PDF/A files from scanned documents and fax images that meets the key requirements: small file size, searchable text and embedded metadata. The following diagram illustrates the principle.

Automation

Create files from scan and sign them

Text recognition

Make scanned documents searchable (OCR)

Enterprise Application

Central service for PDF/A document creation

Audit-compliant archiving of creditor invoices at KIBAG Dienstleistungen AG

KIBAG Dienstleistungen AG had been scanning creditor invoices directly into ABACUS since 2014. The invoices were scanned in batches and archived in paper form as well. PDF Tools worked with its partner QuoVadis Trustlink Schweiz AG to find the best solution.

Centralized document capturing with Scan2SAP solution

In the past, Oerlikon set up scanning stations with special software in various departments around the world to scan and digitize a range of documents containing business-relevant information. The collection and further processing required a lot of administrative effort within the individual organizations.
Product illustration 3-Heights® Scan to PDF Server

Scan to PDF Server – Features

  • Features
    • Conversion of single page or multi-page raster images to PDF
    • Processing of subfolders
    • Flexible workflow configuration
    • Set output format and conformity level (PDF, PDF/A-1, PDF/A-2 and PDF/A-3)
    • Optical character recognition (OCR) including barcodes
    • Digital PDF signature
    • Parallel processing
  • Compression
    • Set image compression individually for different classes of images
    • Support for mixed raster content (MRC)
    • CCITT Group3 (1D and 2D)
    • CCITT Group4
    • LZW
    • JPEG
    • Deflate (ZIP)
    • JPEG2000
    • JBIG2 (lossless only)
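
Setting compression per image class could look roughly like the following Python sketch. The class names, the mapping and the function names are hypothetical and are not the product's configuration; only the codec names are taken from the list above. The sketch also accounts for one known constraint: PDF/A-1 is based on PDF 1.4, which does not include JPEG2000, so JPEG is used as a fallback there.

```python
# Hypothetical mapping from image class to codec; the product's real
# configuration keys and defaults are not shown in this text.
CODEC_BY_CLASS = {
    "bitonal": "JBIG2",      # 1 bit per pixel, e.g. text pages and faxes
    "grayscale": "JPEG",
    "color": "JPEG2000",
}


def classify(bits_per_sample: int, samples_per_pixel: int) -> str:
    """Derive a coarse image class from basic raster properties."""
    if bits_per_sample == 1 and samples_per_pixel == 1:
        return "bitonal"
    if samples_per_pixel == 1:
        return "grayscale"
    return "color"


def choose_codec(image_class: str, pdfa_part: int) -> str:
    """Pick a codec for the image class, respecting the PDF/A part."""
    codec = CODEC_BY_CLASS[image_class]
    if codec == "JPEG2000" and pdfa_part == 1:
        return "JPEG"  # JPEG2000 is not available in PDF/A-1
    return codec


print(choose_codec(classify(1, 1), pdfa_part=2))  # bitonal page
print(choose_codec(classify(8, 3), pdfa_part=1))  # color page in PDF/A-1
```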

Conformance

  • ISO 32000-1 (PDF 1.7)
  • ISO 32000-2 (PDF 2.0)
  • ISO 19005-1 (PDF/A-1)
  • ISO 19005-2 (PDF/A-2)
  • ISO 19005-3 (PDF/A-3)

Supported formats

Input Image Formats

  • JPEG
  • TIFF
  • scanned PDF

Output Formats

  • PDF 1.0 to 1.7
  • PDF 2.0
  • PDF/A-1, PDF/A-2, PDF/A-3

Areas of use - create PDF/A files from scanned documents

Paper capture

Electronic archiving of paper documents received as incoming mail within a company.

Facsimile capture

Electronic archiving of all fax transactions between the company and its business partners.

Archive migration

Migration of paper archives to an electronic archive with the standardized PDF/A format.

Web/mobile capture

Use of the central service in client/server applications via a web service.

Enterprise application integration

Use of the central service for PDF/A document creation via a programming interface (API) from specialist applications that create TIFF or JPEG files.

Distributed architecture and scalability

The 3-Heights® Scan to PDF Server is a scalable and freely configurable service. The service calls a separate program for each work stage, such as compression, OCR or conversion into PDF/A. Each program receives the result of the previous work stage as its input and makes its output available to the next work stage. The work stages are linked by means of an XML configuration file. This architecture allows the work stages to be arranged very flexibly, and the service can be extended almost without limit by adding further work stages.
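
The chaining of work stages described above can be illustrated with a minimal Python sketch. The XML element and attribute names (`workflow`, `stage`, `name`) are invented for this example and are not the product's configuration schema, and the stages are in-process placeholder functions rather than the separate programs the service actually calls.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List

# Hypothetical configuration: element and attribute names are invented.
CONFIG = """
<workflow>
  <stage name="compress"/>
  <stage name="ocr"/>
  <stage name="convert-to-pdfa"/>
</workflow>
"""

Stage = Callable[[bytes], bytes]


def make_stage(label: str) -> Stage:
    """Create a placeholder stage that tags the data so the order is visible."""
    def stage(data: bytes) -> bytes:
        return data + f"|{label}".encode()
    return stage


# Registry of available stages; "sign" exists but is not used by CONFIG.
REGISTRY: Dict[str, Stage] = {
    name: make_stage(name)
    for name in ("compress", "ocr", "convert-to-pdfa", "sign")
}


def load_pipeline(xml_text: str) -> List[Stage]:
    """Read the stage order from the XML configuration."""
    root = ET.fromstring(xml_text)
    return [REGISTRY[el.attrib["name"]] for el in root.findall("stage")]


def run(pipeline: List[Stage], data: bytes) -> bytes:
    """Feed the output of each stage into the next one."""
    for stage in pipeline:
        data = stage(data)
    return data


result = run(load_pipeline(CONFIG), b"scan")
print(result)  # prints b'scan|compress|ocr|convert-to-pdfa'
```

In this sketch, adding a work stage means registering one more function and adding one more `stage` element; the runner itself does not change.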

To increase the level of parallel processing, documents can be split into individual pages that pass through the processing stages simultaneously and are then merged back into a single document. This option can considerably improve the utilization of computer resources (processor cores, memory, input and output, OCR engine, etc.).
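
The split, process and merge pattern can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the product's implementation: a document is modeled as a list of page strings, and `process_page` is a hypothetical stand-in for the real per-page stages.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def process_page(page: str) -> str:
    """Placeholder for the per-page work (compression, OCR, ...)."""
    return page.upper()


def process_document(pages: List[str], workers: int = 4) -> List[str]:
    """Process all pages concurrently and merge them in the original order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in input order, so the merged
        # document keeps its page sequence even if pages finish out of order.
        return list(pool.map(process_page, pages))


merged = process_document(["page one", "page two", "page three"])
print(merged)  # prints ['PAGE ONE', 'PAGE TWO', 'PAGE THREE']
```

Threads suit stages that mostly wait on external programs or I/O; for CPU-bound work done in Python itself, `ProcessPoolExecutor` from the same module would be the closer analogue.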

PDF/A Advantages

  • Standardized format

PDF/A is suitable for storing both scanned and digitally created documents.

  • High compression rate

The PDF/A standard supports modern, powerful compression methods and therefore allows small file sizes even for color images.

  • Text recognition

The created PDF/A documents can be made searchable by embedding text from an OCR engine.

  • Embedded metadata

So that the document and its metadata form an inseparable whole, PDF/A embeds the metadata in the file itself. PDF/A stores it in the Extensible Metadata Platform (XMP) format, which, like PDF/A, is defined in its own ISO standard.

  • Digital signature

In order to ensure the integrity and authenticity of the created documents, a digital signature can be applied to the PDF/A document in accordance with the PAdES standard. The digital signature is a kind of electronic signature that can serve the same purpose as a handwritten signature, provided that the corresponding legal requirements (national signature laws) are met.
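
As an illustration of the embedded metadata mentioned above, the following Python sketch builds a minimal XMP fragment that carries the PDF/A identification properties `pdfaid:part` and `pdfaid:conformance`. It is a simplified sketch: the surrounding `xpacket` wrapper, further metadata properties and the embedding into a PDF file are omitted, and the function name `build_xmp` is hypothetical.

```python
import xml.etree.ElementTree as ET

X_NS = "adobe:ns:meta/"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PDFAID_NS = "http://www.aiim.org/pdfa/ns/id/"  # PDF/A identification schema


def build_xmp(part: int, conformance: str) -> str:
    """Return a minimal XMP fragment identifying a PDF/A part and level."""
    ET.register_namespace("x", X_NS)
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("pdfaid", PDFAID_NS)

    xmpmeta = ET.Element(f"{{{X_NS}}}xmpmeta")
    rdf = ET.SubElement(xmpmeta, f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(
        rdf, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": ""}
    )
    ET.SubElement(desc, f"{{{PDFAID_NS}}}part").text = str(part)
    ET.SubElement(desc, f"{{{PDFAID_NS}}}conformance").text = conformance
    return ET.tostring(xmpmeta, encoding="unicode")


xmp = build_xmp(2, "B")  # declares PDF/A-2b
print(xmp)
```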

Advantages of PDF/A over TIFF

In principle, TIFF documents can offer all these advantages too, but in several cases only through proprietary extensions, because the TIFF standard itself provides no solution.

Requirements              TIFF                            PDF/A
Long-term readability     +                               +
Clear rendering           +                               +
Data consistency          Proprietary tags for metadata   +
Authenticity / Integrity  With detached signatures        +
Required storage space    Black / White: +, Color: -      +
Searchability             Proprietary tags for OCR text   +
Long-term experience      +                               +

Usually, the individual processing stages, such as text recognition, compression, PDF/A generation and digital signing, cannot all be performed by the scanner alone, because metadata is often added afterwards at an index station. If the scanner has already signed the file, adding this metadata breaks the seal of the digital signature and makes it worthless. Here, too, separate software can offer a decisive advantage.
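
Why the order of the stages matters can be shown with a small Python sketch. It is a simplified model, not a PAdES implementation: a keyed hash (HMAC) stands in for a certificate-based signature, and `add_metadata` is a hypothetical stand-in for the index station.

```python
import hashlib
import hmac

KEY = b"demo-key"  # stand-in for a signing key; real signatures use certificates


def sign(data: bytes) -> bytes:
    """Compute a keyed digest over the exact bytes of the file."""
    return hmac.new(KEY, data, hashlib.sha256).digest()


def verify(data: bytes, signature: bytes) -> bool:
    """Check that the signature still matches the current bytes."""
    return hmac.compare_digest(sign(data), signature)


def add_metadata(data: bytes, note: str) -> bytes:
    """Hypothetical index station: changes the file by appending metadata."""
    return data + b"\n%meta:" + note.encode()


scanned = b"scanned page bytes"

# Signing first, indexing afterwards: the signed bytes no longer match.
early_signature = sign(scanned)
indexed = add_metadata(scanned, "type=invoice")
wrong_order_valid = verify(indexed, early_signature)

# Indexing first, signing last: the signature covers the final file.
final = add_metadata(scanned, "type=invoice")
right_order_valid = verify(final, sign(final))

print(wrong_order_valid, right_order_valid)  # prints "False True"
```

The same reasoning applies to compression: any stage that changes the file has to run before the signature is applied.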