Convert scanned documents to PDF/A
Text / Barcode recognition | Compression | Embed metadata | PDF/A creation | Digital signature | Validation
Scanning paper documents has become a daily routine in the mailrooms of many businesses. This task is often performed by a third-party scanning service provider. In most cases the scanned images are saved as black-and-white TIFF files, the format commonly associated with faxes. In special cases, such as checks or identification papers with photos, the documents are scanned in color.
Caution is needed, however, because color TIFF files can quickly become extremely large. The PDF/A standard has now also established itself in incoming mail applications, especially for color scans. Yet individual processing steps such as text recognition, compression and digital signing are generally neither coordinated with one another nor integrated into a single solution. Some scanners, for example, can create PDF/A files and sign them, but compressing the file afterwards invalidates the digital signature and makes it worthless.
PDF Tools AG offers a solution for creating PDF/A files from scanned documents and fax images that fulfills all the vital requirements like small file size, searchable files and embedded metadata. The following diagram illustrates the principle.
Create files from scan and sign them
Make scanned documents searchable (OCR)
Central service for PDF/A document creation
Input Image Formats
The 3-Heights™ Scan to PDF Server is a scalable and freely configurable service. The service calls a separate program for each work stage, such as compression, OCR, or conversion to PDF/A. Each program receives the result of the previous work stage as its input and makes its output available to the next work stage. The work stages are linked by means of an XML configuration file. This architecture allows the work stages to be arranged very flexibly and lets the service be extended by adding further work stages.
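The chaining of work stages through a configuration file can be illustrated with a minimal Python sketch. It is not the product's real configuration format: the XML element and attribute names (`pipeline`, `stage`, `name`) and the stage functions are hypothetical, and the functions merely stand in for the separate programs so that the order of execution is visible.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration: element and attribute names are invented for
# this sketch and are not the product's real schema.
CONFIG = """
<pipeline>
  <stage name="ocr"/>
  <stage name="compress"/>
  <stage name="pdfa"/>
</pipeline>
"""

# Stand-ins for the separate programs. Each one only appends its name to the
# data so the processing order can be seen in the result.
def ocr(data):
    return data + "+ocr"

def compress(data):
    return data + "+compress"

def pdfa(data):
    return data + "+pdfa"

STAGES = {"ocr": ocr, "compress": compress, "pdfa": pdfa}

def load_stage_names(xml_text):
    """Read the ordered list of stage names from the configuration."""
    root = ET.fromstring(xml_text.strip())
    return [stage.get("name") for stage in root.findall("stage")]

def run_pipeline(data, stage_names):
    """Feed the output of each stage into the next one."""
    for name in stage_names:
        data = STAGES[name](data)
    return data

result = run_pipeline("scan.tif", load_stage_names(CONFIG))
print(result)  # scan.tif+ocr+compress+pdfa
```

In this sketch, reordering or adding `stage` elements changes the processing chain without touching the code that runs it, which is the kind of flexibility the architecture aims for.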
To increase parallelism, documents can be split into individual pages that pass through the processing stages simultaneously and are then merged back into a single document. This option can considerably improve the utilization of computer resources (processor cores, memory, input and output, OCR engine, etc.).
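The split, process and merge idea can be sketched in a few lines of Python. This is an illustration under simple assumptions, not the server's implementation: a document is modeled as a list of page strings, and `process_page` is a hypothetical placeholder for the per-page work stages.

```python
from concurrent.futures import ThreadPoolExecutor

def process_page(page):
    # Hypothetical placeholder for the per-page stages (OCR, compression, ...).
    return page.upper()

def process_document(pages, max_workers=4):
    """Process all pages concurrently and return them in their original order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, so the merged document keeps
        # its page order even if pages finish at different times.
        return list(pool.map(process_page, pages))

merged = process_document(["page 1", "page 2", "page 3"])
print(merged)  # ['PAGE 1', 'PAGE 2', 'PAGE 3']
```

Threads are used here on the assumption that each page is handed to an external program and the worker mostly waits for it; for CPU-bound work done in Python itself, `ProcessPoolExecutor` from the same module would be the more natural choice.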
Thanks to the signature and validation service and the integration of the 3-Heights™ Scan to PDF Server, KIBAG completely replaced its manual document processing procedure with an audit-compliant electronic archiving system. The process is now more manageable and can be structured more effectively. Automation reduced the manual workload considerably, freeing up capacity for other urgent tasks.
In principle, TIFF documents offer all these advantages, but only through proprietary extensions, since the TIFF standard itself provides no solutions for them.
|Data consistency|Proprietary tags for metadata|+|
|Authenticity / Integrity|With detached signatures|+|
|Required storage space|Black / White: +|
|Searchability|Proprietary tags for OCR text|+|
Usually, the individual processing stages, such as text recognition, compression, PDF/A generation and digital signing, cannot all be performed by the scanner alone, because metadata is often added afterwards at an indexing station. If the scanner has already signed the file, adding metadata breaks the digital signature and makes it worthless. Here, too, separate software can offer a decisive advantage.
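Why the order of the work stages matters for the signature can be shown with a simplified Python model. It assumes that the signature covers every byte of the file and uses an HMAC with a hypothetical demo key as a stand-in for a real certificate-based digital signature; actual PDF signatures are more involved.

```python
import hashlib
import hmac

# Hypothetical demo key; a real digital signature uses the private key that
# belongs to a certificate, not a shared secret.
KEY = b"demo-key"

def sign(data):
    """Return a keyed digest over all bytes of the document."""
    return hmac.new(KEY, data, hashlib.sha256).digest()

def verify(data, signature):
    """Check that the signature still matches the document bytes."""
    return hmac.compare_digest(sign(data), signature)

document = b"%PDF scanned page"
signature = sign(document)
print(verify(document, signature))  # True

# Metadata added after signing changes the bytes, so the check fails.
indexed = document + b" /Title (Invoice)"
print(verify(indexed, signature))  # False

# Signing as the last work stage covers the final content.
final_signature = sign(indexed)
print(verify(indexed, final_signature))  # True
```

In this model, any stage that changes the file, whether compression or adding metadata, has to run before the signing stage.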