All features and capabilities at a glance
Short facts
Conformance
ISO 32000-1 (PDF 1.7)
ISO 32000-2 (PDF 2.0)
ISO 19005-1 (PDF/A-1)
ISO 19005-2 (PDF/A-2)
ISO 19005-3 (PDF/A-3)
Supported formats
PDF 1.0 to 1.7
PDF 2.0
PDF/A-1, PDF/A-2, PDF/A-3
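As a rough illustration of the supported version range, the sketch below reads the version from a file's header comment (for example `%PDF-1.7`) and checks it against the list above. It is not this product's API: `header_version` and `is_supported` are hypothetical helper names, and the 1024-byte search window is an assumption about how lenient a reader should be. Note that a document catalog may carry a `/Version` entry that overrides the header, which this sketch does not inspect.

```python
import re

# Matches the header comment near the start of a PDF file, e.g. b"%PDF-1.7".
_HEADER_RE = re.compile(rb"%PDF-(\d)\.(\d)")


def header_version(data: bytes):
    """Return (major, minor) from the PDF header, or None if no header is found."""
    # Assumption: tolerate leading bytes by searching only the first 1024 bytes.
    match = _HEADER_RE.search(data[:1024])
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_supported(version) -> bool:
    """True for the range listed above: PDF 1.0 to 1.7, and PDF 2.0."""
    if version is None:
        return False
    return (1, 0) <= version <= (1, 7) or version == (2, 0)
```

A caller would typically pass the first kilobyte of the file, for example `is_supported(header_version(open(path, "rb").read(1024)))`.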
Features
Extract text
Configure word boundary detection for word-by-word extraction
Retrieve text attributes such as position, font, and font size
Automatically apply correct character decoding and produce Unicode output
Extract raw character codes
Extract graphics objects (paths)
Extract as strings that contain PDF graphics operators
Convert extracted paths to images
Extract and store images
Retrieve image attributes such as compression format, position, and transparency masks
Extract and store transparency masks
Extract and store alternate images
Extract PDF document-level information
Page count
PDF version
Page labels
Creation and modification date
Document information such as title, author, subject, and more
Outlines (bookmarks), including destinations
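Creation and modification dates are stored in PDF files as date strings of the form `D:YYYYMMDDHHmmSSOHH'mm'`, where everything after the year is optional. The sketch below shows one way such a string could be converted to a Python `datetime` after extraction. It is a minimal example, not this product's API: `parse_pdf_date` is a hypothetical helper name, and the choice to return a naive `datetime` when no time zone is given is an assumption.

```python
import re
from datetime import datetime, timedelta, timezone

# D:YYYYMMDDHHmmSSOHH'mm' with every field after the year optional.
_DATE_RE = re.compile(
    r"^(?:D:)?(?P<year>\d{4})(?P<month>\d{2})?(?P<day>\d{2})?"
    r"(?P<hour>\d{2})?(?P<minute>\d{2})?(?P<second>\d{2})?"
    r"(?:(?P<sign>[+\-Z])(?:(?P<off_h>\d{2})'?(?:(?P<off_m>\d{2})'?)?)?)?$"
)


def parse_pdf_date(value: str) -> datetime:
    """Parse a PDF date string into a datetime (naive if no zone is given)."""
    match = _DATE_RE.match(value.strip())
    if match is None:
        raise ValueError(f"not a PDF date string: {value!r}")
    g = match.groupdict()
    # Missing month and day default to 1; missing time fields default to 0.
    parsed = datetime(
        int(g["year"]), int(g["month"] or 1), int(g["day"] or 1),
        int(g["hour"] or 0), int(g["minute"] or 0), int(g["second"] or 0),
    )
    sign = g["sign"]
    if sign is None:
        return parsed  # assumption: leave the result naive when no zone is present
    if sign == "Z":
        return parsed.replace(tzinfo=timezone.utc)
    offset = timedelta(hours=int(g["off_h"] or 0), minutes=int(g["off_m"] or 0))
    if sign == "-":
        offset = -offset
    return parsed.replace(tzinfo=timezone(offset))
```

For example, `parse_pdf_date("D:20240131120000+01'00'")` yields a time zone aware value with a UTC offset of one hour.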
Extract page information
Media box, crop box, trim box, bleed box, and art box
Page rotation
Annotations
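Page boxes and rotation combine to determine the size at which a page is displayed. The sketch below illustrates that relationship under stated assumptions: boxes are given as `(x0, y0, x1, y1)` tuples in default user space units (1/72 inch), the crop box falls back to the media box when absent, and the page's user unit is 1. The names `normalize_box` and `displayed_size` are hypothetical and are not part of this product's API.

```python
def normalize_box(box):
    """Order the corners so that (x0, y0) is lower left and (x1, y1) is upper right."""
    x0, y0, x1, y1 = box
    return (min(x0, x1), min(y0, y1), max(x0, x1), max(y0, y1))


def displayed_size(media_box, crop_box=None, rotation=0):
    """Return (width, height) of the visible page area after rotation."""
    if rotation % 90 != 0:
        raise ValueError("page rotation must be a multiple of 90 degrees")
    mx0, my0, mx1, my1 = normalize_box(media_box)
    if crop_box is None:
        # The crop box defaults to the media box.
        x0, y0, x1, y1 = mx0, my0, mx1, my1
    else:
        # Clip the crop box to the media box.
        cx0, cy0, cx1, cy1 = normalize_box(crop_box)
        x0, y0 = max(cx0, mx0), max(cy0, my0)
        x1, y1 = min(cx1, mx1), min(cy1, my1)
    width, height = max(x1 - x0, 0), max(y1 - y0, 0)
    # Rotations of 90 or 270 degrees swap width and height.
    if rotation % 180 == 90:
        width, height = height, width
    return width, height
```

Dividing either value by 72 gives the size in inches, so a US Letter page with a media box of `(0, 0, 612, 792)` measures 8.5 by 11 inches.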
Additional features
Extract and store embedded font files
Retrieve detailed font information
Retrieve optional content group (OCG, or layer) information and visibility
Retrieve detailed graphics state information for each extracted page content object
Extract raw PDF objects
Extract document parts for PDF/X or PDF 2.0
Retrieve detailed color space information including lookup tables for indexed color spaces
Extract and store embedded files
Specify a password to decrypt PDF files