Explanations of terms and expressions that are used within PDF Tools.
Do not hesitate to contact us if a topic you are looking for is missing.
Associates an object, for example a memo, a piece of music or a film, with a position on the page, or represents an opportunity to interact with the user with the help of mouse and keyboard.
Many PDF documents are designed in a way that does not allow the user to change them but to interact nonetheless through the use of form fields and checkboxes.
Distortion, or aliasing, may occur at the edges of an object depending on the image's resolution.
Anti-aliasing methods can be used to minimize this effect. The edges are smoothed out with adjusted color values via retroactive filtering.
A one-dimensional collection of sequential objects with implicit numbering starting at 0.
The American Standard Code for Information Interchange, a widely used convention for the binary encoding of a specific set of 128 characters. The ASCII character set contains the space character (or blank) and the following characters:
!"#$%& '()*+,-. /0123456 789:;<=> ?@ABCDEF GHIJKLMN OPQRSTUV WXYZ[\]^ _`abcdef ghijklmn opqrstuv wxyz{|}~
An ordered sequence of bytes. Images and fonts are examples of objects stored as binary data.
Either the keyword true or the keyword false.
A group of 8 binary digits (8 bits), which collectively can represent one of 256 different values. Bytes are the basic unit of storage in a multitude of today's electronic devices.
The primary dictionary object that contains the direct or indirect references to all other objects in the document with the exception of the trailer which the catalog does not reference.
A byte whose value is usually interpreted as a symbol within a symbol set with 256 or fewer members. Character examples: 1, 2, a, b, A, &, etc.
A defined set of symbols, whereby a unique byte value is assigned to each character. Character examples:
Software application that is both a conforming reader and a conforming writer.
Software application that can read and process a PDF file that conforms to a specification, for instance [ISO 32000] or [ISO 19005-1], and that is compliant with the requirements of a conforming reader.
Software application able to write PDF files that conform to a specification such as [ISO 32000] or [ISO 19005-1].
A datastream object whose data consists of a sequence of instructions that describe the graphic elements of a page.
Corrupt PDF file
A PDF document that is not correct and may therefore be unreadable. Possible causes include:
- The document was not generated correctly
- The document was damaged after its creation (e.g. incomplete copying process)
Data structure containing the byte offset of the start of each of the file's indirect objects.
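As an illustration, each entry of a classic cross-reference table as specified in [ISO 32000] is a fixed-width line holding a 10-digit number, a 5-digit generation number and the keyword n (in use) or f (free). The following Python sketch parses one such entry. The names XrefEntry and parse_xref_entry are illustrative assumptions and not part of any PDF Tools API.

```python
import re
from typing import NamedTuple


class XrefEntry(NamedTuple):
    offset: int      # byte offset for in-use entries; next free object number for free entries
    generation: int  # generation number of the object
    in_use: bool     # True for keyword "n", False for keyword "f"


# 10 digits, a space, 5 digits, a space, then the keyword n or f.
_ENTRY = re.compile(rb"(\d{10}) (\d{5}) ([nf])")


def parse_xref_entry(line: bytes) -> XrefEntry:
    """Parse one fixed-width entry of a classic cross-reference table."""
    match = _ENTRY.match(line)
    if match is None:
        raise ValueError(f"not a cross-reference entry: {line!r}")
    offset, generation, keyword = match.groups()
    return XrefEntry(int(offset), int(generation), keyword == b"n")


print(parse_xref_entry(b"0000000017 00000 n \n"))
```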
An associative table of object pairs; the first object is the object name and functions as the key, the second object is the value and can be any type of object, including another dictionary.
Any object that has not been made an indirect object.
An electronic representation of a page-oriented compilation of text, images and graphic data, as well as metadata that helps to identify, understand and display the data. Electronic documents can be reproduced on paper or displayed on screen without any significant loss of information.
End-of-line marker (EOL marker)
A sequence of one or two characters marking the end of a line and consisting of
- a CARRIAGE RETURN character (U+000D)
- or a LINE FEED character (U+000A)
- or a CARRIAGE RETURN followed directly by a LINE FEED.
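The three alternatives above can be expressed as a single pattern. The following Python sketch splits raw bytes at EOL markers; the function name split_lines is an illustrative assumption. The two-character alternative is listed first so that a CARRIAGE RETURN followed by a LINE FEED counts as one marker rather than two.

```python
import re

# CR LF must be tried before a lone CR or a lone LF.
EOL_MARKER = re.compile(rb"\r\n|\r|\n")


def split_lines(data: bytes) -> list[bytes]:
    """Split raw bytes at every end-of-line marker."""
    return EOL_MARKER.split(data)


print(split_lines(b"a\rb\nc\r\nd"))  # [b'a', b'b', b'c', b'd']
```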
An optional component of a datastream specification that defines how datastream data should be decoded before it is used. Filter examples: Flate, DCT.
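To illustrate the Flate filter, the following Python sketch encodes and decodes a short piece of data with the standard zlib module, which implements the same compression format. The function names flate_encode and flate_decode and the sample data are illustrative assumptions; a real datastream may carry additional filter parameters that this sketch ignores.

```python
import zlib


def flate_encode(data: bytes) -> bytes:
    """Compress data the way a writer would before storing it."""
    return zlib.compress(data)


def flate_decode(data: bytes) -> bytes:
    """Decode Flate-compressed data before it is used."""
    return zlib.decompress(data)


raw = b"BT /F1 12 Tf (Hello) Tj ET"  # sample content, chosen for illustration
stored = flate_encode(raw)
print(flate_decode(stored) == raw)  # True
```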
Identified collection of graphics that can be glyphs or other graphic elements [ISO 15930-4].
A font file defines how glyphs are displayed. If a font file is contained in a PDF file then the associated font is embedded in the file.
If the font does not contain a complete character set but, for example, only the glyphs of the characters used in the document, the term used is subsetted font.
A special type of object representing a parameterized class, including mathematical formulae and sampled representations of arbitrary resolution.
A filter that can minimize image noise by smoothing or applying a soft-focus effect during the image editing process.
Recognizable abstract graphic symbol, independent of any specific design [ISO/IEC 9541-1]. Glyph examples of the character "A" include: A, A, A
The uppermost element of a stack that holds the parameters controlling graphic representation. The graphic state contains information such as color, font, font size, current transformation matrix, etc.
Hinting is a method that improves the display quality of fonts by optimizing the outlines when displaying the characters.
Color profile compliant with the ICC specification [ISO 15076-1:2005].
An object designated by a positive integer object number followed by a non-negative integer generation number followed by obj and ending with endobj.
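A minimal Python sketch of how such an object could be located in raw file data is shown here. The function name find_indirect_objects is an illustrative assumption, and the pattern is deliberately naive: it would be misled by the word endobj occurring inside binary data, so it is not a substitute for a real parser.

```python
import re

# object number, generation number, "obj", body, "endobj"
_INDIRECT = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)\bendobj", re.DOTALL)


def find_indirect_objects(data: bytes) -> list[tuple[int, int, bytes]]:
    """Return (object number, generation number, body) for each match."""
    return [
        (int(m.group(1)), int(m.group(2)), m.group(3).strip())
        for m in _INDIRECT.finditer(data)
    ]


sample = b"12 0 obj\n<< /Type /Catalog >>\nendobj\n"
print(find_indirect_objects(sample))
```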
Mathematical integer with an implementation-specific range centered at 0. It is written as one or more decimal digits with an optional sign.
A method that controls the combination of pixel density and color depth in raster images during editing. Bilinear interpolation is an extension of linear interpolation for scaling and displaying textures in rendered images.
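Bilinear interpolation can be sketched in a few lines of Python: the value at a fractional position is obtained by interpolating linearly along x on the two neighboring rows, then along y between those two results. The function name bilinear and the sample grid are illustrative assumptions; the sketch expects coordinates inside the grid.

```python
def bilinear(grid: list[list[float]], x: float, y: float) -> float:
    """Interpolate grid[row][column] at the fractional position (x, y)."""
    x0, y0 = int(x), int(y)
    # Clamp the far neighbors so positions on the last row or column work.
    x1 = min(x0 + 1, len(grid[0]) - 1)
    y1 = min(y0 + 1, len(grid) - 1)
    fx, fy = x - x0, y - y0
    # Linear interpolation along x on the upper and lower row.
    top = grid[y0][x0] * (1 - fx) + grid[y0][x1] * fx
    bottom = grid[y1][x0] * (1 - fx) + grid[y1][x1] * fx
    # Linear interpolation along y between the two intermediate values.
    return top * (1 - fy) + bottom * fy


pixels = [[0, 10],
          [20, 30]]
print(bilinear(pixels, 0.5, 0.5))  # 15.0
```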
Multiple Master Fonts
Variant of the PostScript Type 1 format, which allows for all conceivable display variations of a specific font. Other elements such as line thickness and proportions can be adjusted alongside the common specifications.
An atomic symbol uniquely defined by a sequence of characters beginning with a forward slash (/, U+002F), whereby the forward slash is not part of the name.
Similar to a dictionary that associates keys and values, whereby the keys in a name tree are strings and are ordered.
A singular object of type null, designated by the keyword null, whose type and value differ from those of every other object.
Similar to a dictionary that associates keys and values, whereby the keys in a number tree are integers and are ordered.
Either an integer object or a real object.
A basic data structure used to construct PDF files. An object can be of the following types: array, Boolean, dictionary, integer, name, null, real, datastream or string.
An object value that allows one object to refer to another. It has the form "<n> <m> R", where <n> is an indirect object number, <m> is its generation (version) number and R is the uppercase letter R.
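The form described above is easy to recognize with a regular expression. The following Python sketch parses a single reference token; the function name parse_reference is an illustrative assumption.

```python
import re

# <n> <m> R: two non-negative integers followed by the uppercase letter R.
_REFERENCE = re.compile(rb"(\d+)\s+(\d+)\s+R")


def parse_reference(token: bytes) -> tuple[int, int]:
    """Return (object number, second number) of an object reference."""
    match = _REFERENCE.fullmatch(token.strip())
    if match is None:
        raise ValueError(f"not an object reference: {token!r}")
    return int(match.group(1)), int(match.group(2))


print(parse_reference(b"12 0 R"))  # (12, 0)
```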
A datastream containing a sequence of PDF objects.
Optical character recognition (optical character reader, OCR) is the mechanical or electronic conversion of images of typed, handwritten or printed text into text, whether from a scanned document or a photo of a document.
Portable Document Format file format, defined in [ISO 32000].
Portable Document Format file format for archiving, defined in [ISO 19005]. Describes the requirements PDF documents must fulfill to comply with the standards PDF/A-1a and PDF/A-1b. The basic requirements of PDF/A-1b are:
- Conformity with PDF Version 1.4
- Embedding of all fonts used for visible text
- Embedding of color profiles if specified by the color space used
- No encryption
- No transparency
The following applies additionally to PDF/A-1a:
- Encoding text as Unicode
- Structural information must exist (tagging)
Approximate mathematical real numbers but with limited range and precision and written as one or more digits with an optional sign and optional decimal point.
A specific array object that defines the position and bounding boxes on a page for various objects. It is represented as an array of 4 numbers designating the coordinate pairs of two diagonally opposed corners, usually in the form [bottom left X, Y, top right X, Y].
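Because any two diagonally opposed corners are allowed, it can be convenient to normalize the four numbers before use. The following Python sketch does this and derives width and height; the class Rect and its field names are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    llx: float  # bottom left X
    lly: float  # bottom left Y
    urx: float  # top right X
    ury: float  # top right Y

    @classmethod
    def from_array(cls, values: list[float]) -> "Rect":
        """Build a normalized rectangle from any two opposite corners."""
        x0, y0, x1, y1 = values
        return cls(min(x0, x1), min(y0, y1), max(x0, x1), max(y0, y1))

    @property
    def width(self) -> float:
        return self.urx - self.llx

    @property
    def height(self) -> float:
        return self.ury - self.lly


box = Rect.from_array([612, 792, 0, 0])  # corners given in reverse order
print(box.width, box.height)  # 612 792
```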
Associates resource names, which are used in content datastreams, with the resource objects themselves and organizes them in various categories (e.g. font, color space, pattern).
Space character, white-space character
Text character used to represent an orthographic white space. Includes the following characters:
- HORIZONTAL TABULATION (U+0009)
- LINE FEED (U+000A)
- VERTICAL TABULATION (U+000B)
- FORM FEED (U+000C)
- CARRIAGE RETURN (U+000D)
- SPACE (U+0020)
- NO-BREAK SPACE (U+00A0)
- EN SPACE (U+2002)
- EM SPACE (U+2003)
- FIGURE SPACE (U+2007)
- PUNCTUATION SPACE (U+2008)
- THIN SPACE (U+2009)
- HAIR SPACE (U+200A)
- ZERO WIDTH SPACE (U+200B)
- IDEOGRAPHIC SPACE (U+3000)
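The list above translates directly into a lookup set. The following Python sketch tests whether a character is one of the listed white-space characters; the names WHITE_SPACE and is_white_space are illustrative assumptions.

```python
# Code points of the white-space characters listed above.
WHITE_SPACE = frozenset({
    0x0009, 0x000A, 0x000B, 0x000C, 0x000D, 0x0020, 0x00A0,
    0x2002, 0x2003, 0x2007, 0x2008, 0x2009, 0x200A, 0x200B, 0x3000,
})


def is_white_space(character: str) -> bool:
    """Return True if the single character is in the list above."""
    return ord(character) in WHITE_SPACE


print(is_white_space(" "), is_white_space("\u3000"), is_white_space("A"))
# True True False
```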
Consists of a dictionary followed by zero or more bytes enclosed by the keywords stream and endstream.
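As a rough illustration, the bytes between the two keywords can be extracted as follows. This Python sketch assumes that the dictionary contains a direct /Length entry giving the number of bytes, and that the keyword stream is followed by a LINE FEED or by a CARRIAGE RETURN and LINE FEED; the function name stream_payload is an illustrative assumption.

```python
import re


def stream_payload(obj: bytes) -> bytes:
    """Return the bytes between the keywords stream and endstream."""
    length_match = re.search(rb"/Length\s+(\d+)", obj)
    keyword_match = re.search(rb"stream(\r\n|\n)", obj)
    if length_match is None or keyword_match is None:
        raise ValueError("no direct /Length entry or no stream keyword found")
    start = keyword_match.end()
    # Read exactly /Length bytes instead of searching for endstream,
    # because the payload itself may contain arbitrary byte values.
    return obj[start:start + int(length_match.group(1))]


sample = b"<< /Length 5 >>\nstream\nhello\nendstream"
print(stream_payload(sample))  # b'hello'
```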
Consists of a series of bytes (unsigned integer values ranging from 0 to 255). The bytes are not integer objects but are stored in a more compact form.
International standard assigning a unique value to every meaningful character or text element. The Universal Character Set [ISO 10646] is equivalent to all intents and purposes.
Designates the version of the PDF reference used to generate the document. The processing PDF software must support this version to guarantee correct processing. PDF versions range from 1.0 to 1.7 (as of 2009). PDF 1.4 corresponds to Acrobat 5; Acrobat 9 uses PDF 1.7 with Adobe Extension Level 3.
Designates the process of generating PDF content by importing and possibly converting files from the Internet or local files. The files can be imported in any format such as HTML, GIF, JPEG, text, and PDF.
Structured wrapper for serialized XML metadata that can be embedded in various file formats.