Version: 1.1.2

Supported languages

The Pdftools OCR Service recognizes only characters that are used in the languages you configured. For example, if you choose only English, special characters such as the German umlauts (ä, ö, ü) aren’t recognized correctly and may appear as different letters (a, o, u). To avoid such mistakes, choose every language that occurs in the documents you process with OCR.
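As a rough aid for choosing languages, the following sketch lists the letters in a sample of known document text that fall outside the basic English alphabet. It is only a heuristic and not part of the Pdftools OCR Service; the function name `letters_outside_english` and the sample sentence are hypothetical.

```python
def letters_outside_english(sample_text: str) -> list[str]:
    """Return the sorted, distinct letters in sample_text that are not plain ASCII letters."""
    return sorted({ch for ch in sample_text if ch.isalpha() and not ch.isascii()})


if __name__ == "__main__":
    # Hypothetical sample taken from a document that is already available as text.
    print(letters_outside_english("Müller wohnt in Köln, nicht in Zürich."))
```

If the result is not empty, English alone is probably not enough, and the language that uses those letters (German in this sample) should be added to the configuration.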

The Pdftools OCR Service supports the following languages:

Additional language support

Before you use Arabic, Chinese, Hebrew, Japanese, Korean, or Thai in OCR analysis, contact us through the Contact page for exact pricing information. OCR analysis of these languages has additional costs.
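A minimal sketch of how a configuration could be checked for these languages before it is used. The set below contains the identifiers listed on this page for Arabic, Chinese, Hebrew, Japanese, Korean, and Thai; the function name `extra_cost_languages` and the list-based configuration are assumptions made for illustration, not part of the service's interface.

```python
# Identifiers from this page whose OCR analysis has additional costs.
EXTRA_COST_LANGUAGES = frozenset({
    "Arabic", "ChinesePRC", "ChineseTaiwan", "Hebrew",
    "Japanese", "JapaneseModern", "Korean", "KoreanHangul", "Thai",
})


def extra_cost_languages(configured):
    """Return the configured identifiers that carry additional costs, in input order."""
    return [name for name in configured if name in EXTRA_COST_LANGUAGES]


if __name__ == "__main__":
    # Hypothetical configuration: a plain list of language identifiers.
    print(extra_cost_languages(["English", "German", "Japanese", "Thai"]))
```

A non-empty result means that pricing should be clarified before the configuration is used.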

Natural languages

A

  • Abkhaz – Abkhaz
  • Adyghe – Adyghe
  • Afrikaans – Afrikaans
  • Agul – Agul
  • Albanian – Albanian
  • Altaic – Altaic
  • Arabic – Arabic (Saudi Arabia)¹
  • ArmenianEastern – Armenian (Eastern)
  • ArmenianGrabar – Armenian (Grabar)
  • ArmenianWestern – Armenian (Western)
  • Awar – Avar
  • Aymara – Aymara
  • AzeriCyrillic – Azerbaijani (Cyrillic)
  • AzeriLatin – Azerbaijani (Latin)

B

  • Bashkir – Bashkir
  • Basque – Basque
  • Belarusian – Belarusian
  • Bemba – Bemba
  • Blackfoot – Blackfoot
  • Breton – Breton
  • Bugotu – Bugotu
  • Bulgarian – Bulgarian
  • Buryat – Buryat

C

  • Catalan – Catalan
  • Chamorro – Chamorro
  • Chechen – Chechen
  • ChinesePRC – Chinese Simplified¹
  • ChineseTaiwan – Chinese Traditional¹
  • Chukcha – Chukcha
  • Chuvash – Chuvash
  • Corsican – Corsican
  • CrimeanTatar – Crimean Tatar
  • Croatian – Croatian
  • Crow – Crow
  • Czech – Czech

D

  • Danish – Danish
  • Dargwa – Dargwa
  • Dungan – Dungan
  • Dutch – Dutch (Netherlands)
  • DutchBelgian – Dutch (Belgium)

E

  • English – English
  • EskimoCyrillic – Eskimo (Cyrillic)
  • EskimoLatin – Eskimo (Latin)
  • Esperanto – Esperanto
  • Estonian – Estonian
  • Even – Even
  • Evenki – Evenki

F

  • Faeroese – Faeroese
  • Fijian – Fijian
  • Finnish – Finnish
  • French – French
  • Frisian – Frisian
  • Friulian – Friulian

G

  • GaelicScottish – Scottish Gaelic
  • Gagauz – Gagauz
  • Galician – Galician
  • Ganda – Ganda
  • German – German
  • GermanLuxembourg – German (Luxembourg)
  • GermanNewSpelling – German (new spelling)
  • Greek – Greek
  • Guarani – Guarani

H

  • Hani – Hani
  • Hausa – Hausa
  • Hawaiian – Hawaiian
  • Hebrew – Hebrew¹
  • Hungarian – Hungarian

I

  • Icelandic – Icelandic
  • Ido – Ido
  • Indonesian – Indonesian
  • Ingush – Ingush
  • Interlingua – Interlingua
  • Irish – Irish
  • Italian – Italian

J

  • Japanese – Japanese¹
  • JapaneseModern – Japanese (Modern)¹

K

  • Kabardian – Kabardian
  • Kalmyk – Kalmyk
  • KarachayBalkar – Karachay-Balkar
  • Karakalpak – Karakalpak
  • Kasub – Kasub
  • Kawa – Kawa
  • Kazakh – Kazakh
  • Khakas – Khakas
  • Khanty – Khanty
  • Kikuyu – Kikuyu
  • Kirgiz – Kirghiz
  • Kongo – Kongo
  • Korean – Korean¹
  • KoreanHangul – Korean (Hangul)¹
  • Koryak – Koryak
  • Kpelle – Kpelle
  • Kumyk – Kumyk
  • Kurdish – Kurdish

L

  • Lak – Lak
  • Lappish – Sami (Lappish)
  • Latin – Latin
  • Latvian – Latvian
  • Lezgin – Lezgin
  • Lithuanian – Lithuanian
  • Luba – Luba

M

  • Macedonian – Macedonian
  • Malagasy – Malagasy
  • Malay – Malay
  • Malinke – Malinke
  • Maltese – Maltese
  • Mansi – Mansi
  • Maori – Maori
  • Mari – Mari
  • Maya – Maya
  • Miao – Miao
  • Minankabaw – Minangkabau
  • Mohawk – Mohawk
  • Mongol – Mongol
  • Mordvin – Mordvin

N

  • Nahuatl – Nahuatl
  • Nenets – Nenets
  • Nivkh – Nivkh
  • Nogay – Nogay
  • Norwegian – Norwegian Nynorsk and Norwegian Bokmal
  • NorwegianBokmal – Norwegian (Bokmal)
  • NorwegianNynorsk – Norwegian (Nynorsk)
  • Nyanja – Nyanja

O

  • Occidental – Occidental
  • Ojibway – Ojibway
  • Ossetic – Ossetian

P

  • Papiamento – Papiamento
  • PidginEnglish – Tok Pisin
  • Polish – Polish
  • PortugueseBrazilian – Portuguese (Brazil)
  • PortugueseStandard – Portuguese (Portugal)
  • Provencal – Provencal

Q

  • Quechua – Quechua

R

  • RhaetoRomanic – Rhaeto-Romanic
  • Romanian – Romanian
  • RomanianMoldavia – Romanian (Moldavia)
  • Romany – Romany
  • Ruanda – Ruanda
  • Rundi – Rundi
  • RussianOldSpelling – Russian (old spelling)
  • Russian – Russian
  • RussianWithAccent – Russian with accents marking stress position

S

  • Samoan – Samoan
  • Selkup – Selkup
  • SerbianCyrillic – Serbian (Cyrillic)
  • SerbianLatin – Serbian (Latin)
  • Shona – Shona
  • Sioux – Sioux (Dakota)
  • Slovak – Slovak
  • Slovenian – Slovenian
  • Somali – Somali
  • Sorbian – Sorbian
  • Sotho – Sotho
  • Spanish – Spanish
  • Sunda – Sunda
  • Swahili – Swahili
  • Swazi – Swazi
  • Swedish – Swedish

T

  • Tabassaran – Tabassaran
  • Tagalog – Tagalog
  • Tahitian – Tahitian
  • Tajik – Tajik
  • Tatar – Tatar
  • Thai – Thai¹
  • Tinpo – Jingpo
  • Tongan – Tongan
  • Tswana – Tswana
  • Tun – Tun
  • Turkish – Turkish
  • Turkmen – Turkmen
  • TurkmenLatin – Turkmen (Latin)
  • Tuvin – Tuvan

U

  • Udmurt – Udmurt
  • UighurCyrillic – Uighur (Cyrillic)
  • UighurLatin – Uighur (Latin)
  • Ukrainian – Ukrainian
  • UzbekCyrillic – Uzbek (Cyrillic)
  • UzbekLatin – Uzbek (Latin)

V

  • Visayan – Cebuano

W

  • Welsh – Welsh
  • Wolof – Wolof

X

  • Xhosa – Xhosa

Y

  • Yakut – Yakut

Z

  • Zapotec – Zapotec
  • Zulu – Zulu

Technical languages

  • Basic – Basic programming language
  • C++ – C/C++ programming language
  • Chemistry – Simple chemical formulas
  • Digits – Numbers
  • CMC7 – For MICR (CMC-7) text type
  • Cobol – Cobol programming language
  • E13B – For MICR (E-13B) text type
  • Fortran – Fortran programming language
  • Java – Java programming language
  • OcrA – For OCR-A text type
  • OcrB – For OCR-B text type
  • Pascal – Pascal programming language

Footnotes

  1. Before you use Arabic, Chinese, Hebrew, Japanese, Korean, or Thai in OCR analysis, contact us through the Contact page for exact pricing information. OCR analysis of these languages has additional costs.