Supported languages
The Pdftools OCR Service recognizes only the characters used in the languages you configure. For example, if you select only English, special characters such as German umlauts (äöü) may be misrecognized as the unaccented letters (aou). To avoid such mistakes, select every language that occurs in the documents you process with OCR.
The Pdftools OCR Service supports the following languages:
Natural languages
A
- Abkhaz – Abkhaz
- Adyghe – Adyghe
- Afrikaans – Afrikaans
- Agul – Agul
- Albanian – Albanian
- Altaic – Altaic
- Arabic – Arabic (Saudi Arabia)
- ArmenianEastern – Armenian (Eastern)
- ArmenianGrabar – Armenian (Grabar)
- ArmenianWestern – Armenian (Western)
- Awar – Avar
- Aymara – Aymara
- AzeriCyrillic – Azerbaijani (Cyrillic)
- AzeriLatin – Azerbaijani (Latin)
B
- Bashkir – Bashkir
- Basque – Basque
- Belarusian – Belarusian
- Bemba – Bemba
- Blackfoot – Blackfoot
- Breton – Breton
- Bugotu – Bugotu
- Bulgarian – Bulgarian
- Burmese – Burmese
- Buryat – Buryat
C
- Catalan – Catalan
- Chamorro – Chamorro
- Chechen – Chechen
- ChinesePRC – Chinese Simplified
- ChineseTaiwan – Chinese Traditional
- Chukcha – Chukcha
- Chuvash – Chuvash
- Corsican – Corsican
- CrimeanTatar – Crimean Tatar
- Croatian – Croatian
- Crow – Crow
- Czech – Czech
D
- Danish – Danish
- Dargwa – Dargwa
- Dungan – Dungan
- Dutch – Dutch (Netherlands)
- DutchBelgian – Dutch (Belgium)
E
- English – English
- EskimoCyrillic – Eskimo (Cyrillic)
- EskimoLatin – Eskimo (Latin)
- Esperanto – Esperanto
- Estonian – Estonian
- Even – Even
- Evenki – Evenki
F
- Faeroese – Faeroese
- Farsi – Farsi
- Fijian – Fijian
- Finnish – Finnish
- French – French
- Frisian – Frisian
- Friulian – Friulian
G
- GaelicScottish – Scottish Gaelic
- Gagauz – Gagauz
- Galician – Galician
- Ganda – Ganda
- German – German
- GermanLuxembourg – German (Luxembourg)
- GermanNewSpelling – German (new spelling)
- Greek – Greek
- Guarani – Guarani
H
- Hani – Hani
- Hausa – Hausa
- Hawaiian – Hawaiian
- Hebrew – Hebrew
- Hungarian – Hungarian
I
- Icelandic – Icelandic
- Ido – Ido
- Indonesian – Indonesian
- Ingush – Ingush
- Interlingua – Interlingua
- Irish – Irish
- Italian – Italian
J
- Japanese – Japanese
- JapaneseModern – Japanese (Modern)
K
- Kabardian – Kabardian
- Kalmyk – Kalmyk
- KarachayBalkar – Karachay-Balkar
- Karakalpak – Karakalpak
- Kasub – Kasub
- Kawa – Kawa
- Kazakh – Kazakh
- Khakas – Khakas
- Khanty – Khanty
- Kikuyu – Kikuyu
- Kirgiz – Kirghiz
- Kongo – Kongo
- Korean – Korean
- KoreanHangul – Korean (Hangul)
- Koryak – Koryak
- Kpelle – Kpelle
- Kumyk – Kumyk
- Kurdish – Kurdish
L
- Lak – Lak
- Lappish – Sami (Lappish)
- Latin – Latin
- Latvian – Latvian
- LatvianGothic – Latvian language written in Gothic script
- Lezgin – Lezgin
- Lithuanian – Lithuanian
- Luba – Luba
M
- Macedonian – Macedonian
- Malagasy – Malagasy
- Malay – Malay
- Malinke – Malinke
- Maltese – Maltese
- Mansi – Mansi
- Maori – Maori
- Mari – Mari
- Maya – Maya
- Miao – Miao
- Minankabaw – Minangkabau
- Mohawk – Mohawk
- Mongol – Mongol
- Mordvin – Mordvin
N
- Nahuat – Nahuatl
- Nenets – Nenets
- Nivkh – Nivkh
- Nogay – Nogay
- Norwegian – Norwegian (Nynorsk and Bokmal)
- NorwegianBokmal – Norwegian (Bokmal)
- NorwegianNynorsk – Norwegian (Nynorsk)
- Nyanja – Nyanja
O
- Occidental – Occidental
- Ojibway – Ojibway
- OldEnglish – Old English
- OldFrench – Old French
- OldGerman – Old German
- OldItalian – Old Italian
- OldSlavonic – Old Slavonic
- OldSpanish – Old Spanish
- Ossetic – Ossetian
P
- Papiamento – Papiamento
- Pashto – Pashto
- PidginEnglish – Tok Pisin
- Polish – Polish
- PortugueseBrazilian – Portuguese (Brazil)
- PortugueseStandard – Portuguese (Portugal)
- Provencal – Provencal
Q
- Quechua – Quechua
R
- RhaetoRomanic – Rhaeto-Romanic
- Romanian – Romanian
- RomanianMoldavia – Romanian (Moldavia)
- Romany – Romany
- Ruanda – Ruanda
- Rundi – Rundi
- RussianOldSpelling – Russian (old spelling)
- Russian – Russian
- RussianWithAccent – Russian with accents marking stress position
S
- Samoan – Samoan
- Selkup – Selkup
- SerbianCyrillic – Serbian (Cyrillic)
- SerbianLatin – Serbian (Latin)
- Shona – Shona
- Sioux – Sioux (Dakota)
- Slovak – Slovak
- Slovenian – Slovenian
- Somali – Somali
- Sorbian – Sorbian
- Sotho – Sotho
- Spanish – Spanish
- Sunda – Sunda
- Swahili – Swahili
- Swazi – Swazi
- Swedish – Swedish
T
- Tabassaran – Tabassaran
- Tagalog – Tagalog
- Tahitian – Tahitian
- Tajik – Tajik
- Tatar – Tatar
- Thai – Thai
- Tinpo – Jingpo
- Tongan – Tongan
- Tswana – Tswana
- Tun – Tun
- Turkish – Turkish
- Turkmen – Turkmen
- TurkmenLatin – Turkmen (Latin)
- Tuvin – Tuvan
U
- Udmurt – Udmurt
- UighurCyrillic – Uighur (Cyrillic)
- UighurLatin – Uighur (Latin)
- Ukrainian – Ukrainian
- Urdu – Urdu
- UzbekCyrillic – Uzbek (Cyrillic)
- UzbekLatin – Uzbek (Latin)
V
- Vietnamese – Vietnamese
- Visayan – Cebuano
W
- Welsh – Welsh
- Wolof – Wolof
X
- Xhosa – Xhosa
Y
- Yakut – Yakut
- Yiddish – Yiddish
Z
- Zapotec – Zapotec
- Zulu – Zulu
Technical languages
- Basic – Basic programming language
- C++ – C/C++ programming language
- Chemistry – Simple chemical formulas
- Digits – Numbers
- CMC7 – For MICR (CMC-7) text type
- Cobol – Cobol programming language
- E13B – For MICR (E-13B) text type
- Fortran – Fortran programming language
- Java – Java programming language
- OcrA – For OCR-A text type
- OcrB – For OCR-B text type
- Pascal – Pascal programming language
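
Because a misspelled identifier is easy to overlook, it can help to check the language names you intend to configure against the lists above before passing them to the service. The following is a minimal sketch, not part of the Pdftools product: the `SUPPORTED` set is only an excerpt of the identifiers listed here, `check_languages` is a hypothetical helper, and the assumption that identifiers must match exactly (including case) is not confirmed by this page.

```python
from difflib import get_close_matches

# Excerpt of the identifiers listed above; extend it with the ones you need.
SUPPORTED = {
    "English", "French", "German", "GermanLuxembourg", "GermanNewSpelling",
    "Italian", "Spanish", "Digits", "OcrA", "OcrB",
}


def check_languages(requested, supported=SUPPORTED):
    """Split requested identifiers into listed ones and unknown ones.

    Unknown identifiers are mapped to a list of similar listed names.
    Assumption: identifiers must match the listed spelling exactly.
    """
    valid, unknown = [], {}
    for name in requested:
        if name in supported:
            valid.append(name)
        else:
            unknown[name] = get_close_matches(name, sorted(supported), n=3)
    return valid, unknown


valid, unknown = check_languages(["English", "german", "Klingon"])
print("Listed identifiers:", valid)
for name, suggestions in unknown.items():
    print(f"{name!r} is not a listed identifier; similar names: {suggestions}")
```

Only the identifiers that pass the check would then be handed to whatever configuration mechanism your installation uses.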