OCR systems overview
Reducto offers several OCR systems, which can be selected with the ocr_system parameter in the advanced options:
- highres: Optimized for documents with English, Spanish, Italian, Portuguese, French, and German text.
- multilingual: Supports a comprehensive set of languages from around the world.
- combined: Uses a combination of OCR systems for improved results for multilingual documents at a small latency cost.
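As a rough illustration of the options listed above, the sketch below builds a request fragment that selects an OCR system. Only the ocr_system parameter name and its three values come from this page; the `advanced_options` key, the overall payload shape, and the `build_advanced_options` helper are assumptions made for the example, so check the API reference for the exact request format.

```python
import json

# The three values documented on this page.
VALID_OCR_SYSTEMS = {"highres", "multilingual", "combined"}


def build_advanced_options(ocr_system: str) -> dict:
    """Return a request fragment selecting an OCR system.

    Hypothetical helper: the "advanced_options" key and the payload
    shape are assumptions, not a confirmed request schema.
    """
    if ocr_system not in VALID_OCR_SYSTEMS:
        raise ValueError(
            f"unknown ocr_system {ocr_system!r}; "
            f"expected one of {sorted(VALID_OCR_SYSTEMS)}"
        )
    return {"advanced_options": {"ocr_system": ocr_system}}


# Serialize the fragment so it can be merged into a request body.
print(json.dumps(build_advanced_options("multilingual"), indent=2))
```

Validating the value before sending the request catches typos such as "high_res" locally rather than as an API error.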
Languages supported by OCR system
Highres OCR system
The highres OCR system is optimized for the following languages:
- English
- Spanish
- Italian
- Portuguese
- French
- German
Multilingual OCR system
The multilingual OCR system supports a much wider range of languages, categorized by their level of support:
Fully supported languages
The following languages are prioritized and regularly evaluated for quality: Afrikaans, Albanian, Arabic, Armenian, Belarusian, Bengali, Bulgarian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Filipino, Finnish, French, German, Greek, Gujarati, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Kannada, Khmer, Korean, Lao, Latvian, Lithuanian, Macedonian, Malay, Malayalam, Marathi, Nepali, Norwegian, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Serbian, Slovak, Slovenian, Spanish, Swedish, Tagalog, Tamil, Telugu, Thai, Turkish, Ukrainian, Vietnamese, Yiddish.
Choosing the right OCR system
- Use highres for documents primarily in English, Spanish, Italian, Portuguese, French, or German.
- Use multilingual for documents containing languages beyond those supported by highres.
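The two rules above can be expressed as a small selection helper. This is a minimal sketch, assuming the caller already knows which languages a document contains; `choose_ocr_system` and `HIGHRES_LANGUAGES` are hypothetical names, not part of any Reducto SDK, and the sketch does not cover the combined option.

```python
# Languages the highres system is optimized for, per this page.
HIGHRES_LANGUAGES = frozenset(
    {"english", "spanish", "italian", "portuguese", "french", "german"}
)


def choose_ocr_system(languages):
    """Pick an ocr_system value from the languages in a document.

    Hypothetical helper: returns "highres" when every language is in
    the highres set, and "multilingual" otherwise.
    """
    normalized = {lang.strip().lower() for lang in languages}
    if not normalized:
        raise ValueError("at least one language is required")
    if normalized <= HIGHRES_LANGUAGES:  # subset check
        return "highres"
    return "multilingual"


print(choose_ocr_system(["English", "German"]))    # highres
print(choose_ocr_system(["English", "Japanese"]))  # multilingual
```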