Process documents in 60+ languages with automatic language detection. No configuration required.
Sample Document
Download the sample: un-document-spanish.pdf
Supported Languages
Reducto automatically detects and processes these languages:
| Region | Languages |
|---|---|
| European | English, German, French, Spanish, Portuguese, Italian, Dutch, Polish, Romanian, Czech, Greek, Hungarian, Swedish, Danish, Finnish, Norwegian, Bulgarian, Croatian, Slovak, Slovenian, Lithuanian, Latvian, Estonian, Albanian, Icelandic, Catalan, Serbian, Macedonian, Belarusian, Ukrainian |
| Asian | Chinese, Japanese, Korean, Hindi, Bengali, Tamil, Telugu, Marathi, Gujarati, Kannada, Malayalam, Punjabi, Thai, Vietnamese, Indonesian, Malay, Filipino/Tagalog, Khmer, Lao, Nepali |
| Middle Eastern | Arabic, Hebrew, Persian, Turkish |
| Other | Russian, Armenian, Yiddish, Afrikaans |
Create API Key
Open Studio
Go to studio.reducto.ai and sign in. From the home page, click API Keys in the left sidebar.

View API Keys
The API Keys page shows your existing keys. Click + Create new API key in the top right corner.

Configure Key
In the modal, enter a name for your key and set an expiration policy (or select “Never” for no expiration). Click Create.

Studio Walkthrough
Upload and Configure OCR
Upload your multilingual document to studio.reducto.ai. In the Parse view, open the Configurations tab to see OCR settings.
Key settings:

- Extraction Mode: Use `ocr` for scanned documents where text is embedded as images. Use `hybrid` (default) for mixed documents where some pages are native text and others are scans.
- OCR System: Keep `standard` (default) for 60+ language support. The `legacy` system only supports Germanic languages.
Processing Non-English Documents
Basic Usage
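As a minimal sketch of a parse call for a non-English document, the snippet below builds (but does not send) a request whose body carries only a document reference and no language setting. The endpoint URL, the `document_url` field name, and the bearer-token header are assumptions made for illustration; check the API reference for the actual request shape.

```python
import json
import urllib.request


def build_parse_request(api_url: str, api_key: str, document_url: str) -> urllib.request.Request:
    """Build a POST request for a parse call without sending it.

    Hypothetical request shape: the `document_url` field and the bearer
    header are assumptions, not confirmed API details.
    """
    # No language parameter is set: the docs state detection is automatic.
    payload = {"document_url": document_url}
    return urllib.request.Request(
        api_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_parse_request(
    "https://api.example.invalid/parse",  # placeholder endpoint
    "YOUR_API_KEY",
    "https://example.com/un-document-spanish.pdf",  # placeholder document URL
)
```

Passing the request to `urllib.request.urlopen` would send it once the placeholders are replaced with real values.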
No special configuration is needed. Parse the document as you would any other.
Output Example
From a Spanish UN Security Council document:
OCR Configuration Options
Extraction Modes
Choose the right mode for your document type:

| Mode | Best For | Speed | Accuracy |
|---|---|---|---|
| `hybrid` | Mixed document sets | Fast | High |
| `ocr` | Scanned documents | Slower | High |
| `metadata` | Native PDFs | Fastest | Depends on PDF quality |
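The table above can be turned into a small selection helper. This is only a sketch: the two boolean inputs are hypothetical, and how you determine them (for example, by checking whether a PDF has an embedded text layer) is left to your own pipeline.

```python
def choose_extraction_mode(has_native_text: bool, has_scanned_pages: bool) -> str:
    """Pick an extraction mode following the mode table.

    Both flags are hypothetical inputs supplied by the caller.
    """
    if has_scanned_pages and not has_native_text:
        return "ocr"       # fully scanned: text exists only as images
    if has_native_text and not has_scanned_pages:
        return "metadata"  # native PDF: fastest, accuracy depends on the PDF
    return "hybrid"        # mixed or unknown content: the default mode
```

Falling back to `hybrid` when nothing is known about the document mirrors its role as the default.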
OCR System Selection
Always use `standard` for multilingual support.
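One way to enforce this rule in client code is a small settings builder that rejects `legacy` for multilingual work. This is a sketch under assumptions: `extraction_mode` appears in this guide, but the `ocr_system` key name and the `multilingual` flag are hypothetical.

```python
EXTRACTION_MODES = {"hybrid", "ocr", "metadata"}
OCR_SYSTEMS = {"standard", "legacy"}


def build_ocr_settings(
    extraction_mode: str = "hybrid",
    ocr_system: str = "standard",
    multilingual: bool = True,
) -> dict:
    """Return a settings dict, rejecting values this guide does not list.

    The `ocr_system` key name and `multilingual` flag are assumptions.
    """
    if extraction_mode not in EXTRACTION_MODES:
        raise ValueError(f"unknown extraction mode: {extraction_mode!r}")
    if ocr_system not in OCR_SYSTEMS:
        raise ValueError(f"unknown OCR system: {ocr_system!r}")
    if multilingual and ocr_system == "legacy":
        # The legacy system only supports Germanic languages.
        raise ValueError("use the standard OCR system for multilingual documents")
    return {"extraction_mode": extraction_mode, "ocr_system": ocr_system}


settings = build_ocr_settings(extraction_mode="ocr")
```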
Mixed-Language Documents
Documents containing multiple languages are handled automatically:
Example: Bilingual Contract
Agentic Mode for Difficult Text
Standard OCR works well for clean, printed documents. For challenging documents with handwriting, faded text, or unusual fonts, agentic mode uses a vision language model to verify and correct OCR output. Consider enabling it when:

- Text is handwritten or uses decorative fonts
- Document is faded, stained, or low quality
- OCR produces garbled output on first pass
Agentic mode costs approximately 2x credits. Use it selectively for documents where standard OCR struggles.
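To apply agentic mode selectively, you could check the first-pass text before paying for a second pass. The heuristic below is an illustrative assumption, not how Reducto decides anything: it flags text in which many visible characters are symbols, control characters, or unassigned code points (including the U+FFFD replacement character). The 0.2 threshold is arbitrary, and the cost multiplier encodes the approximate 2x figure stated above.

```python
import unicodedata

AGENTIC_COST_MULTIPLIER = 2.0  # approximate figure; actual billing may differ


def looks_garbled(text: str, max_suspicious_ratio: float = 0.2) -> bool:
    """Heuristically flag OCR text dominated by symbols or control characters."""
    visible = [ch for ch in text if not ch.isspace()]
    if not visible:
        return True  # an empty result is treated as a failed first pass
    suspicious = 0
    for ch in visible:
        category = unicodedata.category(ch)
        # Letters, combining marks, digits, punctuation, and format characters
        # (such as the zero-width non-joiner used in Persian) are left alone.
        if category.startswith("S") or category in {"Cc", "Co", "Cn"}:
            suspicious += 1
    return suspicious / len(visible) > max_suspicious_ratio


def estimate_credits(base_credits: float, use_agentic: bool) -> float:
    """Estimate credit cost, doubling it when agentic mode is enabled."""
    return base_credits * AGENTIC_COST_MULTIPLIER if use_agentic else float(base_credits)


first_pass = "\ufffd\ufffd\ufffd abc"  # made-up garbled first-pass output
use_agentic = looks_garbled(first_pass)
cost = estimate_credits(10, use_agentic)
```

Documents that legitimately contain many symbols, such as financial tables or formulas, would trip this check, so treat it as a starting point to tune against your own data.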
Extracting Structured Data
Extract structured data from non-English documents using schemas with descriptive field hints.
Tips
For best results with non-English documents:

- Use high-quality scans (300 DPI minimum) for better OCR accuracy
- Enable agentic mode for handwritten or degraded text
- Provide bilingual field descriptions in extraction schemas to improve accuracy
- Use `extraction_mode: "ocr"` for scanned documents instead of relying on embedded text
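The bilingual-description tip can be sketched as follows, assuming the extraction schema accepts JSON Schema style objects with a `description` per field. The field names and the `bilingual` helper are hypothetical and chosen to suit a Spanish UN document.

```python
import json


def bilingual(english: str, local: str) -> str:
    """Join an English description with its source-language equivalent."""
    return f"{english} / {local}"


# Hypothetical schema: field names are illustrative, not required by any API.
schema = {
    "type": "object",
    "properties": {
        "document_symbol": {
            "type": "string",
            "description": bilingual("UN document symbol", "Signatura del documento"),
        },
        "document_date": {
            "type": "string",
            "description": bilingual("Document date", "Fecha del documento"),
        },
        "title": {
            "type": "string",
            "description": bilingual("Document title", "Título del documento"),
        },
    },
    "required": ["document_symbol"],
}

# ensure_ascii=False keeps accented characters readable in the serialized schema.
schema_json = json.dumps(schema, ensure_ascii=False, indent=2)
```

Keeping both languages in each description gives the model the label as it appears in the document alongside the meaning you expect.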
Next Steps
- OCR Settings: Full OCR configuration reference
- Agentic Modes: AI-enhanced text correction
- Batch Processing: Process many documents at scale
- Extract API: Structured data extraction

