The standard OCR handles mixed-language documents automatically. A single document can contain text in multiple languages without any special configuration.
Upload your multilingual document to studio.reducto.ai. In the Parse view, open the Configurations tab to see OCR settings.
Key settings:
Extraction Mode: Use ocr for scanned documents where text is embedded as images. Use hybrid (default) for mixed documents where some pages are native text and others are scans.
OCR System: Keep standard (default) for 60+ language support. The legacy system only supports Germanic languages.
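When calling the API instead of using Studio, the two settings above could be collected into one dictionary before the request. The following is a minimal plain-Python sketch, not part of the Reducto SDK: the helper name `build_ocr_settings` is hypothetical, and the `ocr_system` key is an assumption (only `extraction_mode` appears in this guide's code examples), so confirm the exact field names in the API reference.

```python
from typing import Dict

# Values named in this guide. The "ocr_system" key itself is an assumption.
EXTRACTION_MODES = {"ocr", "metadata", "hybrid"}
OCR_SYSTEMS = {"standard", "legacy"}


def build_ocr_settings(extraction_mode: str = "hybrid",
                       ocr_system: str = "standard") -> Dict[str, str]:
    """Validate the two choices and return a settings dictionary."""
    if extraction_mode not in EXTRACTION_MODES:
        raise ValueError(f"unknown extraction mode: {extraction_mode!r}")
    if ocr_system not in OCR_SYSTEMS:
        raise ValueError(f"unknown OCR system: {ocr_system!r}")
    return {"extraction_mode": extraction_mode, "ocr_system": ocr_system}


# Scanned multilingual document: force OCR and keep the standard system.
settings = build_ocr_settings(extraction_mode="ocr")
print(settings)
```

Validating the values locally catches a mistyped mode before any credits are spent on a request.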
2. View Extracted Text
Click Run and switch to the Results tab. Reducto extracts text in the original language with proper character encoding.
Notice how the Spanish text is extracted accurately, including accented characters (á, é, í, ó, ú, ñ) and proper formatting.
    Naciones Unidas
    S/2025/856
    Consejo de Seguridad
    Distr. general
    30 de diciembre de 2025
    Español
    Original: inglés
    Carta de fecha 29 de diciembre de 2025 dirigida a la Presidencia del Consejo de Seguridad por el Secretario General
    Tengo el honor de referirme a la resolución 2719 (2023) del Consejo de Seguridad, por la que el Consejo estableció el marco para financiar las operaciones de paz...
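Once the extracted text is in your own code, a quick check can confirm that accented characters survived later processing. The sketch below is a generic Python heuristic, not a Reducto feature: the hypothetical helper `looks_like_mojibake` flags text that looks like UTF-8 bytes mistakenly decoded as Latin-1, a common way for "ñ" to turn into "Ã±" when a downstream tool reads a file with the wrong encoding.

```python
def looks_like_mojibake(text: str) -> bool:
    """Return True if text looks like UTF-8 bytes that were decoded as Latin-1."""
    try:
        repaired = text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either the text contains characters outside Latin-1, or its Latin-1
        # bytes are not valid UTF-8, so it is not a clean Latin-1 mis-decoding.
        return False
    # Pure ASCII round-trips unchanged; mis-decoded text comes back different.
    return repaired != text


print(looks_like_mojibake("Español, inglés, resolución"))  # False
# "Espa\u00c3\u00b1ol" is "Español" after a wrong Latin-1 decode.
print(looks_like_mojibake("Espa\u00c3\u00b1ol"))  # True
```

This only detects one specific encoding mix-up. It says nothing about OCR accuracy itself.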
    # For scanned documents (images, old PDFs)
    result = client.parse.run(
        input=upload.file_id,
        settings={
            "extraction_mode": "ocr"  # Force OCR, ignore embedded text
        }
    )

    # For native PDFs with embedded text
    result = client.parse.run(
        input=upload.file_id,
        settings={
            "extraction_mode": "metadata"  # Use embedded text only
        }
    )

    # For mixed documents (default)
    result = client.parse.run(
        input=upload.file_id,
        settings={
            "extraction_mode": "hybrid"  # Use metadata first, OCR as fallback
        }
    )
Documents containing multiple languages are handled automatically:
    # A document with English headers and Spanish content
    result = client.parse.run(input=upload.file_id)

    # Both languages are extracted correctly
    # No configuration needed
    AGREEMENT / ACUERDO
    This agreement ("Agreement") is entered into between...
    Este acuerdo ("Acuerdo") se celebra entre...
    TERMS AND CONDITIONS / TÉRMINOS Y CONDICIONES
    1. Definitions / Definiciones
       The following terms shall have the meanings set forth below...
       Los siguientes términos tendrán los significados establecidos a continuación...
Reducto extracts both English and Spanish text accurately.
Standard OCR works well for clean, printed documents. For challenging documents like handwriting, faded text, or unusual fonts, agentic mode uses a vision language model to verify and correct OCR output.
    result = client.parse.run(
        input=upload.file_id,
        enhance={
            "agentic": [{"scope": "text"}]
        }
    )
Use agentic mode when:
Text is handwritten or uses decorative fonts
Document is faded, stained, or low quality
OCR produces garbled output on first pass
Agentic mode costs approximately 2x credits. Use it selectively for documents where standard OCR struggles.
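One way to apply agentic mode selectively is to run a standard parse first and retry only when the output looks garbled. The sketch below shows that pattern under stated assumptions: `garbled_ratio`, `parse_with_fallback`, the `get_text` callback, and the 5% threshold are all hypothetical and not part of the Reducto SDK, and the demo uses a stand-in parser so that it runs without a network call.

```python
from typing import Any, Callable


def garbled_ratio(text: str) -> float:
    """Fraction of characters that are U+FFFD or non-printable (whitespace excluded)."""
    if not text:
        return 0.0
    bad = sum(1 for ch in text
              if ch == "\ufffd" or (not ch.isprintable() and not ch.isspace()))
    return bad / len(text)


def parse_with_fallback(parse: Callable[..., Any],
                        get_text: Callable[[Any], str],
                        file_id: str,
                        threshold: float = 0.05) -> Any:
    """Run a standard parse; rerun with agentic mode only if the text looks garbled."""
    result = parse(input=file_id)
    if garbled_ratio(get_text(result)) <= threshold:
        return result
    # The second pass costs more credits, so it runs only when needed.
    return parse(input=file_id, enhance={"agentic": [{"scope": "text"}]})


# Demo with a stand-in parser: the first pass returns garbled text.
calls = []


def fake_parse(**kwargs):
    calls.append(kwargs)
    return "Acuerdo firmado" if "enhance" in kwargs else "\ufffd\ufffd\ufffd\ufffd do"


final = parse_with_fallback(fake_parse, lambda r: r, "file-123")
print(final, len(calls))  # Acuerdo firmado 2
```

With the real client, `parse` would be `client.parse.run`, and `get_text` would be a small function that pulls the extracted text out of the response object. The replacement-character heuristic is deliberately crude: it misses handwriting that OCR reads as plausible but wrong words, so it complements a manual spot check and does not replace one.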