Reducto provides several options to control how Optical Character Recognition (OCR) is performed on your documents. These options allow you to fine-tune the OCR process based on your specific needs.

Agentic OCR

Agentic OCR uses vision language models to automatically correct OCR results, which can improve accuracy for complex tables (merged cells, nested headers, etc.) and difficult text (handwriting, small symbols). To enable agentic OCR, set the enhance.agentic parameter:
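# `client` and `upload` are not defined in this snippet: it assumes an
# initialized Reducto client and a previously uploaded document.
client.parse.run(
    input=upload,
    enhance={
        # Request agentic OCR with the "text" scope.
        "agentic": [{"scope": "text"}]
    }
)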

When to use agentic OCR?

Consider using agentic OCR when:
  1. Accuracy is critical for your application, and you’re seeing small discrepancies in standard OCR.
  2. You’re willing to accept a small increase in processing time and a 2x credit cost in exchange for improved accuracy.
Agentic OCR uses AI to automatically edit and correct OCR results, which can significantly improve the quality of extractions.
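Because agentic OCR doubles the credit cost, it can make sense to enable it only for documents where accuracy matters most. The sketch below shows one way to do that. The helper name build_enhance_options and its use_agentic flag are hypothetical and not part of the Reducto SDK; only the enhance/agentic structure comes from the example above. It also assumes that omitting the enhance argument leaves agentic OCR off.

```python
from typing import Any, Dict


def build_enhance_options(use_agentic: bool) -> Dict[str, Any]:
    """Return extra keyword arguments for client.parse.run().

    Hypothetical helper: the function name and flag are illustrative.
    Only the enhance/agentic structure is taken from the documentation.
    """
    if not use_agentic:
        # Assumption: with no `enhance` argument, agentic OCR stays off.
        return {}
    return {"enhance": {"agentic": [{"scope": "text"}]}}


# Build options for an accuracy-critical document and a routine one.
critical_kwargs = build_enhance_options(use_agentic=True)
routine_kwargs = build_enhance_options(use_agentic=False)
print(critical_kwargs)
print(routine_kwargs)
```

With a client and an upload in hand, the result could be passed along as client.parse.run(input=upload, **critical_kwargs).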

OCR system

For advanced users, the settings.ocr_system parameter allows you to specify which OCR system to use:
  • standard (default): Our best multilingual OCR system. It handles documents in a wide range of languages, including English, Spanish, Italian, Portuguese, French, German, and many others.
  • legacy: Supports only Germanic languages (English, German, Dutch, etc.) and is kept for backwards compatibility.
client.parse.run(
    input=upload,
    settings={
        # "standard" is the default; "legacy" is the other documented value.
        "ocr_system": "standard"
    }
)
We recommend using the standard OCR system for all new projects, as it provides the best accuracy across all languages.
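Since settings.ocr_system accepts only the two values listed above, a small local check can catch typos before a request is sent. The following is a minimal sketch: build_ocr_settings and SUPPORTED_OCR_SYSTEMS are hypothetical names, not part of the Reducto SDK, and the validation happens entirely on the client side.

```python
from typing import Any, Dict

# The two values documented above; the constant name is illustrative.
SUPPORTED_OCR_SYSTEMS = ("standard", "legacy")


def build_ocr_settings(ocr_system: str = "standard") -> Dict[str, Any]:
    """Return a `settings` keyword argument for client.parse.run().

    Hypothetical helper: raises ValueError for any value other than
    the two OCR systems named in the documentation.
    """
    if ocr_system not in SUPPORTED_OCR_SYSTEMS:
        raise ValueError(
            f"Unknown ocr_system {ocr_system!r}; "
            f"expected one of {SUPPORTED_OCR_SYSTEMS}"
        )
    return {"settings": {"ocr_system": ocr_system}}


# The default matches the recommended system for new projects.
print(build_ocr_settings())
# Opting into the legacy system for backwards compatibility.
print(build_ocr_settings("legacy"))
```

The returned dictionary could then be unpacked into the call, for example client.parse.run(input=upload, **build_ocr_settings()).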