Introduction
Reducto applies a set of default parsing configurations to every request. These defaults are designed to cover the most common document types and workflows, so you can start parsing without specifying every parameter in your API calls. By understanding the defaults, you can:
- Avoid sending long, repetitive configuration lists with each request
- Decide when to override a setting for your specific use case
- Use parse output as the foundation for extraction, making it a good place to debug issues
Default configurations (Parse)
Start with defaults. Add overrides only where your workflow requires different behavior.
Basic options
Setting | Default | Description |
---|---|---|
options.ocr_mode | standard | Whether or not to use Agentic OCR. |
options.extraction_mode | ocr | Extraction source for text content. |
options.chunking.chunk_mode | variable | Chunking strategy for parsed text. |
options.table_summary.enabled | false | Include AI-generated table summaries. |
options.figure_summary.enabled | false | Include AI-generated figure/image summaries. |
options.filter_blocks | [] | Block types to exclude from output. |
options.force_url_result | false | Force returning result via URL (vs inline JSON). |
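
The dotted names in the table can be read as paths into a nested request body. The sketch below assumes that mapping (the table itself does not spell it out) and uses a hypothetical helper, `nest`, to expand the Basic options defaults into a nested dictionary.

```python
import json


def nest(flat):
    """Expand {"a.b.c": value} style keys into nested dictionaries."""
    nested = {}
    for dotted_key, value in flat.items():
        *parents, leaf = dotted_key.split(".")
        node = nested
        for part in parents:
            # Create intermediate objects on demand.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


# Defaults copied from the Basic options table.
# Assumption: each dotted name maps to a nested JSON object.
BASIC_DEFAULTS = nest({
    "options.ocr_mode": "standard",
    "options.extraction_mode": "ocr",
    "options.chunking.chunk_mode": "variable",
    "options.table_summary.enabled": False,
    "options.figure_summary.enabled": False,
    "options.filter_blocks": [],
    "options.force_url_result": False,
})

print(json.dumps(BASIC_DEFAULTS, indent=2))
```

Keeping the flat, dotted form in your own code makes it easy to compare against the table, while the nested form is what would be serialized into a request.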
Advanced options
Setting | Default | Description |
---|---|---|
advanced_options.ocr_system | highres | OCR system preset. |
advanced_options.table_output_format | html | Table rendering format in output. |
advanced_options.merge_tables | false | Merge adjacent tables on a page. |
advanced_options.include_formula_information | false | Include spreadsheet formula details. |
advanced_options.continue_hierarchy | true | Preserve document hierarchy across chunks. |
advanced_options.large_table_chunking.enabled | true | Split very large tables into chunks. |
advanced_options.large_table_chunking.size | 50 | Max rows per large-table chunk. |
advanced_options.spreadsheet_table_clustering | default | Splits a spreadsheet that contains multiple tables into separate tables. |
advanced_options.add_page_markers | false | Insert page boundary markers into text. |
advanced_options.remove_text_formatting | false | Strip bold/italics and text styling. |
advanced_options.return_ocr_data | false | Return low-level OCR words/lines data. |
advanced_options.filter_line_numbers | false | Remove leading line numbers. |
advanced_options.read_comments | false | Parses comments from the PDF. |
advanced_options.persist_results | false | Persist outputs for later retrieval. |
advanced_options.exclude_hidden_sheets | false | Skip hidden sheets in spreadsheets. |
advanced_options.exclude_hidden_rows_cols | false | Skip hidden rows/columns in spreadsheets. |
advanced_options.enable_change_tracking | false | Detects strikethrough and underlines, and adds scripts. |
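
One way to keep requests short is to compare the configuration you want against these defaults and send only the differences. The following is a minimal sketch of that idea. `minimal_overrides` is a hypothetical helper, not part of any Reducto SDK. Whether a partially specified nested object (such as `large_table_chunking` with only `size`) keeps its remaining defaults on the server is an assumption to verify.

```python
# Defaults copied from the Advanced options table.
ADVANCED_DEFAULTS = {
    "ocr_system": "highres",
    "table_output_format": "html",
    "merge_tables": False,
    "include_formula_information": False,
    "continue_hierarchy": True,
    "large_table_chunking": {"enabled": True, "size": 50},
    "spreadsheet_table_clustering": "default",
    "add_page_markers": False,
    "remove_text_formatting": False,
    "return_ocr_data": False,
    "filter_line_numbers": False,
    "read_comments": False,
    "persist_results": False,
    "exclude_hidden_sheets": False,
    "exclude_hidden_rows_cols": False,
    "enable_change_tracking": False,
}


def minimal_overrides(desired, defaults):
    """Return only the entries of `desired` that differ from `defaults`."""
    overrides = {}
    for key, value in desired.items():
        default = defaults.get(key)
        if isinstance(value, dict) and isinstance(default, dict):
            # Recurse so that unchanged nested settings are dropped too.
            nested = minimal_overrides(value, default)
            if nested:
                overrides[key] = nested
        elif key not in defaults or value != default:
            overrides[key] = value
    return overrides


desired = {
    "ocr_system": "highres",  # same as the default, so it is dropped
    "merge_tables": True,     # differs from the default, so it is kept
    "large_table_chunking": {"enabled": True, "size": 100},  # only size differs
}

print(minimal_overrides(desired, ADVANCED_DEFAULTS))
# → {'merge_tables': True, 'large_table_chunking': {'size': 100}}
```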
Experimental options
Setting | Default | Description |
---|---|---|
experimental_options.enrich.enabled | false | Enable post-parse enrichment pass. |
experimental_options.enrich.mode | standard | Block types to be enriched. |
experimental_options.native_office_conversion | false | Use Windows VM instead of LibreOffice to convert files. |
experimental_options.enable_checkboxes | false | Detect and return checkbox fields. |
experimental_options.enable_equations | false | Detect and return math equations. |
experimental_options.rotate_pages | true | Auto-rotate misoriented pages. |
experimental_options.rotate_figures | false | Auto-rotate figures/images. This is separate from page rotation. |
experimental_options.enable_scripts | false | Detect and return subscripts and superscripts. |
experimental_options.return_figure_images | false | Return figure images. |
experimental_options.return_table_images | false | Return table images. |
experimental_options.layout_model | default | Layout analysis model. Beta is newer. |
experimental_options.embed_text_metadata_pdf | false | Write OCR text layer back into returned PDF. |
experimental_options.danger_filter_wide_boxes | false | Do not use. Filter overly wide bounding boxes. |
Defaults in action
Both of these calls are equivalent: the first sets every default explicitly, while the second relies on built-in defaults.
Explicit configuration example
Default configuration example
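
To illustrate the equivalence, here is a minimal sketch that builds both request bodies and models the fallback to defaults locally. It is abbreviated to the Experimental options group; the Basic and Advanced groups would be spelled out the same way. The `document_url` field name and the example URL are assumptions for illustration, and `effective_config` is a local model of the documented behavior, not Reducto's implementation.

```python
import copy

# Defaults copied from the Experimental options table.
EXPERIMENTAL_DEFAULTS = {
    "enrich": {"enabled": False, "mode": "standard"},
    "native_office_conversion": False,
    "enable_checkboxes": False,
    "enable_equations": False,
    "rotate_pages": True,
    "rotate_figures": False,
    "enable_scripts": False,
    "return_figure_images": False,
    "return_table_images": False,
    "layout_model": "default",
    "embed_text_metadata_pdf": False,
    "danger_filter_wide_boxes": False,
}

DOCUMENT = "https://example.com/sample.pdf"  # hypothetical document URL

# Explicit configuration: every default in the group is written out.
explicit_request = {
    "document_url": DOCUMENT,  # field name is an assumption
    "experimental_options": copy.deepcopy(EXPERIMENTAL_DEFAULTS),
}

# Default configuration: only the document is specified.
default_request = {"document_url": DOCUMENT}


def deep_update(base, overrides):
    """Recursively copy `overrides` into `base`, in place."""
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            deep_update(base[key], value)
        else:
            base[key] = copy.deepcopy(value)


def effective_config(request):
    """Model how omitted settings fall back to the documented defaults."""
    config = copy.deepcopy(EXPERIMENTAL_DEFAULTS)
    deep_update(config, request.get("experimental_options", {}))
    return config


same = effective_config(explicit_request) == effective_config(default_request)
print("equivalent:", same)
```

Under this model, the two requests differ only in how much they spell out, which is why the shorter form is usually preferable.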
When to Override Defaults
Most of the time, you can rely on the default configurations. However, use cases and document formats vary widely. Here are a few examples:
- Disable page auto-rotation
  If you don’t expect scans or skewed content, you can disable experimental_options.rotate_pages for a reduction in latency.
- Return figures and tables
  Helpful for research papers or scientific documents where charts and illustrations are common. Enable experimental_options.return_figure_images and experimental_options.return_table_images.
- Different languages
  Use multilingual mode for advanced_options.ocr_system for documents that have non-Germanic languages.
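
The three overrides above could look like the following request bodies. This is a sketch under assumptions: the `document_url` field name and the example URL are placeholders, and the nesting follows the dotted setting names from the tables. Only the overridden settings are included; everything else is left to the defaults.

```python
import json

DOCUMENT = "https://example.com/sample.pdf"  # hypothetical document URL

# Disable page auto-rotation (default: true) to reduce latency.
no_rotation = {
    "document_url": DOCUMENT,  # field name is an assumption
    "experimental_options": {"rotate_pages": False},
}

# Return figure and table images (both default to false).
with_images = {
    "document_url": DOCUMENT,
    "experimental_options": {
        "return_figure_images": True,
        "return_table_images": True,
    },
}

# Switch the OCR system from the default "highres" to "multilingual".
multilingual = {
    "document_url": DOCUMENT,
    "advanced_options": {"ocr_system": "multilingual"},
}

for name, payload in [
    ("no_rotation", no_rotation),
    ("with_images", with_images),
    ("multilingual", multilingual),
]:
    print(name, json.dumps(payload))
```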
Next Steps
- Learn more about all available parameters in the Parse API Reference.
- Try different configurations interactively in the Studio Playground.
- Continue to Extraction to see how parse output is used as the foundation for structured data.