The v2 configuration format (documented in this Legacy version) is deprecated. Please migrate to the 2025-10-14 version for the latest features and improvements.
Overview
The 2025-10-14 release introduces a restructured configuration format (v3) that provides better organization and clarity. This guide will help you migrate from the Legacy (v2) configuration format to the new format.
Key Changes
1. Input Parameter Rename
The document_url parameter has been renamed to input for clarity:
Legacy (v2)
client.parse.run(document_url="https://example.com/doc.pdf")
2025-10-14 (v3)
client.parse.run(input="https://example.com/doc.pdf")
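When call arguments are built as dictionaries before being passed to the client, the rename can be applied in one place. The following is a minimal sketch; the helper name rename_document_url is hypothetical and not part of the SDK.

```python
def rename_document_url(kwargs: dict) -> dict:
    """Return a copy of v2 keyword arguments with document_url renamed to input.

    Hypothetical helper for illustration; it does not call the API.
    """
    migrated = dict(kwargs)
    if "document_url" in migrated:
        if "input" in migrated:
            # Refuse to guess which value should win.
            raise ValueError("both document_url and input were supplied")
        migrated["input"] = migrated.pop("document_url")
    return migrated
```

The result can then be unpacked into the call, for example client.parse.run(**rename_document_url(old_kwargs)).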
2. Configuration Structure Reorganization
The configuration options have been reorganized into more logical groupings:
- enhance: AI-powered enhancements (agentic modes, figure summarization)
- retrieval: RAG-focused settings (chunking, filtering, embedding optimization)
- formatting: Output format controls (tables, page markers, markup)
- spreadsheet: Spreadsheet-specific settings
- settings: General settings (OCR system, timeouts, passwords)
Complete Mapping Reference
Parse Configuration
Basic Options → Multiple Categories
Legacy (v2) | 2025-10-14 (v3) | Notes |
---|---|---|
document_url | input | Renamed for clarity |
options.ocr_mode="agentic" | enhance.agentic=[{"scope": "text"}] | Agentic text mode |
options.extraction_mode | Removed | No longer configurable |
options.chunking | retrieval.chunking | Moved to retrieval category |
options.table_summary.enabled | retrieval.embedding_optimized | Simplified to boolean |
options.figure_summary.enabled=True | enhance.summarize_figures=True | Moved to enhance |
options.figure_summary.enhanced=True | enhance.agentic=[{"scope": "figure"}] | Now uses agentic |
options.figure_summary.prompt | enhance.agentic=[{"scope": "figure", "prompt": "..."}] | Custom prompting |
options.filter_blocks | retrieval.filter_blocks | Moved to retrieval |
options.force_url_result | settings.force_url_result | Moved to settings |
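The rows above can be applied mechanically to an existing options dictionary. This is a sketch under stated assumptions: the function name migrate_basic_options is hypothetical, it only rearranges dictionaries and never calls the API, and it treats a figure_summary prompt on its own as a request for the agentic figure scope, which the table implies but does not state outright.

```python
def migrate_basic_options(options: dict) -> dict:
    """Map a v2 `options` dict onto v3 category dicts (hypothetical helper)."""
    enhance, retrieval, settings = {}, {}, {}
    agentic = []

    # options.ocr_mode="agentic" -> enhance.agentic=[{"scope": "text"}]
    if options.get("ocr_mode") == "agentic":
        agentic.append({"scope": "text"})

    figure = options.get("figure_summary") or {}
    if figure.get("enabled"):
        enhance["summarize_figures"] = True
    # Assumption: a custom prompt alone also selects the agentic figure scope.
    if figure.get("enhanced") or figure.get("prompt"):
        entry = {"scope": "figure"}
        if figure.get("prompt"):
            entry["prompt"] = figure["prompt"]
        agentic.append(entry)

    if "chunking" in options:
        retrieval["chunking"] = options["chunking"]
    table_summary = options.get("table_summary") or {}
    if "enabled" in table_summary:
        # Simplified to a boolean in v3.
        retrieval["embedding_optimized"] = bool(table_summary["enabled"])
    if "filter_blocks" in options:
        retrieval["filter_blocks"] = options["filter_blocks"]
    if "force_url_result" in options:
        settings["force_url_result"] = options["force_url_result"]
    # options.extraction_mode has no v3 equivalent, so it is dropped.

    if agentic:
        enhance["agentic"] = agentic
    groups = {"enhance": enhance, "retrieval": retrieval, "settings": settings}
    # Omit empty categories so they are not sent at all.
    return {name: cfg for name, cfg in groups.items() if cfg}
```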
Advanced Options → Multiple Categories
Legacy (v2) | 2025-10-14 (v3) | Notes |
---|---|---|
advanced_options.ocr_system="highres" | settings.ocr_system="standard" | Values changed |
advanced_options.ocr_system="multilingual" | settings.ocr_system="standard" | Now uses standard |
advanced_options.ocr_system="legacy" | settings.ocr_system="legacy" | Same |
advanced_options.table_output_format | formatting.table_output_format | Moved to formatting |
advanced_options.merge_tables | formatting.merge_tables | Moved to formatting |
advanced_options.add_page_markers | formatting.add_page_markers | Moved to formatting |
advanced_options.page_range | settings.page_range | Moved to settings |
advanced_options.document_password | settings.document_password | Moved to settings |
advanced_options.read_comments=True | formatting.include=["comments"] | Now in list |
advanced_options.enable_change_tracking=True | formatting.include=["change_tracking"] | Now in list |
advanced_options.enable_highlight_detection=True | formatting.include=["highlight"] | Now in list |
advanced_options.persist_results | settings.persist_results | Moved to settings |
advanced_options.return_ocr_data | settings.return_ocr_data | Moved to settings |
advanced_options.large_table_chunking | spreadsheet.split_large_tables | Moved to spreadsheet |
advanced_options.spreadsheet_table_clustering="default" | spreadsheet.clustering="fast" | Values changed |
advanced_options.spreadsheet_table_clustering="intelligent" | spreadsheet.clustering="accurate" | Values changed |
advanced_options.spreadsheet_table_clustering="disabled" | spreadsheet.clustering="disabled" | Same |
advanced_options.include_formula_information=True | spreadsheet.include=["formula"] | Now in list |
advanced_options.include_color_information=True | spreadsheet.include=["cell_colors"] | Now in list |
advanced_options.exclude_hidden_sheets=True | spreadsheet.exclude=["hidden_sheets"] | Now in list |
advanced_options.exclude_hidden_rows_cols=True | spreadsheet.exclude=["hidden_rows", "hidden_cols"] | Now in list |
advanced_options.force_file_extension | settings.force_file_extension | Moved to settings |
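Because the advanced options mix three kinds of change (moved keys, renamed values, and boolean flags that become list entries), a table-driven translation keeps them apart. The sketch below encodes only the rows listed above; the names OCR_SYSTEM, CLUSTERING, MOVED, FLAGS, and migrate_advanced_options are hypothetical, and unknown keys raise an error rather than being guessed.

```python
# Value renames taken from the table above.
OCR_SYSTEM = {"highres": "standard", "multilingual": "standard", "legacy": "legacy"}
CLUSTERING = {"default": "fast", "intelligent": "accurate", "disabled": "disabled"}

# v2 key -> (v3 category, v3 key); the value is carried over unchanged.
MOVED = {
    "table_output_format": ("formatting", "table_output_format"),
    "merge_tables": ("formatting", "merge_tables"),
    "add_page_markers": ("formatting", "add_page_markers"),
    "page_range": ("settings", "page_range"),
    "document_password": ("settings", "document_password"),
    "persist_results": ("settings", "persist_results"),
    "return_ocr_data": ("settings", "return_ocr_data"),
    "force_file_extension": ("settings", "force_file_extension"),
    "large_table_chunking": ("spreadsheet", "split_large_tables"),
}

# v2 boolean flag -> (v3 category, list key, entries appended when the flag is True).
FLAGS = {
    "read_comments": ("formatting", "include", ["comments"]),
    "enable_change_tracking": ("formatting", "include", ["change_tracking"]),
    "enable_highlight_detection": ("formatting", "include", ["highlight"]),
    "include_formula_information": ("spreadsheet", "include", ["formula"]),
    "include_color_information": ("spreadsheet", "include", ["cell_colors"]),
    "exclude_hidden_sheets": ("spreadsheet", "exclude", ["hidden_sheets"]),
    "exclude_hidden_rows_cols": ("spreadsheet", "exclude", ["hidden_rows", "hidden_cols"]),
}


def migrate_advanced_options(advanced: dict) -> dict:
    """Map a v2 `advanced_options` dict onto v3 category dicts (hypothetical helper)."""
    out = {"formatting": {}, "settings": {}, "spreadsheet": {}}
    for key, value in advanced.items():
        if key == "ocr_system":
            out["settings"]["ocr_system"] = OCR_SYSTEM[value]
        elif key == "spreadsheet_table_clustering":
            out["spreadsheet"]["clustering"] = CLUSTERING[value]
        elif key in MOVED:
            group, new_key = MOVED[key]
            out[group][new_key] = value
        elif key in FLAGS:
            if value:  # a False flag simply contributes nothing to the list
                group, list_key, entries = FLAGS[key]
                out[group].setdefault(list_key, []).extend(entries)
        else:
            raise KeyError(f"no v3 mapping listed for advanced option {key!r}")
    return {group: cfg for group, cfg in out.items() if cfg}
```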
Experimental Options → Multiple Categories
Legacy (v2) | 2025-10-14 (v3) | Notes |
---|---|---|
experimental_options.enrich.enabled=True, mode="table" | enhance.agentic=[{"scope": "table"}] | Now uses agentic |
experimental_options.enrich.prompt | enhance.agentic=[{"scope": "table", "prompt": "..."}] | Custom prompting |
experimental_options.return_figure_images=True | settings.return_images=["figure"] | Now in list |
experimental_options.return_table_images=True | settings.return_images=["table"] | Now in list |
experimental_options.embed_text_metadata_pdf | settings.embed_pdf_metadata | Renamed |
experimental_options.timeout | settings.timeout | Moved to settings |
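The experimental options follow the same pattern: enrich becomes an agentic table scope and the two image flags collapse into one list. A minimal sketch, assuming the hypothetical helper name migrate_experimental_options and covering only the rows in the table above:

```python
def migrate_experimental_options(experimental: dict) -> dict:
    """Map a v2 `experimental_options` dict onto v3 category dicts (hypothetical helper)."""
    enhance, settings = {}, {}

    # enrich.enabled=True with mode="table" -> enhance.agentic=[{"scope": "table"}]
    enrich = experimental.get("enrich") or {}
    if enrich.get("enabled") and enrich.get("mode") == "table":
        entry = {"scope": "table"}
        if enrich.get("prompt"):
            entry["prompt"] = enrich["prompt"]
        enhance["agentic"] = [entry]

    # The two boolean image flags become entries in a single list.
    images = []
    if experimental.get("return_figure_images"):
        images.append("figure")
    if experimental.get("return_table_images"):
        images.append("table")
    if images:
        settings["return_images"] = images

    if "embed_text_metadata_pdf" in experimental:
        settings["embed_pdf_metadata"] = experimental["embed_text_metadata_pdf"]
    if "timeout" in experimental:
        settings["timeout"] = experimental["timeout"]

    groups = {"enhance": enhance, "settings": settings}
    return {name: cfg for name, cfg in groups.items() if cfg}
```

When combining this with other migrated options, note that enhance.agentic and the settings dictionary may already hold entries from the basic and advanced options, so the lists and dictionaries need to be merged rather than overwritten.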
Extract Configuration
Legacy (v2) | 2025-10-14 (v3) | Notes |
---|---|---|
document_url | input | Renamed |
schema | instructions.schema | Nested in instructions |
system_prompt | instructions.system_prompt | Nested in instructions |
parse_config | parsing | Renamed (uses ParseOptions) |
include_images | settings.include_images | Moved to settings |
generate_citations | settings.citations.enabled | Nested in citations |
array_extract | settings.array_extract | Moved to settings |
options.numerical_confidence | settings.citations.numerical_confidence | Nested in citations |
latency_sensitive | settings.optimize_for_latency | Renamed |
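The extract mapping can be sketched the same way. The helper name migrate_extract_kwargs is hypothetical, and the contents of parse_config are passed through unchanged here on the assumption that they are migrated separately using the parse tables.

```python
def migrate_extract_kwargs(v2: dict) -> dict:
    """Rearrange v2 extract keyword arguments into the v3 layout (hypothetical helper)."""
    v3, instructions, settings, citations = {}, {}, {}, {}

    if "document_url" in v2:
        v3["input"] = v2["document_url"]

    # schema and system_prompt are nested under instructions.
    for key in ("schema", "system_prompt"):
        if key in v2:
            instructions[key] = v2[key]

    if "parse_config" in v2:
        # Assumption: the nested parse options are migrated separately.
        v3["parsing"] = v2["parse_config"]

    for key in ("include_images", "array_extract"):
        if key in v2:
            settings[key] = v2[key]
    if "latency_sensitive" in v2:
        settings["optimize_for_latency"] = v2["latency_sensitive"]

    # Both citation switches now live under settings.citations.
    if "generate_citations" in v2:
        citations["enabled"] = v2["generate_citations"]
    options = v2.get("options") or {}
    if "numerical_confidence" in options:
        citations["numerical_confidence"] = options["numerical_confidence"]

    if citations:
        settings["citations"] = citations
    if instructions:
        v3["instructions"] = instructions
    if settings:
        v3["settings"] = settings
    return v3
```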
The extract response format has changed significantly:
Legacy (v2)
{
  "result": [{"field1": "value1", "field2": "value2"}],
  "citations": [{"field1": [...], "field2": [...]}]
}
2025-10-14 (v3)
{
  "result": {
    "field1": {
      "value": "value1",
      "citations": [...]
    },
    "field2": {
      "value": "value2",
      "citations": [...]
    }
  }
}
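Code that previously read values from result[0] and citations from a parallel list needs to unwrap each field instead. The sketch below assumes a flat schema in which every top-level field has the value/citations shape shown above; the helper name split_v3_result is hypothetical, and nested or array schemas would need additional handling.

```python
def split_v3_result(result: dict) -> tuple:
    """Split a v3 extract result into plain values and per-field citations.

    Hypothetical helper; assumes every top-level field is a dict with a
    "value" key and an optional "citations" key.
    """
    values = {name: field["value"] for name, field in result.items()}
    citations = {name: field.get("citations", []) for name, field in result.items()}
    return values, citations
```

For a response dictionary named response, values, citations = split_v3_result(response["result"]) yields two dictionaries keyed by field name.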
Migration Examples
Example 1: Basic Parse with Agentic OCR
Legacy (v2)
result = client.parse.run(
    document_url=upload,
    options={
        "ocr_mode": "agentic"
    }
)
2025-10-14 (v3)
result = client.parse.run(
    input=upload,
    enhance={
        "agentic": [{"scope": "text"}]
    }
)
Example 2: Parse with Multiple Configurations
Legacy (v2)
result = client.parse.run(
    document_url=upload,
    options={
        "ocr_mode": "agentic",
        "chunking": {"chunk_mode": "variable"},
        "table_summary": {"enabled": True},
        "figure_summary": {"enabled": True, "enhanced": True}
    },
    advanced_options={
        "ocr_system": "multilingual",
        "table_output_format": "html",
        "page_range": {"start": 1, "end": 10},
        "enable_change_tracking": True
    },
    experimental_options={
        "enrich": {"enabled": True, "mode": "table"}
    }
)
2025-10-14 (v3)
result = client.parse.run(
    input=upload,
    enhance={
        "agentic": [
            {"scope": "text"},
            {"scope": "figure"},
            {"scope": "table"}
        ],
        "summarize_figures": True
    },
    retrieval={
        "chunking": {"chunk_mode": "variable"},
        "embedding_optimized": True
    },
    formatting={
        "table_output_format": "html",
        "include": ["change_tracking"]
    },
    settings={
        "ocr_system": "standard",
        "page_range": {"start": 1, "end": 10}
    }
)
Example 3: Extract with Schema and Citations
Legacy (v2)
result = client.extract.run(
    document_url=upload,
    schema=my_schema,
    system_prompt="Be precise and thorough.",
    generate_citations=True,
    array_extract=True,
    options={"numerical_confidence": True}
)
2025-10-14 (v3)
result = client.extract.run(
    input=upload,
    instructions={
        "schema": my_schema,
        "system_prompt": "Be precise and thorough."
    },
    settings={
        "array_extract": True,
        "citations": {
            "enabled": True,
            "numerical_confidence": True
        }
    }
)
Example 4: Spreadsheet Processing
Legacy (v2)
result = client.parse.run(
    document_url=upload,
    advanced_options={
        "large_table_chunking": {"enabled": True, "size": 100},
        "spreadsheet_table_clustering": "intelligent",
        "include_formula_information": True,
        "include_color_information": True,
        "exclude_hidden_sheets": True
    }
)
2025-10-14 (v3)
result = client.parse.run(
    input=upload,
    spreadsheet={
        "split_large_tables": {"enabled": True, "size": 100},
        "clustering": "accurate",
        "include": ["formula", "cell_colors"],
        "exclude": ["hidden_sheets"]
    }
)
Async Configuration
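The async configuration structure remains similar, but the async_config parameter is replaced by async, and the webhook is passed as a plain dictionary with an explicit mode instead of a WebhookConfig object: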
Legacy (v2)
from reducto.models import WebhookConfig

result = client.parse.run_async(
    document_url=upload,
    async_config={
        "webhook": WebhookConfig(url="https://example.com/webhook"),
        "priority": True
    }
)
2025-10-14 (v3)
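result = client.parse.run_async(
    input=upload,
    # `async` is a reserved word in Python, so it cannot be written as a plain
    # keyword argument. Dict unpacking keeps the documented name; check the SDK
    # reference in case the Python client spells this parameter differently.
    **{"async": {
        "webhook": {"mode": "direct", "url": "https://example.com/webhook"},
        "priority": True
    }}
)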
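An existing async_config value can be converted to the new shape with a small helper. This is a sketch only: the name migrate_async_config is hypothetical, the "direct" mode is taken from the example above, and the helper accepts either a dictionary or any object exposing a url attribute for the webhook.

```python
def migrate_async_config(async_config: dict) -> dict:
    """Convert a v2 async_config dict into the v3 async dict (hypothetical helper)."""
    migrated = dict(async_config)
    webhook = migrated.get("webhook")
    if webhook is not None:
        # Accept a plain dict or an object with a `url` attribute.
        url = webhook["url"] if isinstance(webhook, dict) else webhook.url
        # "direct" mirrors the v3 example above; other modes are not covered here.
        migrated["webhook"] = {"mode": "direct", "url": url}
    return migrated
```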
Breaking Changes Checklist
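When migrating your code, make sure to:
- Rename document_url to input in every parse and extract call
- Move keys from options, advanced_options, and experimental_options into enhance, retrieval, formatting, spreadsheet, and settings
- Update the ocr_system and spreadsheet clustering values that changed
- Convert boolean flags into the include, exclude, and return_images lists
- Remove options.extraction_mode, which is no longer configurable
- Nest the extract schema and system_prompt under instructions, and the citation options under settings.citations
- Update code that reads the extract response to use the per-field value and citations structure
- Replace async_config with async and pass the webhook as a dictionary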
Need Help?
If you encounter issues during migration:
- Check the API Reference for the 2025-10-14 version
- Review the configuration examples in the new version
- Contact support at support@reducto.ai