# Migration Guide: V2 to V3 Config

> Complete guide for migrating from the Legacy (V2) configuration format to the 2025-10-14 (V3) configuration format

<Warning>
  The V2 configuration format will eventually be deprecated. V2 API calls remain functional in the meantime, but you should migrate to V3 to get the latest features and improvements.
</Warning>

## Overview

The 2025-10-14 release introduces a restructured configuration format (v3) that provides better organization and clarity. The redesign is mainly a structural change: the underlying API calls should behave the same, but features added after this release will be released on the v3 format. This guide will help you migrate from the Legacy (v2) configuration format to the new format.

### Convert your V2 config to V3

<div style={{ margin: "1.5em 0", border: "1px solid #e5e7eb", borderRadius: "8px", overflow: "hidden" }}>
  <iframe
    src="https://reductoai-dev--v2-to-v3-converter-run.modal.run/"
    style={{
  width: "100%",
  minHeight: "600px",
  border: "none"
}}
    title="Reducto V2 to V3 Config Converter"
    allow="clipboard-read; clipboard-write"
  />
</div>

## Key Changes

### 1. Input Parameter

The `document_url` parameter has been renamed to `input` for clarity:

**Legacy (v2)**

```python theme={null}
client.parse.run(document_url="https://example.com/doc.pdf")
```

**2025-10-14 (v3)**

```python theme={null}
client.parse.run(input="https://example.com/doc.pdf")
```

### 2. Configuration Structure Reorganization

The configuration options have been reorganized into more logical groupings:

* `enhance`: AI-powered enhancements (agentic modes, figure summarization)
* `retrieval`: RAG-focused settings (chunking, filtering, embedding optimization)
* `formatting`: Output format controls (tables, page markers, markup)
* `spreadsheet`: Spreadsheet-specific settings
* `settings`: General settings (OCR system, timeouts, passwords)

## Complete Mapping Reference

### Parse Configuration

#### Basic Options → Multiple Categories

| Legacy (v2)                            | 2025-10-14 (v3)                                          | Notes                       |
| -------------------------------------- | -------------------------------------------------------- | --------------------------- |
| `document_url`                         | `input`                                                  | Renamed for clarity         |
| `options.ocr_mode="agentic"`           | `enhance.agentic=[{"scope": "text"}]`                    | Agentic text mode           |
| `options.extraction_mode`              | *Removed*                                                | No longer configurable      |
| `options.chunking`                     | `retrieval.chunking`                                     | Moved to retrieval category |
| `options.table_summary.enabled`        | `retrieval.embedding_optimized`                          | Simplified to boolean       |
| `options.figure_summary.enabled=True`  | `enhance.summarize_figures=True`                         | Moved to enhance            |
| `options.figure_summary.enhanced=True` | `enhance.agentic=[{"scope": "figure"}]`                  | Now uses agentic            |
| `options.figure_summary.prompt`        | `enhance.agentic=[{"scope": "figure", "prompt": "..."}]` | Custom prompting            |
| `options.filter_blocks`                | `retrieval.filter_blocks`                                | Moved to retrieval          |
| `options.force_url_result`             | `settings.force_url_result`                              | Moved to settings           |
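
The rows above can be applied mechanically. The following is a minimal sketch of that translation, not part of the Reducto SDK: `convert_basic_options` is a hypothetical helper that takes a legacy `options` dict and returns the v3 groups it maps to. It assumes that a custom `figure_summary.prompt` implies an agentic figure entry even when `enhanced` is not set.

```python
from typing import Any


def convert_basic_options(options: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Translate a legacy `options` dict into v3 groups (hypothetical helper)."""
    enhance: dict[str, Any] = {}
    retrieval: dict[str, Any] = {}
    settings: dict[str, Any] = {}
    agentic: list[dict[str, Any]] = []

    if options.get("ocr_mode") == "agentic":
        agentic.append({"scope": "text"})

    # `extraction_mode` has no v3 equivalent, so it is intentionally dropped.

    # Keys that keep their name and move into `retrieval`.
    for key in ("chunking", "filter_blocks"):
        if key in options:
            retrieval[key] = options[key]

    table_summary = options.get("table_summary") or {}
    if "enabled" in table_summary:
        retrieval["embedding_optimized"] = bool(table_summary["enabled"])

    figure_summary = options.get("figure_summary") or {}
    if figure_summary.get("enabled"):
        enhance["summarize_figures"] = True
    # Assumption: a prompt alone is enough to request an agentic figure pass.
    if figure_summary.get("enhanced") or figure_summary.get("prompt"):
        entry: dict[str, Any] = {"scope": "figure"}
        if figure_summary.get("prompt"):
            entry["prompt"] = figure_summary["prompt"]
        agentic.append(entry)

    if "force_url_result" in options:
        settings["force_url_result"] = options["force_url_result"]

    if agentic:
        enhance["agentic"] = agentic

    # Return only the groups that actually received a value.
    groups = {"enhance": enhance, "retrieval": retrieval, "settings": settings}
    return {name: group for name, group in groups.items() if group}
```

The returned dict is keyed by v3 group name, so it can be merged with the output of similar helpers for the other legacy option groups before building the request.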

#### Advanced Options → Multiple Categories

| Legacy (v2)                                                   | 2025-10-14 (v3)                                      | Notes                |
| ------------------------------------------------------------- | ---------------------------------------------------- | -------------------- |
| `advanced_options.ocr_system="highres"`                       | `settings.ocr_system="standard"`                     | Values changed       |
| `advanced_options.ocr_system="multilingual"`                  | `settings.ocr_system="standard"`                     | Now uses standard    |
| `advanced_options.ocr_system="legacy"`                        | `settings.ocr_system="legacy"`                       | Same                 |
| `advanced_options.table_output_format`                        | `formatting.table_output_format`                     | Moved to formatting  |
| `advanced_options.merge_tables`                               | `formatting.merge_tables`                            | Moved to formatting  |
| `advanced_options.keep_line_breaks`                           | `settings.alpha.keep_line_breaks`                    | Moved to alpha       |
| `advanced_options.add_page_markers`                           | `formatting.add_page_markers`                        | Moved to formatting  |
| `advanced_options.page_range`                                 | `settings.page_range`                                | Moved to settings    |
| `advanced_options.document_password`                          | `settings.document_password`                         | Moved to settings    |
| `advanced_options.read_comments=True`                         | `formatting.include=["comments"]`                    | Now in list          |
| `advanced_options.enable_change_tracking=True`                | `formatting.include=["change_tracking"]`             | Now in list          |
| `advanced_options.enable_highlight_detection=True`            | `formatting.include=["highlight"]`                   | Now in list          |
| `advanced_options.persist_results`                            | `settings.persist_results`                           | Moved to settings    |
| `advanced_options.return_ocr_data`                            | `settings.return_ocr_data`                           | Moved to settings    |
| `advanced_options.large_table_chunking`                       | `spreadsheet.split_large_tables`                     | Moved to spreadsheet |
| `advanced_options.spreadsheet_table_clustering="default"`     | `spreadsheet.clustering="fast"`                      | Values changed       |
| `advanced_options.spreadsheet_table_clustering="intelligent"` | `spreadsheet.clustering="accurate"`                  | Values changed       |
| `advanced_options.spreadsheet_table_clustering="disabled"`    | `spreadsheet.clustering="disabled"`                  | Same                 |
| `advanced_options.include_formula_information=True`           | `spreadsheet.include=["formula"]`                    | Now in list          |
| `advanced_options.include_color_information=True`             | `spreadsheet.include=["cell_colors"]`                | Now in list          |
| `advanced_options.exclude_hidden_sheets=True`                 | `spreadsheet.exclude=["hidden_sheets"]`              | Now in list          |
| `advanced_options.exclude_hidden_rows_cols=True`              | `spreadsheet.exclude=["hidden_rows", "hidden_cols"]` | Now in list          |
| `advanced_options.force_file_extension`                       | `settings.force_file_extension`                      | Moved to settings    |
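
Most rows in this table fall into a few patterns: keys that only change group, boolean flags that become list entries, and enum values that are renamed. The sketch below encodes those patterns as lookup tables. `convert_advanced_options` and the constant names are hypothetical and not part of the Reducto SDK; the mappings themselves are copied from the table above.

```python
from typing import Any

# Keys that keep their name and only change group.
MOVED = {
    "table_output_format": "formatting",
    "merge_tables": "formatting",
    "add_page_markers": "formatting",
    "page_range": "settings",
    "document_password": "settings",
    "persist_results": "settings",
    "return_ocr_data": "settings",
    "force_file_extension": "settings",
}
# Boolean flags that become list entries: flag -> (group, list name, entries).
FLAG_TO_LIST = {
    "read_comments": ("formatting", "include", ["comments"]),
    "enable_change_tracking": ("formatting", "include", ["change_tracking"]),
    "enable_highlight_detection": ("formatting", "include", ["highlight"]),
    "include_formula_information": ("spreadsheet", "include", ["formula"]),
    "include_color_information": ("spreadsheet", "include", ["cell_colors"]),
    "exclude_hidden_sheets": ("spreadsheet", "exclude", ["hidden_sheets"]),
    "exclude_hidden_rows_cols": ("spreadsheet", "exclude", ["hidden_rows", "hidden_cols"]),
}
# Renamed enum values.
OCR_SYSTEM = {"highres": "standard", "multilingual": "standard", "legacy": "legacy"}
CLUSTERING = {"default": "fast", "intelligent": "accurate", "disabled": "disabled"}


def convert_advanced_options(advanced: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Translate legacy `advanced_options` into v3 groups (hypothetical helper)."""
    out: dict[str, dict[str, Any]] = {}

    def group(name: str) -> dict[str, Any]:
        # Create the v3 group lazily so unused groups never appear in the output.
        return out.setdefault(name, {})

    for key, value in advanced.items():
        if key in MOVED:
            group(MOVED[key])[key] = value
        elif key in FLAG_TO_LIST:
            if value:  # only enabled flags become list entries
                target, list_name, entries = FLAG_TO_LIST[key]
                group(target).setdefault(list_name, []).extend(entries)
        elif key == "ocr_system":
            group("settings")["ocr_system"] = OCR_SYSTEM[value]
        elif key == "spreadsheet_table_clustering":
            group("spreadsheet")["clustering"] = CLUSTERING[value]
        elif key == "large_table_chunking":
            group("spreadsheet")["split_large_tables"] = value
        elif key == "keep_line_breaks":
            group("settings").setdefault("alpha", {})["keep_line_breaks"] = value
        else:
            raise KeyError(f"no v3 mapping listed for advanced option {key!r}")
    return out
```

Raising on unknown keys or values is a deliberate choice in this sketch: a silent pass-through could hide an option that the table does not cover.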

#### Experimental Options → Multiple Categories

| Legacy (v2)                                              | 2025-10-14 (v3)                                         | Notes             |
| -------------------------------------------------------- | ------------------------------------------------------- | ----------------- |
| `experimental_options.enrich.enabled=True, mode="table"` | `enhance.agentic=[{"scope": "table"}]`                  | Now uses agentic  |
| `experimental_options.enrich.prompt`                     | `enhance.agentic=[{"scope": "table", "prompt": "..."}]` | Custom prompting  |
| `experimental_options.return_figure_images=True`         | `settings.return_images=["figure"]`                     | Now in list       |
| `experimental_options.return_table_images=True`          | `settings.return_images=["table"]`                      | Now in list       |
| `experimental_options.embed_text_metadata_pdf`           | `settings.embed_pdf_metadata`                           | Renamed           |
| `experimental_options.timeout`                           | `settings.timeout`                                      | Moved to settings |
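
The experimental options can be translated the same way. This is a sketch under two assumptions that the table does not state explicitly: `enrich.prompt` is only carried over when enrichment is enabled in table mode, and the two image flags combine into a single `return_images` list. `convert_experimental_options` is a hypothetical helper name, not an SDK function.

```python
from typing import Any


def convert_experimental_options(experimental: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Translate legacy `experimental_options` into v3 groups (hypothetical helper)."""
    enhance: dict[str, Any] = {}
    settings: dict[str, Any] = {}

    enrich = experimental.get("enrich") or {}
    if enrich.get("enabled") and enrich.get("mode") == "table":
        entry: dict[str, Any] = {"scope": "table"}
        if enrich.get("prompt"):
            entry["prompt"] = enrich["prompt"]
        enhance["agentic"] = [entry]

    # Assumption: both image flags feed one combined list.
    images = []
    if experimental.get("return_figure_images"):
        images.append("figure")
    if experimental.get("return_table_images"):
        images.append("table")
    if images:
        settings["return_images"] = images

    if "embed_text_metadata_pdf" in experimental:
        settings["embed_pdf_metadata"] = experimental["embed_text_metadata_pdf"]
    if "timeout" in experimental:
        settings["timeout"] = experimental["timeout"]

    # Return only the groups that actually received a value.
    groups = {"enhance": enhance, "settings": settings}
    return {name: group for name, group in groups.items() if group}
```

If the same request also sets `ocr_mode="agentic"` or an enhanced figure summary, the resulting `enhance.agentic` lists need to be concatenated rather than overwritten when the groups are merged.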

### Extract Configuration

| Legacy (v2)                    | 2025-10-14 (v3)                           | Notes                       |
| ------------------------------ | ----------------------------------------- | --------------------------- |
| `document_url`                 | `input`                                   | Renamed                     |
| `schema`                       | `instructions.schema`                     | Nested in instructions      |
| `system_prompt`                | `instructions.system_prompt`              | Nested in instructions      |
| `parse_config`                 | `parsing`                                 | Renamed (uses ParseOptions) |
| `include_images`               | `settings.include_images`                 | Moved to settings           |
| `generate_citations`           | `settings.citations.enabled`              | Nested in citations         |
| `array_extract`                | `settings.array_extract`                  | Moved to settings           |
| `options.numerical_confidence` | `settings.citations.numerical_confidence` | Nested in citations         |
| `latency_sensitive`            | `settings.optimize_for_latency`           | Renamed                     |
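
As an illustration of the extract mappings, the sketch below rewrites a dict of legacy extract arguments into the v3 layout. `convert_extract_kwargs` is a hypothetical helper, not part of the SDK. It passes `parse_config` through unchanged under `parsing`, so the contents of that object would still need the parse mappings applied separately.

```python
from typing import Any


def convert_extract_kwargs(legacy: dict[str, Any]) -> dict[str, Any]:
    """Rewrite legacy extract arguments into the v3 layout (hypothetical helper)."""
    v3: dict[str, Any] = {}
    instructions: dict[str, Any] = {}
    settings: dict[str, Any] = {}
    citations: dict[str, Any] = {}

    if "document_url" in legacy:
        v3["input"] = legacy["document_url"]

    # `schema` and `system_prompt` are nested under `instructions`.
    for key in ("schema", "system_prompt"):
        if key in legacy:
            instructions[key] = legacy[key]

    if "parse_config" in legacy:
        # Renamed only; the nested parse options are not converted here.
        v3["parsing"] = legacy["parse_config"]

    # Keys that keep their name and move into `settings`.
    for key in ("include_images", "array_extract"):
        if key in legacy:
            settings[key] = legacy[key]
    if "latency_sensitive" in legacy:
        settings["optimize_for_latency"] = legacy["latency_sensitive"]

    # Citation-related flags are grouped under `settings.citations`.
    if "generate_citations" in legacy:
        citations["enabled"] = legacy["generate_citations"]
    options = legacy.get("options") or {}
    if "numerical_confidence" in options:
        citations["numerical_confidence"] = options["numerical_confidence"]

    if citations:
        settings["citations"] = citations
    if instructions:
        v3["instructions"] = instructions
    if settings:
        v3["settings"] = settings
    return v3
```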

### Extract Response Format

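The extract response format has changed significantly. In v3, `result` is a single object keyed by field name, and each field carries its own `value` and `citations` instead of relying on a separate top-level `citations` array: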

**Legacy (v2)**

```json theme={null}
{
  "result": [{"field1": "value1", "field2": "value2"}],
  "citations": [{"field1": [...], "field2": [...]}]
}
```

**2025-10-14 (v3)**

```json theme={null}
{
  "result": {
    "field1": {
      "value": "value1",
      "citations": [...]
    },
    "field2": {
      "value": "value2",
      "citations": [...]
    }
  }
}
```
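
Code that consumed the v2 shape can be bridged with a small adapter while it is being updated. The following is a sketch only: `split_v3_result` and `to_v2_shape` are hypothetical names, and the code assumes a flat schema in which every top-level field is a `{"value": ..., "citations": ...}` object as in the example above. Nested objects or arrays may need additional handling.

```python
from typing import Any


def split_v3_result(result: dict[str, dict[str, Any]]) -> tuple[dict[str, Any], dict[str, Any]]:
    """Separate a v3 `result` object into plain values and per-field citations."""
    values = {field: entry["value"] for field, entry in result.items()}
    # Fields without citations get an empty list so both dicts share the same keys.
    citations = {field: entry.get("citations", []) for field, entry in result.items()}
    return values, citations


def to_v2_shape(v3_response: dict[str, Any]) -> dict[str, Any]:
    """Rebuild the legacy response layout from a v3 response (flat schemas only)."""
    values, citations = split_v3_result(v3_response["result"])
    return {"result": [values], "citations": [citations]}


# Made-up sample response, shaped like the v3 example above.
sample = {
    "result": {
        "field1": {"value": "value1", "citations": []},
        "field2": {"value": "value2", "citations": []},
    }
}
values, _ = split_v3_result(sample["result"])
print(values["field1"])  # value1
```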

## Migration Examples

### Example 1: Basic Parse with Agentic OCR

**Legacy (v2)**

```python theme={null}
result = client.parse.run(
    document_url=upload,
    options={
        "ocr_mode": "agentic"
    }
)
```

**2025-10-14 (v3)**

```python theme={null}
result = client.parse.run(
    input=upload,
    enhance={
        "agentic": [{"scope": "text"}]
    }
)
```

### Example 2: Parse with Multiple Configurations

**Legacy (v2)**

```python theme={null}
result = client.parse.run(
    document_url=upload,
    options={
        "ocr_mode": "agentic",
        "chunking": {"chunk_mode": "variable"},
        "table_summary": {"enabled": True},
        "figure_summary": {"enabled": True, "enhanced": True}
    },
    advanced_options={
        "ocr_system": "multilingual",
        "table_output_format": "html",
        "page_range": {"start": 1, "end": 10},
        "enable_change_tracking": True
    },
    experimental_options={
        "enrich": {"enabled": True, "mode": "table"}
    }
)
```

**2025-10-14 (v3)**

```python theme={null}
result = client.parse.run(
    input=upload,
    enhance={
        "agentic": [
            {"scope": "text"},
            {"scope": "figure"},
            {"scope": "table"}
        ],
        "summarize_figures": True
    },
    retrieval={
        "chunking": {"chunk_mode": "variable"},
        "embedding_optimized": True
    },
    formatting={
        "table_output_format": "html",
        "include": ["change_tracking"]
    },
    settings={
        "ocr_system": "standard",
        "page_range": {"start": 1, "end": 10}
    }
)
```

### Example 3: Extract Configuration

**Legacy (v2)**

```python theme={null}
result = client.extract.run(
    document_url=upload,
    schema=my_schema,
    system_prompt="Be precise and thorough.",
    generate_citations=True,
    array_extract=True,
    options={"numerical_confidence": True}
)
```

**2025-10-14 (v3)**

```python theme={null}
result = client.extract.run(
    input=upload,
    instructions={
        "schema": my_schema,
        "system_prompt": "Be precise and thorough."
    },
    settings={
        "array_extract": True,
        "citations": {
            "enabled": True,
            "numerical_confidence": True
        }
    }
)
```

### Example 4: Spreadsheet Processing

**Legacy (v2)**

```python theme={null}
result = client.parse.run(
    document_url=upload,
    advanced_options={
        "large_table_chunking": {"enabled": True, "size": 100},
        "spreadsheet_table_clustering": "intelligent",
        "include_formula_information": True,
        "include_color_information": True,
        "exclude_hidden_sheets": True
    }
)
```

**2025-10-14 (v3)**

```python theme={null}
result = client.parse.run(
    input=upload,
    spreadsheet={
        "split_large_tables": {"enabled": True, "size": 100},
        "clustering": "accurate",
        "include": ["formula", "cell_colors"],
        "exclude": ["hidden_sheets"]
    }
)
```

## Async Configuration

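The async configuration keeps a similar structure, but the parameter is renamed from `async_config` to `async`, and the webhook is passed as a plain object with an explicit `mode`. Note that `async` is a reserved keyword in Python 3.7+, so it cannot be written as a bare keyword argument there; check the SDK reference for the exact spelling your client version expects.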

**Legacy (v2)**

```python theme={null}
from reducto.models import WebhookConfig

result = client.parse.run_async(
    document_url=upload,
    async_config={
        "webhook": WebhookConfig(url="https://example.com/webhook"),
        "priority": True
    }
)
```

**2025-10-14 (v3)**

```python theme={null}
result = client.parse.run_async(
    input=upload,
    async={
        "webhook": {"mode": "direct", "url": "https://example.com/webhook"},
        "priority": True
    }
)
```

## Breaking Changes Checklist

When migrating your code, make sure to:

* [ ] Replace all `document_url` with `input`
* [ ] Move `ocr_mode="agentic"` to `enhance.agentic=[{"scope": "text"}]`
* [ ] Update `ocr_system` values (highres/multilingual → standard)
* [ ] Replace `table_summary.enabled` with `retrieval.embedding_optimized`
* [ ] Move figure/table enhancements to `enhance.agentic`
* [ ] Convert boolean flags to list entries where applicable (e.g., `enable_change_tracking` → `formatting.include=["change_tracking"]`)
* [ ] Update spreadsheet clustering values (default → fast, intelligent → accurate)
* [ ] Restructure extract response handling to use nested value/citations format
* [ ] Move extract `schema` and `system_prompt` into `instructions` object
* [ ] Update citation handling in extract to use the new nested format
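
Part of this checklist can be automated by scanning the arguments of each request for names that only exist in the legacy format. The sketch below is one way to do that, assuming request arguments are available as a plain dict. `find_legacy_keys` and `LEGACY_TOP_LEVEL` are hypothetical names, and the set only covers the top-level legacy names that appear in this guide.

```python
from typing import Any

# Top-level argument names that appear only in the legacy (v2) format in this guide.
LEGACY_TOP_LEVEL = {
    "document_url",
    "options",
    "advanced_options",
    "experimental_options",
    "async_config",
    "parse_config",
    "schema",
    "system_prompt",
    "generate_citations",
    "array_extract",
    "include_images",
    "latency_sensitive",
}


def find_legacy_keys(request_kwargs: dict[str, Any]) -> list[str]:
    """Return the legacy top-level argument names present in a request, sorted."""
    return sorted(key for key in request_kwargs if key in LEGACY_TOP_LEVEL)


legacy_call = {"document_url": "https://example.com/doc.pdf", "options": {"ocr_mode": "agentic"}}
print(find_legacy_keys(legacy_call))  # ['document_url', 'options']
```

Only top-level names are checked, because names such as `schema` and `array_extract` are still valid in v3 once nested under `instructions` or `settings`.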

## Need Help?

If you encounter issues during migration:

1. Check the [API Reference](https://docs.reducto.ai/api-reference) for the 2025-10-14 version
2. Review the [configuration examples](https://docs.reducto.ai/parsing/default-configurations) in the new version
3. Contact support at [support@reducto.ai](mailto:support@reducto.ai)
