What is Deep Split?
Deep Split is an agentic splitting mode that iteratively refines its output to achieve near-perfect accuracy. Unlike standard split, which classifies each page in a single pass, Deep Split runs an agentic loop that verifies and corrects its section assignments against the source document until a quality threshold is met.
This is especially useful for complex documents where a single split pass may mislabel pages, miss boundaries between similar sections, or partition repeating sections inconsistently. Deep Split catches these issues by checking its own work and re-classifying until the results are accurate.
When to Use It
Deep Split is designed for splitting tasks where accuracy is critical and the cost of errors is high. Common use cases include:
- Consolidated financial statements, where holdings, transactions, and summaries repeat across many accounts and must be partitioned correctly.
- Insurance claim packets, where sections like medical records, billing statements, and adjuster notes are visually similar and easy to confuse.
- Loan and mortgage files, where dozens of disclosures, appraisals, and supporting documents are interleaved and ordering matters for downstream processing.
- Multi-patient medical record bundles, where intake forms, lab reports, and discharge summaries repeat per patient and must group cleanly by partition.
- Long legal binders, where exhibits, contracts, and addenda span hundreds of pages and section boundaries are ambiguous.
How to Use It
Enable Deep Split by setting deep_split to true inside the settings object.
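As a minimal sketch of what such a request body might look like, the Python snippet below builds the payload as a plain dict and prints it as JSON. Only the placement of deep_split inside settings comes from this page; the input field name, the name and description keys of each section entry, the helper build_split_request, and the document URL are assumptions made for illustration, so check the Split Overview for the actual request shape.

```python
import json


def build_split_request(document_ref, sections, deep_split=True):
    """Assemble a Split request body as a plain dict; nothing is sent over the network."""
    return {
        "input": document_ref,          # assumed field name for the document reference
        "split_description": sections,  # assumed to be a list of section entries
        "settings": {"deep_split": deep_split},  # from the text: deep_split lives inside settings
    }


payload = build_split_request(
    "https://example.com/statement.pdf",  # hypothetical document URL
    [{"name": "Holdings", "description": "Holdings table for a single account."}],
)
print(json.dumps(payload, indent=2))
```

Keeping the payload construction separate from the HTTP call makes it easy to toggle deep_split per document, for example enabling it only for files where a standard split has proven unreliable.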
Best Practices
Write specific, distinguishing section descriptions
The agentic loop relies on your split_description entries to verify whether each page is in the right section. Vague descriptions give the agent nothing concrete to check, so include content, position, and visual cues that distinguish each section from its neighbors.
- Repeating sections: “Holdings table for a single account. Each block starts with an account number header and ends before the next account header.”
- Visually similar sections: “Lab report. Contains the laboratory name in the header and a results table with reference ranges. Distinct from imaging reports, which contain narrative findings instead of tables.”
- Boundary-sensitive sections: “Signature page. Always the last page of the contract block, immediately before any exhibits.”
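The three descriptions above could be written as split_description entries along the following lines. This is a sketch under assumptions: the name and description keys and the section names are hypothetical, and check_sections is an illustrative local helper, not part of any Reducto SDK. It only catches mechanical problems (duplicate names, empty descriptions) before a request is sent; it cannot judge whether a description is specific enough.

```python
# Section names are hypothetical; the description text is taken from the examples above.
sections = [
    {
        "name": "Account Holdings",
        "description": "Holdings table for a single account. Each block starts with an "
                       "account number header and ends before the next account header.",
    },
    {
        "name": "Lab Report",
        "description": "Lab report. Contains the laboratory name in the header and a "
                       "results table with reference ranges. Distinct from imaging "
                       "reports, which contain narrative findings instead of tables.",
    },
    {
        "name": "Signature Page",
        "description": "Signature page. Always the last page of the contract block, "
                       "immediately before any exhibits.",
    },
]


def check_sections(entries):
    """Return a list of problems: duplicate section names or empty descriptions."""
    problems = []
    seen = set()
    for entry in entries:
        name = entry["name"]
        if name in seen:
            problems.append(f"duplicate section name: {name}")
        seen.add(name)
        if not entry["description"].strip():
            problems.append(f"empty description: {name}")
    return problems


print(check_sections(sections))
```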
Use with partition_key for repeating sections
When the same section repeats for different entities (multiple accounts, multiple patients, multiple companies), pair Deep Split with partition_key so the agent verifies both the section assignment and the partition value extracted from the page.
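A hedged sketch of that pairing in Python is shown below. It assumes partition_key is set on the individual section entry and takes a short label for the value to group by; both the placement and the "account number" value are assumptions for illustration, as are the name and description keys. See Split Configuration for the authoritative shape.

```python
import json

# Hypothetical section entry for a repeating per-account section.
sections = [
    {
        "name": "Account Holdings",
        "description": "Holdings table for a single account. Each block starts with an "
                       "account number header and ends before the next account header.",
        "partition_key": "account number",  # assumed: label of the value that identifies each repeat
    },
]

payload = {
    "split_description": sections,
    "settings": {"deep_split": True},  # from the text: deep_split lives inside settings
}
print(json.dumps(payload, indent=2))
```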
Pair with Parse configuration
Deep Split can only verify what Parse sees. If the underlying parse output is missing data (for example, a table is not detected or a header is misread), Deep Split will not be able to recover the missing signal. Consider enabling agentic mode for tables, or using a higher-fidelity OCR mode when section identifiers live deep inside tables or in low-quality scans.
Related
Split Overview
Endpoint basics and parameters.
Split Configuration
split_description, partition_key, and table_cutoff.
Deep Extract
The same agentic loop pattern for schema-based extraction.