split_description
The split_description array defines what sections to look for. Each entry has three fields: name, description, and partition_key.
Writing Effective Descriptions
The description is passed to an LLM that classifies each page. Vague descriptions lead to ambiguous classifications. Useful details to include:
- Content type (tables, narrative text, forms)
- Position (beginning, end, after section X)
- Visual elements (headers, logos, signature lines)
- What it does NOT include (to avoid confusion with similar sections)
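As a rough illustration of the checklist above, the sketch below builds two split_description entries in Python. The section names and description wording are hypothetical and only show how content type, position, visual elements, and exclusions can be combined in one description.

```python
# Hypothetical section definitions; names and wording are illustrative only.
split_description = [
    {
        "name": "Account Holdings",
        "description": (
            "Tables listing positions, quantities, and market values. "  # content type
            "Appears after the account summary page. "                   # position
            "Each table sits under a 'Holdings' header. "                # visual element
            "Does NOT include transaction history tables."               # exclusion
        ),
    },
    {
        "name": "Transaction History",
        "description": (
            "Dated rows of buys, sells, and transfers in table form. "
            "Usually follows the holdings tables. "
            "Does NOT include the holdings tables themselves."
        ),
    },
]

# Quick sanity check: every entry names a section and describes it.
for entry in split_description:
    assert entry["name"] and entry["description"]
```

Stating what each section excludes helps when two sections look alike, as with the two table-based sections here.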
partition_key
Partition key handles a common scenario: the same section type repeating for different entities. A consolidated statement has holdings for multiple accounts. A medical record packet has intake forms for multiple patients. Without partition key, Split returns all matching pages as one group. You’d then need to figure out where one entity ends and the next begins. Partition key does this automatically.
The name in each partition is the actual value extracted from the document. If the document shows “Account #1234-5678” on pages 1-3 and “Account #8765-4321” on pages 7-11, those become your partition names.
The partition key is semantic, not literal. If you set partition_key to “account number” but the document says “Acct #1234” or “Portfolio ID: 5678”, Split will still find it. Describe what the identifier represents, not the exact text format.
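The following sketch shows one way to work with partitions in Python. The section entry and the response fragment are hand-written from the field names used in this reference (name, pages, conf, partitions). They are not captured from a real API call, and pages_by_partition is a hypothetical helper.

```python
# Hypothetical section entry: the partition_key describes what the
# identifier represents, not its exact text format.
section = {
    "name": "Account Holdings",
    "description": "Tables of positions and market values for one account.",
    "partition_key": "account number",
}

# Hand-written sample mirroring the fields described in this guide;
# not a real API response.
split = {
    "name": "Account Holdings",
    "pages": [1, 2, 3, 7, 8, 9, 10, 11],
    "conf": "high",
    "partitions": [
        {"name": "Account #1234-5678", "pages": [1, 2, 3], "conf": "high"},
        {"name": "Account #8765-4321", "pages": [7, 8, 9, 10, 11], "conf": "high"},
    ],
}


def pages_by_partition(split):
    """Map each extracted identifier value to its list of pages."""
    return {p["name"]: p["pages"] for p in split.get("partitions") or []}


holdings_by_account = pages_by_partition(split)
```

Keying the result by the extracted value makes it straightforward to route each account's pages to separate downstream processing.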
When partition_key values appear in tables
By default, Split truncates table content to speed up processing. If your partition key values appear deep within tables (not in headers or the first few rows), the truncation might hide them. Set table_cutoff to "preserve" to keep full table content.
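A minimal sketch of that option as a Python dict. It assumes table_cutoff is nested under the settings object, as the parameter headings in this reference suggest. The variable name split_options is hypothetical.

```python
# Assumed nesting: table_cutoff lives under settings.
split_options = {
    "settings": {
        # Keep full table content so identifiers deep in a table stay visible.
        "table_cutoff": "preserve",
    },
}
```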
split_rules
Controls how pages are assigned to sections. The default rule:
settings
table_cutoff
Controls how table content is processed during section detection.
parsing
Split runs Parse internally before classifying sections. The parsing parameter accepts all Parse configuration options.
When you pass jobid:// as input (reusing a previous Parse result), the parsing options are ignored because the document was already parsed.
Response Structure
splits: One entry for each section defined in split_description.
splits[].name: The name you provided.
splits[].pages: Page numbers where this section appears (1-indexed).
splits[].conf: "high" or "low" indicating classification confidence.
splits[].partitions: When using partition_key, sub-sections grouped by extracted identifier values. Each partition has its own name (the extracted value), pages, and conf.
section_mapping: Legacy format mapping section names to page arrays. Use splits for new code.
A section not found in the document still appears in results with an empty pages array. Always check that pages has content before processing.
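The empty-pages check can be sketched as follows, assuming the response has been loaded into a Python dict whose splits key holds the array described above. The sample data is hand-written for illustration and usable_sections is a hypothetical helper.

```python
def usable_sections(result):
    """Yield (name, pages, conf) for sections that were actually found."""
    for split in result.get("splits", []):
        pages = split.get("pages") or []
        if not pages:
            continue  # section was defined but not found in the document
        yield split["name"], pages, split.get("conf")


# Hand-written sample; not a real API response.
sample = {
    "splits": [
        {"name": "Cover Letter", "pages": [1], "conf": "high"},
        {"name": "Appendix", "pages": [], "conf": "low"},
        {"name": "Financials", "pages": [2, 3, 4], "conf": "low"},
    ]
}

found = list(usable_sections(sample))

# Sections classified with "low" confidence may deserve a manual look.
needs_review = [name for name, _, conf in found if conf == "low"]
```

Here the Appendix entry is skipped because its pages array is empty, and Financials is flagged for review because its conf is "low".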