Overview
Enrich is a configuration that performs post-parse enrichment to improve document processing accuracy. It’s enabled through the experimental_options.enrich configuration.
Modes
There are three modes: standard, page, and table. The mode sets the scope of the enrichment.
standard
- Use when: Rarely. We don’t generally recommend the standard mode, as it’s a legacy option.
- What it does: Tweaks reading order.
page
- Use when: The visual layout of the output looks bad and you do not need exact bounding boxes, or any other scenario where directly interpreting the page image would improve the reading order and grouping.
- What it does: Re-reads the page with a vision-language model to produce cleaner text flow.
table
- Use when: Financial statements, regulatory filings, scientific tables, or anything with multi-row headers, merged cells, or misaligned columns.
- What it does: Reconstructs table structure to preserve row/column associations. Similar to enabling the html and ai_json modes in advanced_options.table_output_format.
- Prompting tips: For the table mode, you can specify structural requirements such as aligning rows by company name, or ensuring that all numerical values remain consistently aligned within a single row.
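As a rough illustration of how the mode and a table prompt might be expressed in a request body, here is a minimal Python sketch. Only the experimental_options.enrich path and the three mode names come from this page; the inner keys enabled, mode, and prompt, and the helper build_enrich_options, are assumptions made for illustration and should be checked against the API reference.

```python
import json

# Mode names as listed on this page.
VALID_MODES = {"standard", "page", "table"}


def build_enrich_options(mode, prompt=None):
    """Build a request fragment that turns on enrich for one mode.

    Assumption: the inner keys "enabled", "mode", and "prompt" are
    hypothetical; this page only confirms experimental_options.enrich.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"unknown enrich mode: {mode!r}")

    enrich = {"enabled": True, "mode": mode}
    if prompt is not None:
        # Free-text structural requirements, e.g. for the table mode.
        enrich["prompt"] = prompt

    return {"experimental_options": {"enrich": enrich}}


if __name__ == "__main__":
    options = build_enrich_options(
        "table",
        prompt="Align rows by company name and keep all numerical "
               "values for a company within a single row.",
    )
    print(json.dumps(options, indent=2))
```

Validating the mode locally keeps a typo such as "tables" from reaching the service, and leaving the prompt out when it is not supplied keeps the fragment minimal for the page and standard modes.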