Overview

Enrich is a post-parse enrichment step that improves document processing accuracy. You enable it through the experimental_options.enrich configuration.

Modes

There are three modes: standard, page, and table. The mode is the scope of the enrichment.

standard
  • Use when: Rarely. This is a legacy option, and we don’t generally recommend it.
  • What it does: Tweaks reading order.
page
  • Use when: The visual layout comes out poorly and you do not need exact bounding boxes. It also fits any scenario where directly interpreting the page image would improve the reading order and grouping.
  • What it does: Re-reads the page with a vision-language model to produce cleaner text flow.
table
  • Use when: Financial statements, regulatory filings, scientific tables, or anything with multi-row headers, merged cells, or misaligned columns.
  • What it does: Reconstructs table structure to preserve row/column associations. Similar to enabling html and ai_json mode in advanced_options.table_output_format.
  • Prompting tips: For the table mode, you can specify structural requirements such as aligning rows by company name, or ensuring that all numerical values remain consistently aligned within a single row.

Enabling enrich

"experimental_options": {
    "enrich": {
        "enabled": True,
        "mode": "table",
        "prompt": "Align rows by company name"
    },
}
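
The fragment above can be assembled programmatically. The following is a minimal sketch, not part of the API: build_enrich_options and VALID_MODES are hypothetical names introduced here for illustration. It assumes the only keys are the ones shown above (enabled, mode, prompt) and checks the mode against the three documented values. The source only describes prompting for table mode, so passing a prompt with other modes is an untested assumption.

```python
import json

# The three modes named in this document.
VALID_MODES = {"standard", "page", "table"}


def build_enrich_options(mode, prompt=None):
    """Hypothetical helper: build the experimental_options fragment.

    Only the keys shown in this document are emitted. The prompt is
    omitted entirely when none is given.
    """
    if mode not in VALID_MODES:
        raise ValueError(
            f"unknown enrich mode {mode!r}; expected one of {sorted(VALID_MODES)}"
        )
    enrich = {"enabled": True, "mode": mode}
    if prompt is not None:
        enrich["prompt"] = prompt
    return {"experimental_options": {"enrich": enrich}}


options = build_enrich_options("table", "Align rows by company name")

# Serializing to JSON turns Python's True into JSON's true.
print(json.dumps(options, indent=4))
```

Validating the mode locally catches typos before a request is sent. How the resulting dictionary is passed to the parsing call depends on the client you use and is not covered here.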