
Overview

Agentic processing uses vision language models to enhance the accuracy of different types of content extraction. It’s enabled through the enhance.agentic configuration, which accepts a list of scopes to apply agentic processing to.

Scopes

Agentic processing can be applied to three different scopes:

Text ({"scope": "text"})

  • Use when: You need improved OCR accuracy for complex layouts or difficult-to-read text
  • What it does: Adds an extra pass to correct table/text mistakes using AI
  • Cost: Small additional cost per page

Table ({"scope": "table"})

  • Use when: Financial statements, regulatory filings, scientific tables, or anything with multi-row headers, merged cells, or misaligned columns
  • What it does: Reconstructs table structure to preserve row/column associations
  • Prompting tips: You can specify structural requirements such as aligning rows by company name, or ensuring that all numerical values remain consistently aligned within a single row
  • Cost: Additional cost based on table complexity

Figure ({"scope": "figure"})

  • Use when: You need enhanced figure summaries with better accuracy than the standard summarization
  • What it does: Uses advanced vision-language models to provide detailed figure analysis
  • Prompting tips: Specify what visual cues should be incorporated. Example: “When provided a diagram, extract all of the figure content verbatim.”
  • Cost: Additional cost per figure processed
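
Because each scope is selected by a plain string, a typo in a scope name is easy to miss. The following is a minimal client-side sketch, not part of the API: validate_agentic and VALID_SCOPES are hypothetical names, and the check assumes only what is described above (three scopes, with an optional string prompt).

```python
# Hypothetical helper: the three scope names come from the list above.
VALID_SCOPES = {"text", "table", "figure"}


def validate_agentic(entries):
    """Return a list of problems found in an enhance.agentic list (empty if none)."""
    problems = []
    for i, entry in enumerate(entries):
        scope = entry.get("scope")
        if scope not in VALID_SCOPES:
            problems.append(f"entry {i}: unknown scope {scope!r}")
        prompt = entry.get("prompt")
        # Assumption: when a prompt is supplied, it is a plain string.
        if prompt is not None and not isinstance(prompt, str):
            problems.append(f"entry {i}: prompt must be a string")
    return problems


if __name__ == "__main__":
    print(validate_agentic([{"scope": "text"}, {"scope": "chart"}]))
```

Running a check like this before sending a request can surface mistakes early; the service remains the authority on which values it accepts.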

Enabling Agentic Processing

You can enable one or more agentic scopes in a single request:

Single scope

{
  "enhance": {
    "agentic": [{"scope": "text"}]
  }
}

Multiple scopes

{
  "enhance": {
    "agentic": [
      {"scope": "text"},
      {"scope": "table", "prompt": "Align rows by company name"},
      {"scope": "figure", "prompt": "Extract all chart data verbatim"}
    ]
  }
}

With custom prompting

{
  "enhance": {
    "agentic": [
      {
        "scope": "table",
        "prompt": "Pay special attention to multi-row headers and ensure numerical alignment"
      }
    ]
  }
}
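
If you assemble these configurations programmatically, a small builder keeps the structure consistent. This is an illustrative sketch only: build_enhance is a hypothetical helper name, and it produces nothing beyond the enhance.agentic shape shown in the examples above.

```python
def build_enhance(scopes):
    """Build an enhance config from a mapping of scope name -> prompt.

    Use None (or an empty string) as the prompt to enable a scope
    without custom prompting.
    """
    agentic = []
    for scope, prompt in scopes.items():
        entry = {"scope": scope}
        if prompt:
            entry["prompt"] = prompt
        agentic.append(entry)
    return {"enhance": {"agentic": agentic}}


if __name__ == "__main__":
    config = build_enhance({
        "text": None,
        "table": "Align rows by company name",
    })
    print(config)
```

Dictionaries preserve insertion order in Python 3.7+, so the entries appear in the order the scopes were listed.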

Migration from Legacy Config

If you were using the old experimental_options.enrich configuration or options.figure_summary.enhanced, here’s how to migrate:

Legacy format

{
  "experimental_options": {
    "enrich": {
      "enabled": True,
      "mode": "table",
      "prompt": "Align rows by company name"
    }
  },
  "options": {
    "figure_summary": {
      "enabled": True,
      "enhanced": True
    }
  }
}

2025-10-14 format

{
  "enhance": {
    "agentic": [
      {
        "scope": "table",
        "prompt": "Align rows by company name"
      },
      {
        "scope": "figure"
      }
    ],
    "summarize_figures": True
  }
}
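
The mapping between the two formats above can be sketched as a small conversion function. This is a hedged illustration rather than an official migration tool: migrate_legacy is a hypothetical name, and it assumes that a legacy mode value maps directly onto the scope of the same name (as "table" does in the example), that figure_summary.enhanced corresponds to the figure scope, and that figure_summary.enabled corresponds to summarize_figures.

```python
def migrate_legacy(legacy):
    """Convert a legacy config dict to the 2025-10-14 enhance format (sketch)."""
    agentic = []

    # experimental_options.enrich -> one agentic entry.
    # Assumption: the legacy "mode" value is reused as the scope name.
    enrich = legacy.get("experimental_options", {}).get("enrich", {})
    if enrich.get("enabled") and "mode" in enrich:
        entry = {"scope": enrich["mode"]}
        if enrich.get("prompt"):
            entry["prompt"] = enrich["prompt"]
        agentic.append(entry)

    # options.figure_summary.enhanced -> figure scope.
    figure_summary = legacy.get("options", {}).get("figure_summary", {})
    if figure_summary.get("enhanced"):
        agentic.append({"scope": "figure"})

    enhance = {"agentic": agentic}
    # options.figure_summary.enabled -> summarize_figures.
    if "enabled" in figure_summary:
        enhance["summarize_figures"] = bool(figure_summary["enabled"])
    return {"enhance": enhance}


if __name__ == "__main__":
    legacy = {
        "experimental_options": {
            "enrich": {"enabled": True, "mode": "table",
                       "prompt": "Align rows by company name"}
        },
        "options": {"figure_summary": {"enabled": True, "enhanced": True}},
    }
    print(migrate_legacy(legacy))
```

Legacy options not shown in the example above are ignored by this sketch, so review the output before relying on it.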

Best Practices

  • Start with a single scope to understand the impact on your specific documents
  • Use custom prompts to guide the model toward your specific needs
  • Consider the cost-accuracy tradeoff for your use case
  • For tables, be specific about structural requirements in your prompt
  • For figures, describe what visual elements are most important to capture