Extract underlines, strikethroughs, and PDF text annotations. Returned text includes HTML markup that identifies insertions and deletions, and annotations include normalized location data.

When to use

  • Legal and compliance review (redlines in contracts and policies)
  • Editorial review (what changed between versions)
  • PDF review workflows that rely on sticky notes or text comments

How-to: enable change tracking

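When change tracking is enabled, underlined and strikethrough text in the returned output is wrapped in HTML tags, so insertions and deletions in the document can be identified.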

Configuration

Requirements: Change tracking only works when extraction_mode is hybrid or metadata. It is not supported in ocr mode.
{
  "document_url": "https://example.com/document.pdf",
  "options": {
    "extraction_mode": "hybrid"
  },
  "advanced_options": {
    "enable_change_tracking": true
  }
}
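
As a rough sketch of how the configuration above might be assembled in code, the following Python helper builds the same request body and rejects an incompatible extraction mode before anything is sent. The helper name build_change_tracking_payload and the SUPPORTED_MODES set are hypothetical and are not part of the API. Only the JSON field names come from the configuration shown above.

```python
import json

# Modes this page lists as compatible with change tracking.
SUPPORTED_MODES = {"hybrid", "metadata"}


def build_change_tracking_payload(document_url: str, extraction_mode: str = "hybrid") -> dict:
    """Build a request body with change tracking enabled (hypothetical helper)."""
    if extraction_mode not in SUPPORTED_MODES:
        raise ValueError(
            f"change tracking requires one of {sorted(SUPPORTED_MODES)}, "
            f"got {extraction_mode!r}"
        )
    return {
        "document_url": document_url,
        "options": {"extraction_mode": extraction_mode},
        "advanced_options": {"enable_change_tracking": True},
    }


if __name__ == "__main__":
    payload = build_change_tracking_payload("https://example.com/document.pdf")
    print(json.dumps(payload, indent=2))
```

Validating the mode locally is only a convenience. The service remains the authority on which combinations it accepts.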

Output

  • <change><u>underlined text</u></change> for underlined text (insertions)
  • <change><s>deleted text</s></change> for strikethrough text (deletions)
  • <change><s>old</s> <u>new</u></change> for change sequences, where deleted text is followed by its replacement
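
The markup above can be post-processed with ordinary string tools. The sketch below, in Python, pulls the deleted and inserted fragments out of each <change> block with regular expressions. It assumes the tags appear exactly as listed above, without attributes or nesting. The function name extract_changes and the sample sentence are illustrative only.

```python
import re

# Patterns for the three tags described in the Output list.
CHANGE_RE = re.compile(r"<change>(.*?)</change>", re.DOTALL)
INSERT_RE = re.compile(r"<u>(.*?)</u>", re.DOTALL)
DELETE_RE = re.compile(r"<s>(.*?)</s>", re.DOTALL)


def extract_changes(text: str) -> list:
    """Return one dict per <change> block with its deleted and inserted fragments."""
    changes = []
    for match in CHANGE_RE.finditer(text):
        inner = match.group(1)
        changes.append(
            {
                "deleted": DELETE_RE.findall(inner),
                "inserted": INSERT_RE.findall(inner),
            }
        )
    return changes


if __name__ == "__main__":
    # Made-up sample text, not real API output.
    sample = "The fee is <change><s>$100</s> <u>$120</u></change> per month."
    print(extract_changes(sample))
```

For heavily nested or malformed markup, an HTML parser would be a safer choice than regular expressions.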

How-to: detect PDF comments

Extract text annotations from PDF documents with their content and locations.

Configuration

{
  "document_url": "https://example.com/annotated.pdf",
  "advanced_options": {
    "read_comments": true
  }
}

Output

Comments include content and normalized bounding box coordinates:
{
  "content": "Review comment text",
  "bbox": [0.1, 0.2, 0.3, 0.4]
}
The bbox array contains [left, top, width, height] normalized to [0,1] relative to page dimensions.
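
To draw or crop a comment's region, the normalized values have to be scaled by the page size. The Python sketch below converts a [left, top, width, height] box into pixel corner coordinates. It assumes the origin is the top-left corner of the page and that the caller supplies the rendered page size in pixels. The function name bbox_to_pixels is hypothetical.

```python
def bbox_to_pixels(bbox, page_width, page_height):
    """Convert a normalized [left, top, width, height] box to pixel corners.

    Assumes a top-left origin. Returns (x0, y0, x1, y1) rounded to whole pixels.
    """
    left, top, width, height = bbox
    x0 = round(left * page_width)
    y0 = round(top * page_height)
    x1 = round((left + width) * page_width)
    y1 = round((top + height) * page_height)
    return (x0, y0, x1, y1)


if __name__ == "__main__":
    # The example bbox from above on a page rendered at 1000 x 2000 pixels.
    print(bbox_to_pixels([0.1, 0.2, 0.3, 0.4], 1000, 2000))
```

Returning corner coordinates rather than width and height matches what many imaging libraries expect for crop and rectangle calls. Adjust this if your tooling differs.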