Change tracking & PDF comments

Extract underlines, strikethroughs, and PDF text annotations. Returned text includes HTML markup that identifies insertions and deletions, and annotations include normalized location data.

When to use

Legal and compliance review (redlines in contracts and policies)
Editorial review (what changed between versions)
PDF review workflows that rely on sticky notes or text comments

How-to: enable change tracking

Wrap underlined and struck-through text in HTML tags so that insertions and deletions can be identified in the returned text.

Configuration

Requirements: Change tracking works only with the hybrid or metadata extraction mode. It is not available in ocr mode.

{
  "document_url": "https://example.com/document.pdf",
  "options": {
    "extraction_mode": "hybrid"
  },
  "advanced_options": {
    "enable_change_tracking": true
  }
}
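
The mode requirement above can be checked before a request is sent. The following is a minimal sketch in Python that builds the same request body and rejects unsupported modes early. The helper name `build_change_tracking_request` and the `SUPPORTED_MODES` constant are hypothetical and not part of the API; only the field names are taken from the configuration shown above.

```python
import json

# Extraction modes that support change tracking, per the requirement above.
SUPPORTED_MODES = {"hybrid", "metadata"}


def build_change_tracking_request(document_url: str, extraction_mode: str = "hybrid") -> dict:
    """Build a request body with change tracking enabled (hypothetical helper)."""
    if extraction_mode not in SUPPORTED_MODES:
        raise ValueError(
            f"change tracking requires one of {sorted(SUPPORTED_MODES)}, "
            f"got {extraction_mode!r}"
        )
    return {
        "document_url": document_url,
        "options": {"extraction_mode": extraction_mode},
        "advanced_options": {"enable_change_tracking": True},
    }


if __name__ == "__main__":
    body = build_change_tracking_request("https://example.com/document.pdf")
    print(json.dumps(body, indent=2))
```

Validating on the client side is a design choice for this sketch; how the service itself responds to an unsupported mode is not described here.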

Output

<change><u>underlined text</u></change> for underlined text
<change><s>deleted text</s></change> for strikethrough text
<change><s>old</s> <u>new</u></change> for change sequences (struck-through text followed by its underlined replacement)
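
To work with this markup downstream, the tagged spans can be pulled out of the returned text. The sketch below is one possible approach using Python's standard `re` module. It assumes that `<change>` elements are not nested, that the tags appear exactly as shown above without attributes, and that underlined text represents an insertion. The function name `extract_changes` is hypothetical.

```python
import re

# Patterns for the markup shown above; assumes plain tags without attributes.
CHANGE_RE = re.compile(r"<change>(.*?)</change>", re.DOTALL)
INSERT_RE = re.compile(r"<u>(.*?)</u>", re.DOTALL)
DELETE_RE = re.compile(r"<s>(.*?)</s>", re.DOTALL)


def extract_changes(text: str) -> list[dict]:
    """Return one record per <change> element, in document order."""
    changes = []
    for match in CHANGE_RE.finditer(text):
        inner = match.group(1)
        changes.append(
            {
                # Underlined spans are treated as insertions (assumption).
                "inserted": INSERT_RE.findall(inner),
                # Struck-through spans are treated as deletions.
                "deleted": DELETE_RE.findall(inner),
            }
        )
    return changes


if __name__ == "__main__":
    sample = (
        "The fee is <change><s>$100</s> <u>$120</u></change> per month, "
        "<change><u>payable in advance</u></change>."
    )
    for change in extract_changes(sample):
        print(change)
```

Regular expressions are sufficient for flat markup like this. If the returned text can contain nested or attributed tags, an HTML parser would be a safer choice.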

How-to: detect PDF comments

Extract text annotations from PDF documents with their content and locations.

Configuration

{
  "document_url": "https://example.com/annotated.pdf",
  "advanced_options": {
    "read_comments": true
  }
}

Output

Comments include content and normalized bounding box coordinates:

{
  "content": "Review comment text",
  "bbox": [0.1, 0.2, 0.3, 0.4]
}

The bbox array contains [left, top, width, height] normalized to [0,1] relative to page dimensions.
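
To draw a comment's location on a rendered page, the normalized values need to be scaled by the page size. The following Python sketch converts a bbox into pixel corner coordinates. It assumes the origin is the top-left corner of the page, and the function name `bbox_to_pixels` and the page dimensions are illustrative only.

```python
def bbox_to_pixels(bbox, page_width, page_height):
    """Convert a normalized [left, top, width, height] bbox to pixel corners.

    Returns (x0, y0, x1, y1), rounded to whole pixels, assuming the origin
    is the top-left corner of the page.
    """
    left, top, width, height = bbox
    x0 = round(left * page_width)
    y0 = round(top * page_height)
    x1 = round((left + width) * page_width)
    y1 = round((top + height) * page_height)
    return (x0, y0, x1, y1)


if __name__ == "__main__":
    comment = {"content": "Review comment text", "bbox": [0.1, 0.2, 0.3, 0.4]}
    # Hypothetical rendered page size of 1000 x 2000 pixels.
    print(bbox_to_pixels(comment["bbox"], 1000, 2000))
```

Because the third and fourth values are a width and a height rather than a second corner, the right and bottom edges are computed as left + width and top + height before scaling.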
