When to use
- Legal and compliance review (redlines in contracts and policies)
- Editorial review (what changed between versions)
- PDF review workflows that rely on sticky notes or text comments
How-to: enable change tracking
Add HTML tags around text formatting to detect document changes.
Configuration
Requirements: Only works with hybrid or metadata extraction mode (not ocr).
Output
- <change><u>underlined text</u></change> for underlined text
- <change><s>deleted text</s></change> for strikethrough text
- <change><s>old</s> <u>new</u></change> for change sequences
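The tag patterns above can be picked out of the extracted text with a small post-processing step. The following is a minimal sketch, not part of any official client: the function name extract_changes and the result keys underlined and struck are hypothetical, and it assumes the tags arrive exactly as listed above, with no attributes and no nesting.

```python
import re
from typing import Dict, List

# Matches one <change>...</change> block (non-greedy, may span lines).
CHANGE_RE = re.compile(r"<change>(.*?)</change>", re.DOTALL)
# Matches a <u>...</u> or <s>...</s> span inside a change block.
PART_RE = re.compile(r"<(u|s)>(.*?)</\1>", re.DOTALL)


def extract_changes(text: str) -> List[Dict[str, List[str]]]:
    """Return one entry per <change> block, split by formatting type.

    Hypothetical helper: 'underlined' collects <u> spans and
    'struck' collects <s> spans, in document order.
    """
    changes = []
    for block in CHANGE_RE.finditer(text):
        entry: Dict[str, List[str]] = {"underlined": [], "struck": []}
        for tag, content in PART_RE.findall(block.group(1)):
            key = "underlined" if tag == "u" else "struck"
            entry[key].append(content)
        changes.append(entry)
    return changes


sample = (
    "The fee is <change><s>10%</s> <u>12%</u></change> "
    "payable <change><u>monthly</u></change>."
)
for change in extract_changes(sample):
    print(change)
```

An entry with both lists populated corresponds to a change sequence. Whether underlined text should be read as an insertion is a convention of the source document, so the sketch reports the formatting rather than interpreting it.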
How-to: detect PDF comments
Extract text annotations from PDF documents with their content and locations.
Configuration
Output
Comments include content and normalized bounding box coordinates. The bbox array contains [left, top, width, height], normalized to [0,1] relative to page dimensions.
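To draw or highlight a comment on a rendered page, the normalized values have to be scaled back to page units. The following is a minimal sketch of that arithmetic; the function name bbox_to_pixels and the page size used in the example are assumptions for illustration, and it assumes the origin is the top-left corner of the page.

```python
from typing import Sequence, Tuple


def bbox_to_pixels(
    bbox: Sequence[float], page_width: float, page_height: float
) -> Tuple[float, float, float, float]:
    """Convert a normalized [left, top, width, height] box to
    absolute (x0, y0, x1, y1) corners in page units.

    Hypothetical helper; assumes a top-left origin.
    """
    left, top, width, height = bbox
    x0 = left * page_width
    y0 = top * page_height
    x1 = (left + width) * page_width
    y1 = (top + height) * page_height
    return (x0, y0, x1, y1)


# Example with an assumed 800 x 1000 page rendering.
corners = bbox_to_pixels([0.25, 0.5, 0.5, 0.125], 800, 1000)
print(corners)
```

PDF user space normally places the origin at the bottom-left, so a renderer that works in native PDF coordinates may need the vertical axis flipped (y = page_height - y) before using these corners.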