- Verification: Confirm extractions are correct by seeing the source text
- Debugging: When values are wrong, citations show where the model looked
- User experience: Let users click from extracted data to the original location
Enabling Citations
Add `citations.enabled` to your extraction settings:
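A minimal sketch of what that setting might look like as a Python dictionary. Only the `citations.enabled` path comes from this page; the helper name `with_citations` is hypothetical, and how the settings reach the API is left to your client code.

```python
from copy import deepcopy
from typing import Optional


def with_citations(settings: Optional[dict] = None) -> dict:
    """Return a copy of `settings` with `citations.enabled` switched on.

    Hypothetical helper for illustration; not part of any SDK.
    """
    merged = deepcopy(settings) if settings else {}
    # Preserve any other citation options the caller already set.
    merged.setdefault("citations", {})["enabled"] = True
    return merged


extraction_settings = with_citations()
print(extraction_settings)  # {'citations': {'enabled': True}}
```

Copying the input keeps the caller's original settings untouched, which makes it easy to run the same extraction with and without citations.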
Citation Structure
With citations enabled, the response format changes. The `result` becomes an object (instead of an array), and each value is wrapped with citation data:
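The change in shape might look roughly like the sketch below. The field name `invoice_total`, every value, and the wrapper key `value` are assumptions made for illustration; the citation keys follow the Citation Fields table, and `parentBlock` is left out because its structure is not described here.

```python
# Assumed shape without citations: "result" is an array.
plain_response = {"result": [{"invoice_total": 1250.00}]}

# With citations enabled: "result" is an object and each value is wrapped.
cited_response = {
    "result": {
        "invoice_total": {
            "value": 1250.00,  # wrapper key name "value" is an assumption
            "citations": [
                {
                    "type": "Table",
                    "content": "Total due: $1,250.00",
                    "bbox": {
                        "left": 0.62, "top": 0.81,
                        "width": 0.20, "height": 0.03,
                        "page": 1, "original_page": 1,
                    },
                    "confidence": "high",
                    "granular_confidence": {
                        "extract_confidence": 0.97,
                        "parse_confidence": 0.93,
                    },
                }
            ],
        }
    }
}


def unwrap(result: dict) -> dict:
    """Drop citation data and keep only the extracted values."""
    return {name: field["value"] for name, field in result.items()}


print(unwrap(cited_response["result"]))  # {'invoice_total': 1250.0}
```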
Citation Fields
| Field | Description |
|---|---|
| `type` | The block type where the value was found: Text, Table, Key Value, Title, etc. |
| `content` | The source text that was extracted from. May include more context than just the value. |
| `bbox` | Bounding box coordinates for the source location. |
| `confidence` | Overall confidence as "high" or "low". |
| `granular_confidence` | Detailed scores: `extract_confidence` (0-1) and `parse_confidence` (0-1). |
| `parentBlock` | The larger Parse block containing this citation. Useful for understanding context. |
Bounding Box Coordinates
Coordinates are normalized to the range [0, 1] relative to page dimensions:

| Coordinate | Meaning |
|---|---|
| `left` | Distance from the left edge (0 = left margin, 1 = right margin) |
| `top` | Distance from the top edge (0 = top, 1 = bottom) |
| `width` | Width as fraction of page width |
| `height` | Height as fraction of page height |
| `page` | Page number (1-indexed) in the processed result |
| `original_page` | Page number in the original document |
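To draw a highlight on a rendered page, the normalized values have to be scaled by the page size. The following is a minimal sketch of that arithmetic; the function name `bbox_to_pixels` and the pixel dimensions are assumptions, and it applies to PDFs and images only.

```python
def bbox_to_pixels(bbox: dict, page_width_px: int, page_height_px: int) -> tuple:
    """Convert a normalized bbox to (left, top, right, bottom) in pixels."""
    left = round(bbox["left"] * page_width_px)
    top = round(bbox["top"] * page_height_px)
    right = round((bbox["left"] + bbox["width"]) * page_width_px)
    bottom = round((bbox["top"] + bbox["height"]) * page_height_px)
    return left, top, right, bottom


# Illustrative values: a box covering the middle half of the page width.
example_bbox = {"left": 0.25, "top": 0.5, "width": 0.5, "height": 0.25, "page": 1}
print(bbox_to_pixels(example_bbox, 1000, 800))  # (250, 400, 750, 600)
```

Because the coordinates are fractions, the same citation can be drawn at any render resolution without re-running the extraction.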
Working with Citations
Accessing Citation Data
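A hedged sketch of reading a value together with its first citation, assuming the response has been parsed into plain dictionaries and that each field is wrapped as `{"value": ..., "citations": [...]}`. The `value` key, the helper name `describe_field`, and the sample data are assumptions.

```python
def describe_field(result: dict, field_name: str) -> str:
    """Summarize one extracted field and where it was found."""
    field = result[field_name]
    citations = field.get("citations") or []
    if not citations:
        return f"{field_name} = {field['value']!r} (no citation)"
    first = citations[0]
    page = first["bbox"]["page"]
    return (
        f"{field_name} = {field['value']!r} "
        f"(page {page}, {first['type']} block: {first['content']!r})"
    )


# Illustrative data only.
sample_result = {
    "invoice_total": {
        "value": 1250.0,
        "citations": [
            {
                "type": "Table",
                "content": "Total due: $1,250.00",
                "bbox": {"left": 0.62, "top": 0.81, "width": 0.2,
                         "height": 0.03, "page": 1, "original_page": 1},
                "confidence": "high",
            }
        ],
    },
    "currency": {"value": "USD", "citations": []},
}

print(describe_field(sample_result, "invoice_total"))
print(describe_field(sample_result, "currency"))
```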
Array Citations
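A minimal sketch of walking an array field, assuming each item is an object whose values are wrapped as `{"value": ..., "citations": [...]}`. The field names, the `value` key, and the helper `iter_item_citations` are hypothetical.

```python
def iter_item_citations(items: list):
    """Yield (item_index, field_name, citation) for every cited value."""
    for index, item in enumerate(items):
        for name, wrapped in item.items():
            for citation in wrapped.get("citations") or []:
                yield index, name, citation


# Illustrative data only: two line items, one value without a citation.
line_items = [
    {
        "description": {"value": "Widget",
                        "citations": [{"content": "Widget", "bbox": {"page": 1}}]},
        "amount": {"value": 40.0,
                   "citations": [{"content": "$40.00", "bbox": {"page": 1}}]},
    },
    {
        "description": {"value": "Gadget",
                        "citations": [{"content": "Gadget", "bbox": {"page": 2}}]},
        "amount": {"value": 15.5, "citations": []},
    },
]

for index, name, citation in iter_item_citations(line_items):
    print(index, name, citation["bbox"]["page"], citation["content"])
```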
For array fields, each item in the array has its own citations.

Filtering by Confidence
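One way this might be done is sketched below, assuming a parsed response of plain dictionaries with top-level scalar fields. The helper name `flag_uncertain` and the 0.8 threshold are arbitrary illustrations, not recommended values.

```python
def flag_uncertain(result: dict, min_extract_confidence: float = 0.8) -> list:
    """Return names of fields with a low-confidence citation."""
    flagged = []
    for name, field in result.items():
        if not isinstance(field, dict):
            continue  # array fields would need to be walked item by item
        for citation in field.get("citations") or []:
            granular = citation.get("granular_confidence") or {}
            score = granular.get("extract_confidence")
            is_low = citation.get("confidence") == "low"
            below = score is not None and score < min_extract_confidence
            if is_low or below:
                flagged.append(name)
                break  # one uncertain citation is enough to flag the field
    return flagged


# Illustrative data only.
scored_result = {
    "invoice_total": {"value": 1250.0, "citations": [
        {"confidence": "high",
         "granular_confidence": {"extract_confidence": 0.97, "parse_confidence": 0.93}}]},
    "due_date": {"value": "2024-03-01", "citations": [
        {"confidence": "low",
         "granular_confidence": {"extract_confidence": 0.41, "parse_confidence": 0.88}}]},
    "vendor": {"value": "Acme", "citations": [
        {"confidence": "high",
         "granular_confidence": {"extract_confidence": 0.70, "parse_confidence": 0.90}}]},
}

print(flag_uncertain(scored_result))  # ['due_date', 'vendor']
```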
Use confidence scores to flag uncertain extractions.

Spreadsheet Citations
Excel and other spreadsheet formats use cell coordinates instead of normalized positions.

Coordinate Differences
| Aspect | PDFs/Images | Spreadsheets |
|---|---|---|
| `left` | Fraction (0-1) | Column number (1 = A, 2 = B) |
| `top` | Fraction (0-1) | Row number (1-indexed) |
| `width` | Fraction (0-1) | Columns spanned |
| `height` | Fraction (0-1) | Rows spanned |
| `page` | Page number | Sheet index (1 = first sheet) |
Example Spreadsheet Citation
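A hedged illustration of a spreadsheet `bbox` and how it might be turned into a familiar cell reference. The values are invented, the helper names are hypothetical, and the sketch assumes that `width` and `height` count the starting cell (so a single cell has `width` 1 and `height` 1).

```python
def column_letter(column_number: int) -> str:
    """Convert a 1-indexed column number to letters (1 -> A, 27 -> AA)."""
    letters = ""
    while column_number > 0:
        column_number, remainder = divmod(column_number - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def bbox_to_cell_range(bbox: dict) -> str:
    """Render a spreadsheet bbox as an A1-style reference."""
    first_col, first_row = int(bbox["left"]), int(bbox["top"])
    # Assumption: spans include the starting cell.
    last_col = first_col + int(bbox["width"]) - 1
    last_row = first_row + int(bbox["height"]) - 1
    start = f"{column_letter(first_col)}{first_row}"
    end = f"{column_letter(last_col)}{last_row}"
    return start if start == end else f"{start}:{end}"


# Illustrative citation bbox: column B, row 5, first sheet.
spreadsheet_bbox = {"left": 2, "top": 5, "width": 1, "height": 1, "page": 1}
print(bbox_to_cell_range(spreadsheet_bbox))  # B5
```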
Constraints and Limitations
Citations Disable Chunking
Citations require knowing exactly where each piece of content came from. Chunking merges content across boundaries, which would make citation coordinates ambiguous. When you enable citations:

- Chunking is automatically disabled in the parsing step
- The document is processed as a single unit
- This may increase processing time for very long documents

If you set chunking options in your parsing configuration, they’ll be ignored when citations are enabled.
Streaming Array Extract Incompatible
The streaming mode for array extraction cannot be used with citations. If you need both complete arrays and citations:

Empty Citations
Citations may be empty for fields that were inferred rather than directly extracted. Check `if field.citations:` before accessing citation data.
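A small sketch of that guard applied to a whole result, assuming plain dictionaries parsed from JSON, where `field.get("citations")` plays the role of `field.citations`. The helper name `split_by_citation` and the sample data are hypothetical.

```python
def split_by_citation(result: dict) -> tuple:
    """Separate field names into (cited, uncited) lists."""
    cited, uncited = [], []
    for name, field in result.items():
        # Plain-dict equivalent of the `if field.citations:` check;
        # a missing key, None, or an empty list all count as uncited.
        if field.get("citations"):
            cited.append(name)
        else:
            uncited.append(name)
    return cited, uncited


# Illustrative data only: "currency" stands in for an inferred value.
mixed_result = {
    "invoice_total": {"value": 1250.0,
                      "citations": [{"content": "Total due: $1,250.00"}]},
    "currency": {"value": "USD", "citations": []},
}

cited, uncited = split_by_citation(mixed_result)
print(cited, uncited)  # ['invoice_total'] ['currency']
```

Fields in the uncited list have no source location to highlight, so they are reasonable candidates for manual review.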
Viewing in Studio
Every extraction response includes a `studio_link`. In Studio, citations become interactive:
- Click an extracted field to highlight its source in the document
- Click a highlight to jump to the corresponding field
- See all citations overlaid on the document at once