table_output_format parameter in the advanced options.
Available formats
Dynamic format
The dynamic format (dynamic) automatically chooses between markdown and HTML based on table complexity:
- Uses markdown for simple tables (≤ 30 cells and ≤ 4 merged cells)
- Uses HTML for complex tables
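The selection rule above can be sketched as a small function. This is only an illustration of the documented thresholds; the function name and its two count arguments are hypothetical, and how the service actually counts cells and merged cells is not described here.

```python
def choose_dynamic_format(cell_count: int, merged_cell_count: int) -> str:
    """Return "md" for simple tables and "html" otherwise.

    Hypothetical helper: mirrors the documented thresholds
    (at most 30 cells and at most 4 merged cells counts as simple).
    """
    is_simple = cell_count <= 30 and merged_cell_count <= 4
    return "md" if is_simple else "html"


print(choose_dynamic_format(12, 0))  # small table, no merged cells
print(choose_dynamic_format(48, 0))  # too many cells for markdown
print(choose_dynamic_format(12, 6))  # too many merged cells for markdown
```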
HTML format
The HTML format (html) returns tables as HTML strings with proper support for:
- Table headers (<th> tags)
- Merged cells (using rowspan and colspan attributes)
- Complex table structures
- Cell formatting
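On the consuming side, merged cells usually need to be expanded before the table can be treated as a grid. The sketch below uses only Python's standard html.parser module and repeats a merged cell's text in every position it spans. The sample HTML string is made up for illustration, and the real output may carry additional tags or attributes that this minimal parser ignores.

```python
from html.parser import HTMLParser


class TableGridParser(HTMLParser):
    """Collect <th>/<td> cells and expand rowspan/colspan into grid positions."""

    def __init__(self):
        super().__init__()
        self.cells = {}    # (row, col) -> cell text
        self._row = -1     # index of the current <tr>
        self._col = 0      # next candidate column in the current row
        self._open = None  # (rowspan, colspan, text parts) while inside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row += 1
            self._col = 0
        elif tag in ("td", "th"):
            a = dict(attrs)
            self._open = (int(a.get("rowspan") or 1), int(a.get("colspan") or 1), [])

    def handle_data(self, data):
        if self._open is not None:
            self._open[2].append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._open is not None:
            rowspan, colspan, parts = self._open
            text = "".join(parts).strip()
            # Skip positions already claimed by a rowspan from an earlier row.
            while (self._row, self._col) in self.cells:
                self._col += 1
            for dr in range(rowspan):
                for dc in range(colspan):
                    self.cells[(self._row + dr, self._col + dc)] = text
            self._col += colspan
            self._open = None


def html_table_to_grid(html: str) -> list[list[str]]:
    """Return a rectangular grid; merged cells repeat their text."""
    parser = TableGridParser()
    parser.feed(html)
    if not parser.cells:
        return []
    n_rows = max(r for r, _ in parser.cells) + 1
    n_cols = max(c for _, c in parser.cells) + 1
    return [[parser.cells.get((r, c), "") for c in range(n_cols)] for r in range(n_rows)]


# Made-up sample input, not actual service output.
sample_html = (
    '<table><tr><th rowspan="2">Region</th><th colspan="2">Sales</th></tr>'
    "<tr><th>Q1</th><th>Q2</th></tr>"
    "<tr><td>EU</td><td>10</td><td>12</td></tr></table>"
)
grid = html_table_to_grid(sample_html)
for row in grid:
    print(row)
```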
Markdown format
The markdown format (md) returns tables in GitHub-flavored markdown format. This is useful when:
- You need a human-readable format
- You’re displaying the content in markdown viewers
- You want simpler table representation
- The table doesn’t have complex merged cells
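If a markdown table needs to be read back into rows, a simple pipe-delimited parse is often enough. The sketch below assumes every table line starts with a pipe and that cells contain no escaped pipe characters; the helper name and sample table are illustrative only.

```python
def parse_markdown_table(md: str) -> list[list[str]]:
    """Split a simple pipe-delimited markdown table into rows of cell strings."""
    rows = []
    for line in md.strip().splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # not a table line
        cells = [c.strip() for c in line.strip("|").split("|")]
        # Skip the delimiter row, e.g. | --- | :--: |
        if all(c and set(c) <= set("-:") for c in cells):
            continue
        rows.append(cells)
    return rows


# Made-up sample input.
sample_md = """
| Item   | Qty |
| ------ | --- |
| Widget | 2   |
| Gadget | 1   |
"""
md_rows = parse_markdown_table(sample_md)
print(md_rows)
```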
JSON format
The JSON format (json) returns tables as nested arrays where:
- The outer array represents rows
- Each inner array represents cells in that row
- First row typically contains headers
- All cell values are strings
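Because the structure is a plain array of rows, it converts easily into one record per data row. The sketch below assumes the first row holds the headers, which the description above says is typical but not guaranteed; the sample payload is made up.

```python
import json


def table_to_records(table: list[list[str]]) -> list[dict[str, str]]:
    """Pair each data row with the header row (assumed to be the first row)."""
    if not table:
        return []
    header, *rows = table
    return [dict(zip(header, row)) for row in rows]


# Made-up sample payload in the nested-array shape described above.
raw = '[["Item", "Qty", "Price"], ["Widget", "2", "9.99"], ["Gadget", "1", "24.50"]]'
records = table_to_records(json.loads(raw))
print(records)

# All cell values are strings, so numeric columns need explicit conversion.
total_qty = sum(int(r["Qty"]) for r in records)
print(total_qty)
```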
JSON with bounding boxes
The JSON with bounding boxes format (jsonbbox) extends the JSON format by including positional information for each cell. The coordinates are normalized to the [0, 1] range, where:
- x: Distance from the left edge of the page
- y: Distance from the top edge of the page
- width: Cell width as a fraction of the page width
- height: Cell height as a fraction of the page height
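Normalized coordinates can be mapped back to pixels once the rendered page size is known. The sketch below assumes each cell is available as a dictionary with x, y, width, and height keys; the exact cell schema is not shown in this section, so the "text" key and the overall dictionary shape are hypothetical.

```python
def bbox_to_pixels(cell: dict, page_width_px: int, page_height_px: int) -> tuple[int, int, int, int]:
    """Convert a normalized cell box to (left, top, right, bottom) in pixels."""
    left = round(cell["x"] * page_width_px)
    top = round(cell["y"] * page_height_px)
    right = round((cell["x"] + cell["width"]) * page_width_px)
    bottom = round((cell["y"] + cell["height"]) * page_height_px)
    return left, top, right, bottom


# Hypothetical cell shape; only the four coordinate fields come from the text above.
cell = {"text": "Total", "x": 0.25, "y": 0.5, "width": 0.5, "height": 0.125}
box = bbox_to_pixels(cell, page_width_px=1000, page_height_px=800)
print(box)
```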
CSV format
The CSV format (csv) returns tables in comma-separated values format. This is useful when:
- You need to import the data into spreadsheet software
- You want a simple, widely-supported format
- The table structure is relatively simple
- You want to save on output tokens
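CSV output can be read with Python's standard csv module, which handles quoted cells that contain commas. The sample string below is made up, and the quoting style of the real output is an assumption.

```python
import csv
import io

# Made-up sample; the second row has a quoted cell containing a comma.
csv_text = 'Item,Qty,Note\nWidget,2,"Blue, large"\nGadget,1,\n'

csv_rows = list(csv.reader(io.StringIO(csv_text)))
header, data = csv_rows[0], csv_rows[1:]
print(header)
for row in data:
    print(row)
```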
AI JSON format
The AI JSON format (ai_json) uses a custom LVM to parse the table structure and return the underlying JSON data. This mode performs best when the underlying table structure is very complex, is not strictly tabular, or contains many artifacts.
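The format identifiers described in this section can be checked client-side before a request is built. The helper below is a hypothetical sketch: it only validates the value and returns the table_output_format setting, and where that setting sits inside the advanced options of a request is not shown here.

```python
# Identifiers taken from the format descriptions in this section.
SUPPORTED_TABLE_FORMATS = {"dynamic", "html", "md", "json", "jsonbbox", "csv", "ai_json"}


def table_format_option(fmt: str) -> dict[str, str]:
    """Validate a format identifier and return it as an option fragment."""
    if fmt not in SUPPORTED_TABLE_FORMATS:
        supported = ", ".join(sorted(SUPPORTED_TABLE_FORMATS))
        raise ValueError(f"Unsupported table_output_format {fmt!r}; expected one of: {supported}")
    return {"table_output_format": fmt}


print(table_format_option("html"))
```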