Reducto can extract numerical data from visualizations and output it as structured tables. This page covers how to configure chart extraction and what chart types are supported.
Three Levels of Chart Processing
Reducto offers three ways to process charts, each with a different accuracy/cost tradeoff:

| Level | Configuration | What it does |
|---|---|---|
| Basic | summarize_figures: True (default) | Text descriptions for RAG search |
| Enhanced | {"scope": "figure"} | Better models, structured extraction for simpler charts |
| Advanced | {"scope": "figure", "advanced_chart_agent": True} | Multi-stage pipeline for precise numerical extraction |
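
The three levels can also be written down as plain Python values. This is a minimal sketch that only restates the Configuration column of the table above. The helper name `chart_config` is hypothetical, and where these fragments sit inside a parse request is not shown on this page, so check the API reference before wiring them in.

```python
# Hypothetical helper: returns the configuration fragment for each level,
# copied from the table above. Placement inside the request is not shown here.
def chart_config(level: str) -> dict:
    fragments = {
        "basic": {"summarize_figures": True},  # enabled by default
        "enhanced": {"scope": "figure"},
        "advanced": {"scope": "figure", "advanced_chart_agent": True},
    }
    if level not in fragments:
        raise ValueError(f"unknown level: {level!r}")
    # Return a copy so callers can modify it without touching the lookup table.
    return dict(fragments[level])


print(chart_config("advanced"))
```

Keeping the fragments in one place makes it easy to switch a document between levels when comparing accuracy against cost.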
Basic: Figure Summarization
Enabled by default. Generates natural language descriptions using a lightweight model:
"Bar chart showing Q1-Q4 revenue growth, with Q4 reaching approximately $2.5M"
Good for making charts searchable in RAG applications. Fast and cheap, but doesn’t extract actual numbers.
Enhanced: Figure Scope
The figure scope uses more powerful models and classifies figures before processing:
- Classifies whether the image is a chart or general figure
- If chart: runs structured extraction to pull data as text
- If not a chart: generates a detailed description using a more powerful model
Advanced: Chart Agent Pipeline
For precise numerical extraction, enable advanced_chart_agent in the figure scope configuration.
How the Pipeline Works
The chart agent runs multiple parallel tasks, then combines results:

Stage 1: Parallel extraction
- Component detection: Identifies each data series (lines, bars, areas, scatter points) and their colors/styles
- OCR: Detects all text (axis labels, titles, legends, tick values)
- Legend detection: Maps colors to series labels
- Coordinate extraction: Finds axis boundaries and tick positions
- Masking: Isolates each component by color/style for individual processing
- Axis functions: Builds mathematical functions to convert pixel coordinates to actual values (handles linear, logarithmic, and time series axes)
- Tick alignment: Maps detected points to axis tick values
- Converts pixel coordinates to actual (x, y) values using the axis functions
- Falls back to a VLM for components that couldn’t be processed deterministically
- Outputs a consolidated markdown table
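
To make the axis-function step concrete, here is a minimal sketch of turning a pixel coordinate into a data value from two known tick marks, for a linear and a logarithmic axis. It illustrates the general idea only and is not Reducto's implementation; `make_axis_function` and the tick positions are made up for the example.

```python
import math
from typing import Callable, Tuple

Tick = Tuple[float, float]  # (pixel position, axis value)


def make_axis_function(t0: Tick, t1: Tick, scale: str = "linear") -> Callable[[float], float]:
    """Build a pixel -> value mapping from two reference ticks (illustrative only)."""
    (p0, v0), (p1, v1) = t0, t1
    if p0 == p1:
        raise ValueError("ticks must be at different pixel positions")
    if scale == "linear":
        slope = (v1 - v0) / (p1 - p0)
        return lambda p: v0 + slope * (p - p0)
    if scale == "log":
        if v0 <= 0 or v1 <= 0:
            raise ValueError("log axes need positive tick values")
        # Interpolate linearly in log10 space, then map back to the value scale.
        l0, l1 = math.log10(v0), math.log10(v1)
        slope = (l1 - l0) / (p1 - p0)
        return lambda p: 10 ** (l0 + slope * (p - p0))
    raise ValueError(f"unsupported scale: {scale!r}")


# Image y grows downward: pixel 400 is the axis value 0, pixel 100 is 300.
y_linear = make_axis_function((400, 0.0), (100, 300.0))
print(y_linear(250))  # value for a point halfway between the two ticks

# Log axis: pixel 400 is the value 1, pixel 100 is the value 1000.
y_log = make_axis_function((400, 1.0), (100, 1000.0), scale="log")
print(y_log(300))  # value one third of the way up the axis
```

Two ticks are enough to fix either mapping. A real pipeline would likely fit against all detected ticks to reduce the effect of OCR or localization errors.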
Output Format
Data is returned as a markdown table with the X-axis as rows and each component as a column. For area charts, values are given as (bottom, top) pairs.
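
Because the result is an ordinary markdown table, downstream code can read it with a few lines of string handling. The sketch below assumes a simple pipe table with no escaped `|` characters and no empty edge cells. The sample table, its column names, and `parse_markdown_table` are invented for illustration and are not actual Reducto output.

```python
from typing import Dict, List


def parse_markdown_table(md: str) -> List[Dict[str, str]]:
    """Parse a simple pipe-delimited markdown table into a list of row dicts."""
    lines = [ln.strip() for ln in md.strip().splitlines() if ln.strip()]
    if len(lines) < 2:
        raise ValueError("expected a header row and a separator row")

    def cells(line: str) -> List[str]:
        # Drop the outer pipes, then split on the inner ones.
        return [c.strip() for c in line.strip("|").split("|")]

    header = cells(lines[0])
    rows = []
    for line in lines[2:]:  # lines[1] is the |---|---| separator row
        rows.append(dict(zip(header, cells(line))))
    return rows


# Made-up sample: X-axis values as rows, one column per component.
sample = """
| Quarter | Revenue | Costs |
|---|---|---|
| Q1 | 1.2 | 0.8 |
| Q2 | 1.9 | 1.1 |
"""

rows = parse_markdown_table(sample)
print(rows[0])
```

Values are kept as strings here on purpose, since cells may hold pairs or non-numeric labels; convert them to numbers only once the column types are known.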
Supported Chart Types
| Chart Type | Support Level | Notes |
|---|---|---|
| Vertical bar charts | ✅ Full | Detects bar heights and x-axis categories |
| Line charts | ✅ Full | Tracks points along each series |
| Area charts | ✅ Full | Extracts top/bottom boundaries |
| Scatter plots | ✅ Partial | Works for sparse plots; very dense plots may fail |
| Combination charts | ✅ Full | Handles mixed bar/line/area in same chart |
| Time series | ✅ Full | Supports YYYY, YYYY-MM, YYYY-MM-DD formats |
| Logarithmic axes | ✅ Full | Correctly interprets log-scale values |
| Dual Y-axis | ✅ Full | Maps components to primary or secondary axis |
Not Supported
The advanced pipeline skips these chart types and falls back to a VLM description:
- Horizontal bar charts: Axis orientation not supported
- Pie charts: No coordinate-based extraction possible
- Radar/spider charts: Non-Cartesian coordinate system
- Density plots: Continuous distributions don’t map to discrete points
- Flow charts/diagrams: Not data visualizations
- Multiple charts in one image: Requires a single chart per figure
- Charts with data labels: If values are already printed on each point, extraction is skipped (the data is already visible)
Custom Prompts
Guide figure processing with custom instructions.
Combining with Other Scopes
For documents that contain both charts and complex tables, combine the figure scope with other scopes.
Limitations
- Resolution matters: Higher quality source images produce more accurate extractions
- Processing time: The advanced pipeline is significantly slower than basic summarization. For async calls, use priority=True to speed up processing.
- Dense charts: Scatter plots with many overlapping points may have reduced accuracy
- Same-color styles: Charts where solid and dashed lines share the same color can confuse component detection