Chart data extraction pipeline

Our chart extraction feature uses a multi-stage pipeline that combines OCR with vision-language models (VLMs) to extract structured data from chart images. The pipeline adapts its processing to the detected chart type and targets enterprise analytics workflows.

Architecture overview

The extraction pipeline consists of three primary stages:

Structural Analysis - OCR-based text detection and layout understanding
Coordinate Extraction - Key point detection and spatial mapping
Semantic Correspondence - Vision-language model validation and data mapping
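
One way to picture the three stages is as functions that each read from and add to a shared context. The following is a minimal sketch only: `run_pipeline`, the stage function names, and the context keys are hypothetical and do not describe the product's API.

```python
from typing import Any, Callable, Dict, Sequence

Context = Dict[str, Any]
Stage = Callable[[Context], Context]


def run_pipeline(image: Any, stages: Sequence[Stage]) -> Context:
    """Run stages in order; each stage sees everything produced before it."""
    context: Context = {"image": image}
    for stage in stages:
        context.update(stage(context))
    return context


# Placeholder stages (hypothetical). Real ones would call OCR, a key point
# detector, and a vision-language model respectively.
def structural_analysis(ctx: Context) -> Context:
    return {"chart_type": "bar", "text_boxes": []}


def coordinate_extraction(ctx: Context) -> Context:
    return {"key_points": []}


def semantic_correspondence(ctx: Context) -> Context:
    return {"table": {"headers": [], "rows": []}}


result = run_pipeline(
    image=None,
    stages=[structural_analysis, coordinate_extraction, semantic_correspondence],
)
```

Keeping every stage behind the same signature makes it straightforward to swap or skip a stage for a given chart type.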

Stage 1: Structural analysis

The process starts with OCR to detect and segment text elements: axis labels, titles, legends, and data annotations. The system analyzes the spatial positioning of these elements to understand the chart's structure and establish coordinate system boundaries. Based on this layout analysis, we segment the chart into regions and identify the primary visualization area.

Figure: OCR text detection and segmentation

Key outputs from this stage include:

Bounding boxes for all text elements
Axis orientation and scale detection
Chart type classification
Coordinate system boundaries
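
As an illustration of how axis scale detection could work, the sketch below fits a linear scale to numeric tick labels found by OCR. It assumes a linear axis and uses an ordinary least-squares fit; the `TextBox` type and `fit_axis_scale` function are hypothetical names, not the pipeline's actual implementation.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class TextBox:
    text: str   # OCR text of the label
    x: float    # center x in pixels
    y: float    # center y in pixels


def fit_axis_scale(ticks: Iterable[TextBox], axis: str = "y") -> Tuple[float, float]:
    """Fit value = slope * pixel + intercept from numeric tick labels."""
    points = []
    for tick in ticks:
        try:
            value = float(tick.text.replace(",", ""))
        except ValueError:
            continue  # skip non-numeric text such as axis titles
        points.append((tick.y if axis == "y" else tick.x, value))
    if len(points) < 2:
        raise ValueError("need at least two numeric tick labels")

    n = len(points)
    mean_p = sum(p for p, _ in points) / n
    mean_v = sum(v for _, v in points) / n
    var_p = sum((p - mean_p) ** 2 for p, _ in points)
    if var_p == 0:
        raise ValueError("tick labels share the same pixel position")
    slope = sum((p - mean_p) * (v - mean_v) for p, v in points) / var_p
    return slope, mean_v - slope * mean_p


ticks = [
    TextBox("0", 40, 400),
    TextBox("50", 40, 300),
    TextBox("100", 40, 200),
    TextBox("Revenue", 10, 300),  # axis title, ignored by the fit
]
slope, intercept = fit_axis_scale(ticks, axis="y")
```

The slope is negative for a typical y-axis because image pixel rows increase downward while data values increase upward.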

Stage 2: Coordinate extraction

Following structural analysis, the system performs key point detection to extract precise coordinates for data elements. This includes:

Bar heights and positions in bar charts
Line segment endpoints and inflection points
Pie slice boundaries and centroids
Scatter plot point locations

Figure: Key point detection and coordinate extraction

The coordinate extraction module uses computer vision techniques to identify these elements within the segmented regions established in Stage 1. Each detected point is stored with its pixel coordinates and confidence score.
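
A detected point as described above could be represented roughly as follows. This is a sketch under assumptions: the `KeyPoint` field names, the `kind` values, and the 0.5 confidence threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class KeyPoint:
    kind: str          # e.g. "bar_top", "line_vertex", "scatter_point"
    x: float           # pixel column
    y: float           # pixel row
    confidence: float  # detector score in [0, 1]


def filter_confident(points: Iterable[KeyPoint], threshold: float = 0.5) -> List[KeyPoint]:
    """Keep only points whose confidence meets the threshold."""
    return [p for p in points if p.confidence >= threshold]


points = [
    KeyPoint("bar_top", 120.0, 250.0, 0.97),
    KeyPoint("bar_top", 180.0, 310.0, 0.31),
    KeyPoint("bar_top", 240.0, 190.0, 0.88),
]
kept = filter_confident(points)
```

Low-confidence points need not be discarded outright; they could instead be flagged for closer review in the next stage.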

Stage 3: Semantic correspondence

A fine-tuned vision-language model processes the extracted coordinates using mark prompting techniques. This step establishes the correspondence between detected key points and their associated data labels, ensuring accurate mapping between visual elements and their semantic meaning. The model handles common challenges including:

Ambiguous label-to-data associations
Overlapping or clustered data points
Irregular label positioning
Multi-series data disambiguation

After validation, the system converts the coordinates from pixel space to data values using the axis scales established in Stage 1.
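
For a linear axis, the pixel-to-value conversion reduces to interpolation between two reference ticks. The sketch below assumes a linear scale; `pixel_to_value` and the sample tick positions are hypothetical, and a logarithmic axis would need a different mapping.

```python
def pixel_to_value(
    pixel: float,
    pixel_a: float,
    value_a: float,
    pixel_b: float,
    value_b: float,
) -> float:
    """Linearly interpolate a data value from two reference ticks on one axis."""
    if pixel_a == pixel_b:
        raise ValueError("reference ticks must be at different pixel positions")
    t = (pixel - pixel_a) / (pixel_b - pixel_a)
    return value_a + t * (value_b - value_a)


# Hypothetical y-axis: the tick "0" sits at pixel row 400, "100" at row 200.
# A bar whose top edge is at row 250 then maps to a data value of 75.
bar_value = pixel_to_value(250, pixel_a=400, value_a=0, pixel_b=200, value_b=100)
```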

Adaptive processing by chart type

The pipeline adapts its processing strategy based on the detected chart type:

Bar and Line Charts: The coordinate extraction and mapping stages handle most processing, with the VLM primarily validating correspondences. The system leverages the regular structure of these charts for efficient processing.
Pie Charts: The vision-language model takes on additional responsibilities, directly interpreting angular relationships and percentage allocations that would be difficult to capture through coordinate analysis alone.
Complex Visualizations: For stacked charts, heatmaps, and other complex formats, the pipeline dynamically adjusts the balance between rule-based extraction and model-based interpretation.
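
The routing described above can be sketched as a simple dispatch on chart type. The enum members and strategy names below are hypothetical labels chosen for illustration; the actual pipeline may balance rule-based and model-based processing in finer-grained ways.

```python
from enum import Enum


class ChartType(Enum):
    BAR = "bar"
    LINE = "line"
    PIE = "pie"
    STACKED = "stacked"
    HEATMAP = "heatmap"


class Strategy(Enum):
    COORDINATES_FIRST = "coordinates_first"  # VLM mainly validates
    VLM_FIRST = "vlm_first"                  # VLM interprets angles and shares
    HYBRID = "hybrid"                        # balance chosen per chart


def choose_strategy(chart_type: ChartType) -> Strategy:
    """Pick a processing strategy from the detected chart type."""
    if chart_type in (ChartType.BAR, ChartType.LINE):
        return Strategy.COORDINATES_FIRST
    if chart_type is ChartType.PIE:
        return Strategy.VLM_FIRST
    return Strategy.HYBRID


strategies = {ct.value: choose_strategy(ct).value for ct in ChartType}
```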

Output format

The pipeline produces structured data as a Markdown table. Each extraction includes:

Column headers mapped from axis labels and legends
Row data containing the extracted values
Metadata including chart title and data source when available
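
To make the output shape concrete, the sketch below renders headers and rows as a Markdown table. The function name `to_markdown_table`, the sample data, and the way the title is placed above the table are assumptions; the exact layout of the pipeline's output is not specified here.

```python
from typing import Optional, Sequence


def _cell(value: object) -> str:
    """Stringify a cell and escape pipes so they do not break the table."""
    return str(value).replace("|", "\\|")


def to_markdown_table(
    headers: Sequence[object],
    rows: Sequence[Sequence[object]],
    title: Optional[str] = None,
) -> str:
    lines = []
    if title:
        lines.append(f"Title: {title}")
        lines.append("")
    lines.append("| " + " | ".join(_cell(h) for h in headers) + " |")
    lines.append("| " + " | ".join("---" for _ in headers) + " |")
    for row in rows:
        if len(row) != len(headers):
            raise ValueError("row length must match header count")
        lines.append("| " + " | ".join(_cell(v) for v in row) + " |")
    return "\n".join(lines)


# Made-up sample data for illustration only.
table = to_markdown_table(["Quarter", "Revenue"], [["Q1", 10], ["Q2", 12.5]])
print(table)
```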

Performance characteristics

The system maintains high accuracy across diverse chart formats encountered in enterprise environments. Processing time varies by chart complexity. The pipeline includes built-in validation steps to ensure data quality and consistency across the extraction process.
