
MLLMs for Chart and Data Understanding: Reading Graphs Like a Human

Multimodal LLMs can now read charts, extract data from graphs, and answer questions about visualizations. Here's how well they actually work, where they fail, and how to use them effectively.


Someone sends you a screenshot of a dashboard with 6 charts. You need to extract the key numbers, identify trends, and write a summary. A year ago, you’d do this manually. Today, multimodal LLMs can do it, with caveats.

What MLLMs Can Actually Do with Charts

The Good

Trend identification: “Sales are trending upward in Q3 and Q4.” Models get this right 90%+ of the time. They understand the visual language of line charts, bar charts, and most common visualization types.

Relative comparisons: “Category A is significantly larger than Category B.” Reliable for obvious visual differences. Models can rank bars by height, compare line slopes, and identify the largest slice in a pie chart.

Chart type recognition: Models correctly identify bar charts, line charts, scatter plots, pie charts, heatmaps, and most standard visualization types.

Qualitative reading: “Revenue peaked in July then declined sharply.” Models extract narrative-level insights well, similar to how a non-expert human would describe a chart.

The Unreliable

Exact value extraction: “What is the exact value for March?” Models frequently get this wrong, especially when the chart lacks data labels. They estimate values from axis positions but with significant error (±10–20% is common).

Dense charts: Charts with many overlapping lines, small labels, or tight spacing confuse models. A line chart with 3 lines is fine; one with 12 lines produces errors.

Axis reading: Models sometimes misread axis scales, especially logarithmic axes, dual axes, or axes that don’t start at zero. This is a major source of incorrect numerical claims.

Small differences: “Is the Q2 bar slightly taller than Q1?” When differences are subtle (less than ~15% visually), models are unreliable.
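One way to see how large the value-extraction error is on your own charts is to compare a model's readings against a chart whose source values you already know. The sketch below assumes you have both sets of numbers as plain dictionaries; the function names, the sample values, and the 10% tolerance are illustrative choices, not part of any library.

```python
def relative_errors(extracted, truth):
    """Return {label: relative error} for labels present in both mappings."""
    errors = {}
    for label, true_value in truth.items():
        if label not in extracted or true_value == 0:
            continue  # skip missing points and avoid dividing by zero
        errors[label] = abs(extracted[label] - true_value) / abs(true_value)
    return errors


def flag_unreliable(errors, tolerance=0.10):
    """Return the labels whose relative error exceeds the tolerance."""
    return sorted(label for label, err in errors.items() if err > tolerance)


# Hypothetical numbers: what the model read vs. the real source values.
extracted = {"Jan": 118.0, "Feb": 131.0, "Mar": 160.0}
truth = {"Jan": 120.0, "Feb": 130.0, "Mar": 140.0}

errors = relative_errors(extracted, truth)
print(flag_unreliable(errors))  # prints ['Mar']
```

Running a check like this on a handful of charts with known data gives a rough sense of how far to trust unlabeled values from a particular model.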

Model Comparison (March 2026)

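| Capability           | GPT-5 Vision | Claude 4  | Gemini 2.5 Ultra |
|----------------------|--------------|-----------|------------------|
| Trend identification | Excellent    | Excellent | Excellent        |
| Value extraction     | Good         | Good      | Very Good        |
| Complex charts       | Good         | Very Good | Good             |
| Chart reasoning      | Very Good    | Excellent | Good             |
| Table extraction     | Excellent    | Excellent | Very Good        |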

Gemini’s edge on value extraction likely comes from its training data, since Google has extensive chart/data pairs. Claude tends to be more cautious, hedging when it’s uncertain rather than guessing, which is preferable for accuracy-critical applications.

Prompting for Chart Understanding

Basic: Get the Facts

Look at this chart and tell me:
1. What type of chart is this?
2. What are the axes?
3. What are the approximate values for each data point?
4. What is the main trend or insight?

Better: Structured Extraction

Extract data from this chart into a markdown table.
For each data point, provide:
- Category/date label (from x-axis)
- Value (estimated from y-axis)
- Confidence: HIGH if the value is labeled, MEDIUM if clearly readable
  from the axis, LOW if estimated

Format as a table. If you're uncertain about any value, say so.
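If the model follows the prompt above, its reply can be turned into records and filtered by confidence. This is a minimal sketch that assumes the reply contains a single pipe-delimited table with columns named Label, Value, and Confidence; the prompt does not fix those names, so adjust them to whatever the model returns.

```python
def parse_markdown_table(text):
    """Parse pipe-delimited table rows in `text` into a list of dicts."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not (line.startswith("|") and line.endswith("|")):
            continue  # ignore prose around the table
        rows.append([cell.strip() for cell in line[1:-1].split("|")])
    if len(rows) < 2:
        return []
    header = [name.lower() for name in rows[0]]
    records = []
    for cells in rows[1:]:
        if all(set(cell) <= set("-: ") for cell in cells):
            continue  # skip the |---|---| separator row
        if len(cells) != len(header):
            continue  # skip malformed rows rather than guessing
        records.append(dict(zip(header, cells)))
    return records


# Hypothetical model reply.
reply = """Here is the data:

| Label | Value | Confidence |
|-------|-------|------------|
| Jan   | 120   | HIGH       |
| Feb   | 135   | LOW        |
"""

records = parse_markdown_table(reply)
needs_review = [r["label"] for r in records if r["confidence"] != "HIGH"]
print(needs_review)  # prints ['Feb']
```

Anything not marked HIGH can then be routed to a human check instead of being used directly.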

Best: Task-Specific with Context

This is a monthly revenue chart for our SaaS product.
I need you to:
1. Extract monthly values into a table (estimate is fine, note uncertainty)
2. Calculate month-over-month growth rates
3. Identify any anomalies (months that deviate from the trend)
4. Summarize in 2-3 sentences for an executive audience

Our fiscal year starts in April. Revenue should be in the $2-5M range
per month based on our business size.

Providing context (expected ranges, domain) helps the model calibrate its value estimates and catch its own errors.
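The same expected range can also be enforced in code after extraction, and the growth rates can be computed directly from the extracted values rather than taken from the model's own arithmetic. The sketch below is one possible approach; the function names and revenue figures are hypothetical, with values in millions of dollars.

```python
def out_of_range(values, low, high):
    """Return labels whose value falls outside the expected [low, high] range."""
    return [label for label, value in values.items() if not low <= value <= high]


def month_over_month_growth(values):
    """Percent change between consecutive entries, in insertion order."""
    labels = list(values)
    growth = {}
    for prev, curr in zip(labels, labels[1:]):
        if values[prev] == 0:
            continue  # growth from zero is undefined
        growth[curr] = (values[curr] - values[prev]) / values[prev] * 100
    return growth


# Hypothetical extracted revenue, in millions of dollars.
revenue = {"Apr": 2.0, "May": 2.5, "Jun": 25.0}

print(out_of_range(revenue, low=2.0, high=5.0))  # prints ['Jun']
print(month_over_month_growth(revenue))
```

Here the June value is ten times the expected ceiling, which usually points to a misread axis or a misplaced decimal rather than a real spike.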

Building a Chart Analysis Pipeline

For automated chart analysis at scale:

import asyncio


async def call_mllm(image_path: str, prompt: str) -> str:
    """Hypothetical helper: send the image and prompt to your MLLM provider
    and return the text of its reply. Replace the body with a real API call."""
    raise NotImplementedError("connect this to your provider's vision API")


async def analyze_chart(image_path, chart_context=""):
    # Step 1: Extract raw data as JSON
    extraction_prompt = f"""
    Extract all data from this chart into structured format.
    Context: {chart_context}
    Return as JSON with fields: chart_type, title, x_axis, y_axis,
    data_points (list of {{label, value, confidence}})
    """
    raw_data = await call_mllm(image_path, extraction_prompt)

    # Step 2: Validate the extraction against the same image
    validation_prompt = f"""
    I extracted this data from a chart: {raw_data}
    Looking at the chart again, check:
    - Are the values reasonable given the axis scale?
    - Did I miss any data points?
    - Are there any obvious errors?
    Return corrected data if needed.
    """
    validated_data = await call_mllm(image_path, validation_prompt)

    # Step 3: Generate insights from the validated data
    insight_prompt = f"""
    Based on this extracted data: {validated_data}
    Provide 3-5 key insights. Focus on trends, anomalies,
    and actionable observations.
    """
    insights = await call_mllm(image_path, insight_prompt)

    return {"data": validated_data, "insights": insights}


# Usage, once call_mllm is implemented:
# result = asyncio.run(analyze_chart("chart.png", "Monthly revenue, USD"))

The two-pass approach (extract, then validate) significantly improves accuracy. The model catches many of its own errors when explicitly asked to double-check.
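To see what the validation pass actually changed, the two replies can be parsed and compared. The following sketch assumes both replies contain a JSON object with the `data_points` list of `label` and `value` fields requested in the extraction prompt; the helper names and sample replies are made up for illustration.

```python
import json


def parse_json_reply(reply):
    """Parse the outermost {...} span of a reply that may include extra prose."""
    start = reply.find("{")
    end = reply.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in reply")
    return json.loads(reply[start:end + 1])


def changed_points(first_pass, second_pass):
    """Return {label: (old, new)} for points added or altered by the second pass."""
    old = {p["label"]: p["value"] for p in first_pass["data_points"]}
    new = {p["label"]: p["value"] for p in second_pass["data_points"]}
    return {label: (old.get(label), value)
            for label, value in new.items() if old.get(label) != value}


# Hypothetical replies from the extraction and validation steps.
extraction = ('Here is the data: {"data_points": ['
              '{"label": "Q1", "value": 40}, {"label": "Q2", "value": 55}]}')
validation = ('{"data_points": [{"label": "Q1", "value": 40}, '
              '{"label": "Q2", "value": 50}, {"label": "Q3", "value": 62}]}')

changes = changed_points(parse_json_reply(extraction), parse_json_reply(validation))
print(changes)  # prints {'Q2': (55, 50), 'Q3': (None, 62)}
```

Logging these differences over many charts shows how often the second pass corrects something, which helps decide whether the extra call is worth its cost.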

Handling Complex Dashboards

Real-world dashboards contain multiple charts, KPI cards, tables, and text. For these:

Crop individual charts. Processing one chart at a time is more accurate than asking the model to interpret an entire dashboard. Use simple image cropping to isolate each visualization.

Provide layout context. “This is the top-left chart from a sales dashboard. The dashboard covers Q1 2026 performance.” Layout context helps the model interpret ambiguous labels.

Process KPI cards separately. Text-heavy elements (big numbers, metric cards) are easier for models to read accurately. Extract these first to establish baseline numbers that inform chart interpretation.
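The cropping and context steps above might look like the sketch below. It assumes the dashboard is laid out as a regular grid, which real dashboards often are not, and it uses the third-party Pillow library for the actual crop. The function names and the KPI value are hypothetical.

```python
def grid_boxes(width, height, rows, cols):
    """Split an image into rows x cols boxes of (left, upper, right, lower)."""
    boxes = []
    for r in range(rows):
        for c in range(cols):
            boxes.append((c * width // cols, r * height // rows,
                          (c + 1) * width // cols, (r + 1) * height // rows))
    return boxes


def crop_charts(image_path, rows, cols):
    """Crop each grid cell with Pillow (third-party, imported lazily)."""
    from PIL import Image  # requires: pip install Pillow
    with Image.open(image_path) as img:
        return [img.crop(box) for box in grid_boxes(img.width, img.height, rows, cols)]


def build_chart_context(position, dashboard, period, kpis=None):
    """Combine layout context and KPI baseline numbers into one context string."""
    parts = [f"This is the {position} chart from a {dashboard} dashboard.",
             f"The dashboard covers {period} performance."]
    if kpis:
        kpi_text = "; ".join(f"{name}: {value}" for name, value in kpis.items())
        parts.append(f"KPI cards on the same dashboard show: {kpi_text}.")
    return " ".join(parts)


boxes = grid_boxes(1200, 800, rows=2, cols=3)
context = build_chart_context("top-left", "sales", "Q1 2026",
                              {"Total revenue": "$9.4M"})
print(boxes[0])  # prints (0, 0, 400, 400)
print(context)
```

For irregular layouts, the boxes would need to be specified by hand or detected separately; the context string can be passed as the `chart_context` argument of a pipeline like the one shown earlier.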

Limitations to Remember

  1. Don’t trust exact numbers unless the chart has data labels. Always verify critical values.
  2. Color-coded information can be unreliable: models sometimes confuse similar colors in legends.
  3. 3D charts are harder for models (and humans). Ask the chart creator for a 2D version.
  4. Interactive charts in screenshots lose hover/tooltip information. Screenshot with tooltips visible if possible.
  5. Low resolution kills accuracy. Use high-resolution screenshots (2x or retina resolution).
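The resolution point can be checked automatically before an image is sent to a model. As a rough sketch for PNG screenshots, the width and height can be read from the file header using only the standard library. The minimum dimensions below are an arbitrary assumption, not a documented threshold for any model.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data):
    """Read (width, height) from the IHDR chunk at the start of PNG bytes."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # Width and height are big-endian 32-bit integers at byte offsets 16 and 20.
    return struct.unpack(">II", data[16:24])


def is_high_enough_resolution(data, min_width=1200, min_height=800):
    """Return True if the PNG meets the (assumed) minimum size."""
    width, height = png_dimensions(data)
    return width >= min_width and height >= min_height


# Synthetic 24-byte header for a 640x360 image, used here in place of a real file.
header = PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR" + struct.pack(">II", 640, 360)
print(png_dimensions(header))             # prints (640, 360)
print(is_high_enough_resolution(header))  # prints False
```

With a real file, reading the first 24 bytes in binary mode (`open(path, "rb").read(24)`) is enough for this check.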

When to Use MLLMs vs. Structured Data

If you have access to the underlying data (CSV, database), always use that for exact analysis. MLLMs reading charts are estimating from pixels, which is useful when you only have the image but inferior to actual data.

The sweet spot for MLLM chart analysis: processing screenshots from reports, slides, and dashboards where you don’t have access to the source data. For your own charts, query the data directly.
