Be Specific with Criteria
Criteria should describe concrete, observable characteristics of the document: things you would actually see on the page.
Make Categories Mutually Exclusive
Design your categories so that a given document clearly belongs to one category over the others. If categories overlap significantly, classification accuracy will suffer.
Text vs. Image-Based Classification
Classify works with both text-heavy and visually distinct documents. Your criteria can reference either textual content or visual characteristics:
- Text-based criteria: "contains the words 'Terms and Conditions'", "includes a table of financial figures"
- Visual/structural criteria: "has a photo ID section", "contains handwritten notes", "includes a signature block"
Visual criteria can also describe layout, for example "contains a machine-readable zone at the bottom" or "has a photo in the upper-left corner".
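Text-based and visual criteria can be mixed in one schema. The sketch below assumes a schema is a list of categories, each with a name and a list of criteria strings. The field names "name" and "criteria" and the helper find_shared_criteria are hypothetical and chosen for illustration, not the confirmed Classify request format. The helper flags criteria reused verbatim across categories, which is only a rough proxy for overlap.

```python
from collections import defaultdict

# Hypothetical schema layout: "name" and "criteria" are assumed field names,
# not the confirmed Classify request format.
schema = [
    {
        "name": "contract",
        "criteria": [
            "contains the words 'Terms and Conditions'",  # text-based
            "includes a signature block",                 # visual/structural
        ],
    },
    {
        "name": "passport",
        "criteria": [
            "has a photo in the upper-left corner",            # visual/structural
            "contains a machine-readable zone at the bottom",  # visual/structural
        ],
    },
]


def find_shared_criteria(categories):
    """Map each criterion used by more than one category to those category names."""
    owners = defaultdict(set)
    for category in categories:
        for criterion in category["criteria"]:
            # Normalize case and whitespace so trivial variants still match.
            owners[criterion.strip().lower()].add(category["name"])
    return {c: sorted(names) for c, names in owners.items() if len(names) > 1}


shared = find_shared_criteria(schema)
if shared:
    print("Criteria reused across categories:", shared)
else:
    print("No criterion is reused verbatim across categories.")
```

An exact-match check like this will not catch criteria that are worded differently but describe the same feature, so it complements a manual review rather than replacing it.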
Use Enough Categories
You must provide at least two categories. Classify returns the best match from your schema, so even if none of the categories are a perfect fit, it will return the closest one. If you need an escape hatch, add a catch-all category.
Use Confidence Scores to Refine Your Schema
The response confidence breakdown tells you exactly which criteria matched or didn't for every category. Use this to iterate on your schema:
- Run Classify on a batch of sample documents.
- Check documents where the winning category had low confidence (e.g., below 0.7).
- Inspect the criteria_confidence to see which criteria are too broad, too narrow, or overlapping with other categories.
- Adjust your criteria and re-run until confidence scores improve.
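The review steps above can be sketched as a small filter over a batch of results. This is a minimal sketch, not the Classify client. It assumes each result is a dictionary with "document", "category", and "confidence" keys, and that criteria_confidence maps each criterion string to a score between 0 and 1. Those shapes, the function name flag_for_review, and the sample data are all hypothetical, so check the Response Format page for the actual fields.

```python
LOW_CONFIDENCE = 0.7  # example threshold mentioned in the steps above


def flag_for_review(results, threshold=LOW_CONFIDENCE):
    """Return low-confidence results with their criteria sorted weakest first."""
    flagged = []
    for result in results:
        if result["confidence"] >= threshold:
            continue  # confident win, nothing to inspect
        # Assumed shape: criteria_confidence maps criterion text -> score.
        weakest_first = sorted(
            result["criteria_confidence"].items(), key=lambda item: item[1]
        )
        flagged.append(
            {
                "document": result["document"],
                "category": result["category"],
                "confidence": result["confidence"],
                "weakest_criteria": weakest_first,
            }
        )
    return flagged


# Made-up sample data; real responses will differ in shape and values.
sample_results = [
    {
        "document": "doc_001.pdf",
        "category": "invoice",
        "confidence": 0.93,
        "criteria_confidence": {"includes a table of financial figures": 0.95},
    },
    {
        "document": "doc_002.pdf",
        "category": "contract",
        "confidence": 0.55,
        "criteria_confidence": {
            "contains the words 'Terms and Conditions'": 0.80,
            "includes a signature block": 0.30,
        },
    },
]

for item in flag_for_review(sample_results):
    print(item["document"], item["category"], item["weakest_criteria"][0])
```

Sorting the criteria weakest first puts the likeliest candidates for rewording at the top of each flagged entry.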
Related
Classify Overview
Quick start, request parameters, and pipeline integration.
Response Format
Confidence scores, per-criterion reasoning, and all response fields.