Page Range
By default, Classify uses the first 5 pages of a document as context for classification. For most documents, the first few pages contain enough information to determine the document type (cover pages, headers, introductory sections). You can increase context up to 10 pages using the page_range parameter when distinguishing content appears deeper in the document.
- Page numbers are 1-indexed (first page is page 1).
- Both start and end are inclusive.
- If no page_range is specified, the first 5 pages are used.
- If more than 10 pages are selected, the request returns an error.
- Only applies to PDFs. Ignored for other document types.
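The rules above can be checked on the client before sending a request. The following is a minimal sketch, not part of the Classify API: it assumes page_range is passed as a mapping with start and end keys, and the helper name pages_selected is hypothetical.

```python
from typing import Optional

DEFAULT_PAGES = 5   # pages used when no page_range is given
MAX_PAGES = 10      # selecting more than this returns an error


def pages_selected(page_range: Optional[dict] = None) -> int:
    """Return how many pages a page_range selects, or raise ValueError.

    Assumes a {"start": int, "end": int} shape (an assumption, not a
    confirmed request format).
    """
    if page_range is None:
        return DEFAULT_PAGES
    start, end = page_range["start"], page_range["end"]
    if start < 1:
        raise ValueError("page numbers are 1-indexed; start must be >= 1")
    if end < start:
        raise ValueError("end must be >= start")
    count = end - start + 1  # both bounds are inclusive
    if count > MAX_PAGES:
        raise ValueError(f"{count} pages selected; the maximum is {MAX_PAGES}")
    return count
```

For example, pages_selected({"start": 3, "end": 12}) selects exactly 10 pages and passes, while {"start": 1, "end": 11} selects 11 and is rejected locally instead of by the API.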
Each page of context costs 0.5 credits. Using the default 5 pages costs 2.5 credits per classification. Increasing to 10 pages costs 5.0 credits. Only increase when the default pages don't contain enough distinguishing content. See Credit Usage for details.
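The pricing arithmetic can be written out as a small helper for estimating batch costs. This is an illustrative sketch: the function name classification_credits is hypothetical, and it only encodes the per-page rate and the 10-page limit stated above.

```python
CREDITS_PER_PAGE = 0.5  # rate stated in this section


def classification_credits(pages: int = 5, documents: int = 1) -> float:
    """Estimate credits for classifying `documents` files at `pages` pages each."""
    if not 1 <= pages <= 10:
        raise ValueError("pages of context must be between 1 and 10")
    if documents < 0:
        raise ValueError("documents must be non-negative")
    return pages * CREDITS_PER_PAGE * documents


default_cost = classification_credits()           # 5 pages of context
extended_cost = classification_credits(pages=10)  # 10 pages of context
batch_cost = classification_credits(pages=10, documents=1000)
```

Doubling the context from 5 to 10 pages doubles the cost of every classification, which is why the larger range is worth using only for documents whose distinguishing content sits past page 5.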
Classification Schema
The classification_schema parameter defines which categories Classify can return. Each category needs a name and a list of criteria.
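As a rough illustration of that shape, the sketch below builds a schema as a list of categories, each with a name and a list of criteria, and checks it locally. The key names name and criteria follow the description above but are an assumption about the exact request format; the category names and the validate_schema helper are hypothetical.

```python
def validate_schema(schema: list) -> None:
    """Raise ValueError unless every category has a name and at least one criterion."""
    seen = set()
    for category in schema:
        name = category.get("name")
        criteria = category.get("criteria")
        if not isinstance(name, str) or not name.strip():
            raise ValueError("each category needs a non-empty name")
        if name in seen:
            raise ValueError(f"duplicate category name: {name}")
        seen.add(name)
        if not isinstance(criteria, list) or not criteria:
            raise ValueError(f"category {name!r} needs a list of criteria")


# Hypothetical two-category schema.
classification_schema = [
    {
        "name": "invoice",
        "criteria": [
            "Has a header with 'INVOICE' or invoice number",
            "Contains a table of itemized charges with quantities and unit prices",
        ],
    },
    {
        "name": "contract",
        "criteria": ["Includes signature blocks for multiple parties"],
    },
]

validate_schema(classification_schema)  # passes silently
```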
Writing effective criteria
Criteria are natural language descriptions that tell the model what to look for. More specific criteria produce better results. Good criteria describe observable features:
- "Contains a table of itemized charges with quantities and unit prices"
- "Includes signature blocks for multiple parties"
- "Has a header with 'INVOICE' or invoice number"
Avoid vague criteria that could match almost any document:
- "Business document"
- "Has text"
- "Contains information"
Example: Financial document routing
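The following is a hypothetical sketch of what such a setup might look like, not the original example from this page. The category names, criteria wording, queue names, and the route helper are all assumptions made for illustration; only the name-plus-criteria shape comes from the schema description above.

```python
# Hypothetical schema for three financial document types.
financial_schema = [
    {
        "name": "invoice",
        "criteria": [
            "Has a header with 'INVOICE' or invoice number",
            "Contains a table of itemized charges with quantities and unit prices",
        ],
    },
    {
        "name": "bank_statement",
        "criteria": [
            "Lists dated transactions with a running account balance",
            "Shows an account number and a statement period",
        ],
    },
    {
        "name": "tax_form",
        "criteria": [
            "Shows a form number and a tax year",
            "Contains labeled boxes for income and withholding amounts",
        ],
    },
]

# Hypothetical downstream queues, one per category.
ROUTES = {
    "invoice": "accounts_payable",
    "bank_statement": "reconciliation",
    "tax_form": "tax_filing",
}


def route(category: str) -> str:
    """Map a returned category name to a queue, falling back to manual review."""
    return ROUTES.get(category, "manual_review")


# Every category in the schema should have a destination.
unrouted = [c["name"] for c in financial_schema if c["name"] not in ROUTES]
```

Keeping the routing table separate from the schema means a new category can be added to the schema first and sent to manual review until its downstream handling is ready.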
Related
Classify Overview
Introduction to document classification.
Chaining API Calls
Route classified documents to Parse and Extract.
Credit Usage
Classification pricing details.