Sample Documents
- ID Card
- Utility Bill
- W-9 Form

- Name: IMA CARDHOLDER
- Address: 2570 24TH STREET, ANYTOWN, CA 95818
- DOB: 08/31/1977
- DL Number: 11234568
Download samples: id-card.png | utility-bill.pdf | w9-form.pdf
- Name case: "IMA CARDHOLDER" (ID) vs "Ima Cardholder" (W-9)
- City spelling: "ANYTOWN" (ID) vs "Andytown" (utility bill)
- Street format: "24TH STREET" vs "24th Street"
Create API Key
Open Studio
Go to studio.reducto.ai and sign in. From the home page, click API Keys in the left sidebar.

View API Keys
The API Keys page shows your existing keys. Click + Create new API key in the top right corner.

Configure Key
In the modal, enter a name for your key and set an expiration policy (or select "Never" for no expiration). Click Create.
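Once the key exists, one common way to keep it out of source code is to read it from an environment variable. The sketch below assumes the variable is called `REDUCTO_API_KEY`; that name and the helper `load_api_key` are illustrative choices, not something this guide specifies.

```python
import os


def load_api_key(env=None, var_name="REDUCTO_API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    `var_name` is an assumed variable name; adjust it to your setup.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set {var_name} to the key created in Studio")
    return key
```

Passing a plain dict as `env` makes the helper easy to exercise in tests without touching the real environment.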

Verification Workflow
Step 1: Define Extraction Schemas
Each document type needs a tailored schema. The key is writing good field descriptions that tell the LLM where to find each value.
ID Card Schema
Government IDs have structured layouts with clear field labels. We extract both identity fields and the ID's validity period.
- full_name and first_name/last_name: Extract both because other documents may format names differently
- date_of_birth format: Request YYYY-MM-DD for consistent date handling in code
- expiration_date: Critical for checking if the ID is still valid
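As a rough illustration, the ID card fields named above could be written as a JSON Schema held in a Python dict. The field names come from the list; the description wording, the `required` list, and the helper `id_is_valid` are assumptions made for this sketch.

```python
from datetime import date

# Sketch of an ID card schema; descriptions are illustrative wording.
ID_CARD_SCHEMA = {
    "type": "object",
    "properties": {
        "full_name": {
            "type": "string",
            "description": "Full name exactly as printed on the ID",
        },
        "first_name": {"type": "string", "description": "Given name on the ID"},
        "last_name": {"type": "string", "description": "Family name on the ID"},
        "date_of_birth": {
            "type": "string",
            "description": "Date of birth, formatted as YYYY-MM-DD",
        },
        "expiration_date": {
            "type": "string",
            "description": "Expiration date of the ID, formatted as YYYY-MM-DD",
        },
    },
    "required": ["full_name", "date_of_birth", "expiration_date"],
}


def id_is_valid(expiration_date: str, today: date) -> bool:
    """True if an ISO-formatted expiration date has not passed yet."""
    return date.fromisoformat(expiration_date) >= today
```

Requesting ISO dates in the schema is what lets `date.fromisoformat` parse the value directly.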
Utility Bill Schema
Utility bills prove current address. They vary more in layout than IDs, so field descriptions need to be more specific about what to extract.
- account_holder: This is what we match against the ID name
- service_address (not mailing address): The service address proves residence
- statement_date: Bills must be recent (typically within 90 days)
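A utility bill schema along these lines, plus the 90-day recency rule, might look like the sketch below. The descriptions, the YYYY-MM-DD format for `statement_date`, and the helper `is_recent` are assumptions for illustration.

```python
from datetime import date, timedelta

# Sketch of a utility bill schema; descriptions are illustrative wording.
UTILITY_BILL_SCHEMA = {
    "type": "object",
    "properties": {
        "account_holder": {
            "type": "string",
            "description": "Name of the account holder, not the utility company",
        },
        "service_address": {
            "type": "string",
            "description": "Address where service is provided, not the mailing address",
        },
        "statement_date": {
            "type": "string",
            "description": "Statement or bill date, formatted as YYYY-MM-DD",
        },
    },
    "required": ["account_holder", "service_address", "statement_date"],
}


def is_recent(statement_date: str, today: date, max_age_days: int = 90) -> bool:
    """True if the bill is dated within the last `max_age_days` days."""
    age = today - date.fromisoformat(statement_date)
    # A future-dated statement is rejected rather than treated as recent.
    return timedelta(0) <= age <= timedelta(days=max_age_days)
```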
W-9 Tax Form Schema
W-9s have a fixed IRS layout. Field descriptions reference specific line numbers to help the LLM locate values.
- city_state_zip as one field: W-9 Line 6 combines these, so we extract them together and parse later
- Line number references: "Line 1", "Line 5", "Line 6" help the LLM find the right fields on the standardized IRS form
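The sketch below shows one way to express a W-9 schema with line-number hints and to do the "parse later" step for `city_state_zip`. Only `city_state_zip` is named in the text; the field names `name` and `address`, the regular expression, and `parse_city_state_zip` are assumptions for this example.

```python
import re
from typing import NamedTuple, Optional

# Sketch of a W-9 schema; descriptions point the model at IRS line numbers.
W9_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string", "description": "Line 1: name of the taxpayer"},
        "address": {"type": "string", "description": "Line 5: street address"},
        "city_state_zip": {
            "type": "string",
            "description": "Line 6: city, state, and ZIP code as written",
        },
    },
    "required": ["name", "address", "city_state_zip"],
}


class CityStateZip(NamedTuple):
    city: str
    state: str
    zip_code: str


# City (lazy), optional comma, two-letter state, 5-digit or ZIP+4 code.
_CSZ = re.compile(
    r"^\s*(?P<city>.+?),?\s+(?P<state>[A-Za-z]{2})\s+(?P<zip>\d{5}(?:-\d{4})?)\s*$"
)


def parse_city_state_zip(value: str) -> Optional[CityStateZip]:
    """Split a combined Line 6 value; return None if it does not fit the pattern."""
    match = _CSZ.match(value)
    if match is None:
        return None
    return CityStateZip(match["city"].strip(), match["state"].upper(), match["zip"])
```

Returning `None` for unparseable input leaves the decision about how to handle a malformed line to the caller.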
Step 2: Extract from All Documents
Upload each document and run extraction with the appropriate schema. Reducto handles both image files (ID card) and PDFs (utility bill, W-9) with the same API.
Extraction Results
From our sample documents:
- Name: "IMA CARDHOLDER" vs "Ima Cardholder" (case difference)
- City: "ANYTOWN" vs "Andytown" (case + typo)
- Street: "24TH STREET" vs "24th Street" (case difference)
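The extraction step can be pictured as a loop that pairs each sample file with its schema. In the sketch below, `extract_fn` is a hypothetical callable standing in for the actual upload-and-extract calls, which are not reproduced here; the file names are the sample downloads listed earlier.

```python
from typing import Any, Callable, Dict

# Sample files from this guide, keyed by document type.
DOCUMENTS: Dict[str, str] = {
    "id_card": "id-card.png",
    "utility_bill": "utility-bill.pdf",
    "w9": "w9-form.pdf",
}


def extract_all(
    extract_fn: Callable[[str, Dict[str, Any]], Dict[str, Any]],
    schemas: Dict[str, Dict[str, Any]],
    documents: Dict[str, str] = DOCUMENTS,
) -> Dict[str, Dict[str, Any]]:
    """Run `extract_fn(path, schema)` for every document and collect the results.

    `extract_fn` is a placeholder for whatever performs upload + extraction.
    """
    results = {}
    for doc_type, path in documents.items():
        results[doc_type] = extract_fn(path, schemas[doc_type])
    return results
```

Injecting the extraction call keeps the loop testable offline and independent of any particular client version.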
Step 3: Normalize and Compare
Extracted data won't match exactly across documents. Here's what we see:

| Field | ID Card | Utility Bill | W-9 |
|---|---|---|---|
| Name | IMA CARDHOLDER | IMA CARDHOLDER | Ima Cardholder |
| City | ANYTOWN | Andytown | Andytown |
| Street | 2570 24TH STREET | 2570 24th Street | 2570 24th Street |
Normalization Functions
Normalization standardizes these variations:
- Uppercase everything
- Convert abbreviations ("STREET" → "ST")
- Remove punctuation
- Collapse extra whitespace
Applied to the sample values:
- "IMA CARDHOLDER" → "IMA CARDHOLDER"
- "Ima Cardholder" → "IMA CARDHOLDER" (match)
- "2570 24TH STREET" → "2570 24TH ST"
- "2570 24th Street" → "2570 24TH ST" (match)
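The four rules above can be sketched in a few lines of Python. The function names and the abbreviation table are assumptions for illustration; only the "STREET" to "ST" mapping comes from the text, and the other entries are examples of what a fuller table might contain.

```python
import re

# "STREET" -> "ST" is from the guide; the other entries are illustrative.
ABBREVIATIONS = {"STREET": "ST", "AVENUE": "AVE", "ROAD": "RD"}


def normalize_text(value: str) -> str:
    """Uppercase, strip punctuation, and collapse runs of whitespace."""
    upper = value.upper()
    no_punct = re.sub(r"[^\w\s]|_", "", upper)
    return " ".join(no_punct.split())


def normalize_address(value: str) -> str:
    """Normalize text, then replace whole words found in ABBREVIATIONS."""
    words = normalize_text(value).split()
    return " ".join(ABBREVIATIONS.get(word, word) for word in words)


print(normalize_text("Ima Cardholder"))       # IMA CARDHOLDER
print(normalize_address("2570 24th Street"))  # 2570 24TH ST
```

Punctuation is removed before abbreviations are applied so that a token like "Street," still maps to "ST". Abbreviations are applied to addresses only, so a surname such as "Street" is left alone.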
Why Fuzzy Matching?
Even after normalization, OCR errors and typos happen. "ANYTOWN" vs "ANDYTOWN" is a single-character difference. It's likely the same city, not a fraudulent mismatch. Fuzzy matching with an 85% similarity threshold catches these while rejecting genuine mismatches.
Step 4: Verification Strategy
Our verification uses two tiers of checks:
Critical checks (must pass):
- Name match - Name must match across all three documents
- Address match - Address must match (street, state, ZIP)
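A minimal sketch of the critical checks, combined with the 85% fuzzy threshold discussed earlier, is shown below. The guide does not name a similarity metric, so `difflib.SequenceMatcher` from the standard library is used as a stand-in. The function names, the input shapes, and the choice to compare state and ZIP exactly (fuzzy matching only the street) are assumptions. Inputs are expected to be normalized already.

```python
from difflib import SequenceMatcher
from itertools import combinations
from typing import Dict, List

THRESHOLD = 0.85  # similarity threshold from the guide


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1]; a stand-in metric, not necessarily the guide's."""
    return SequenceMatcher(None, a, b).ratio()


def all_similar(values: List[str], threshold: float = THRESHOLD) -> bool:
    """True if every pair of values meets the similarity threshold."""
    return all(similarity(a, b) >= threshold for a, b in combinations(values, 2))


def critical_checks(names: List[str], addresses: List[Dict[str, str]]) -> Dict[str, bool]:
    """Evaluate the must-pass checks over values taken from each document.

    Each address is a dict with "street", "state", and "zip" keys.
    """
    name_ok = all_similar(names)
    street_ok = all_similar([a["street"] for a in addresses])
    # Assumption: short codes are compared exactly, since one changed
    # character in a state or ZIP denotes a different place.
    state_ok = len({a["state"] for a in addresses}) == 1
    zip_ok = len({a["zip"] for a in addresses}) == 1
    address_ok = street_ok and state_ok and zip_ok
    return {
        "name_match": name_ok,
        "address_match": address_ok,
        "passed": name_ok and address_ok,
    }
```

With this metric, "ANYTOWN" and "ANDYTOWN" score 14/15, roughly 0.93, which clears the threshold, while unrelated strings fall well below it.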
