Sample Documents
- ID Card
- Utility Bill
- W-9 Form

- Name: IMA CARDHOLDER
- Address: 2570 24TH STREET, ANYTOWN, CA 95818
- DOB: 08/31/1977
- DL Number: 11234568
Download samples: id-card.png | utility-bill.pdf | w9-form.pdf
- Name case: "IMA CARDHOLDER" (ID) vs "Ima Cardholder" (W-9)
- City spelling: "ANYTOWN" (ID) vs "Andytown" (utility bill)
- Street format: "24TH STREET" vs "24th Street"
Create API Key
Open Studio
Go to studio.reducto.ai and sign in. From the home page, click API Keys in the left sidebar.

View API Keys
The API Keys page shows your existing keys. Click + Create new API key in the top right corner.

Configure Key
In the modal, enter a name for your key and set an expiration policy (or select "Never" for no expiration). Click Create.

Verification Workflow
Step 1: Define Extraction Schemas
Each document type needs a tailored schema. The key is writing good field descriptions that tell the LLM where to find each value.
ID Card Schema
Government IDs have structured layouts with clear field labels. We extract both identity fields and the ID's validity period.
- full_name and first_name/last_name: Extract both because other documents may format names differently
- date_of_birth format: Request YYYY-MM-DD for consistent date handling in code
- expiration_date: Critical for checking if the ID is still valid
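A minimal sketch of what such a schema might look like, written as a JSON Schema style Python dict. Only the field names `full_name`, `first_name`, `last_name`, `date_of_birth`, and `expiration_date` come from the text above; the description wording, the `required` list, and the `is_id_valid` helper are assumptions for illustration.

```python
from datetime import date

# Hypothetical ID card schema: field names follow the text, descriptions are illustrative.
ID_CARD_SCHEMA = {
    "type": "object",
    "properties": {
        "full_name": {
            "type": "string",
            "description": "Full name exactly as printed on the ID",
        },
        "first_name": {
            "type": "string",
            "description": "Given name as printed on the ID",
        },
        "last_name": {
            "type": "string",
            "description": "Family name as printed on the ID",
        },
        "date_of_birth": {
            "type": "string",
            "description": "Date of birth in YYYY-MM-DD format",
        },
        "expiration_date": {
            "type": "string",
            "description": "Expiration date of the ID in YYYY-MM-DD format",
        },
    },
    "required": ["full_name", "date_of_birth", "expiration_date"],
}


def is_id_valid(expiration_date: str, today: date) -> bool:
    """Return True if the ID has not expired as of `today`."""
    return date.fromisoformat(expiration_date) >= today
```

Requesting ISO dates in the schema is what lets the validity check reduce to a single `date.fromisoformat` call.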
Utility Bill Schema
Utility bills prove current address. They vary more in layout than IDs, so field descriptions need to be more specific about what to extract.
- account_holder: This is what we match against the ID name
- service_address (not mailing address): The service address proves residence
- statement_date: Bills must be recent (typically within 90 days)
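The utility bill fields could be expressed along the following lines. The field names come from the text above; the descriptions and the `is_statement_recent` helper (including its rejection of future-dated bills) are assumptions made for this sketch.

```python
from datetime import date, timedelta

# Hypothetical utility bill schema: descriptions are illustrative, not an official template.
UTILITY_BILL_SCHEMA = {
    "type": "object",
    "properties": {
        "account_holder": {
            "type": "string",
            "description": "Name of the customer the bill is issued to",
        },
        "service_address": {
            "type": "string",
            "description": (
                "Address where the service is provided, "
                "not the mailing or remittance address"
            ),
        },
        "statement_date": {
            "type": "string",
            "description": "Date the statement was issued, in YYYY-MM-DD format",
        },
    },
    "required": ["account_holder", "service_address", "statement_date"],
}


def is_statement_recent(statement_date: str, today: date, max_age_days: int = 90) -> bool:
    """Return True if the bill was issued within `max_age_days` before `today`."""
    age = today - date.fromisoformat(statement_date)
    # A negative age means the statement is dated in the future, which we reject.
    return timedelta(0) <= age <= timedelta(days=max_age_days)
```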
W-9 Tax Form Schema
W-9s have a fixed IRS layout. Field descriptions reference specific line numbers to help the LLM locate values.
- city_state_zip as one field: W-9 Line 6 combines these, so we extract them together and parse later
- Line number references: "Line 1", "Line 5", "Line 6" help the LLM find the right fields on the standardized IRS form
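As a sketch of the "extract together, parse later" approach: the schema below keeps `city_state_zip` as a single string, and a small helper splits it afterwards. The `name` and `street_address` field names, the description wording, and the `parse_city_state_zip` helper are hypothetical; the regular expression assumes a US-style "City, ST 12345" layout.

```python
import re
from typing import Dict, Optional

# Hypothetical W-9 schema: descriptions point at the form's line numbers.
W9_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {
            "type": "string",
            "description": "Line 1: name as shown on the income tax return",
        },
        "street_address": {
            "type": "string",
            "description": "Line 5: address (number, street, and apt. or suite no.)",
        },
        "city_state_zip": {
            "type": "string",
            "description": "Line 6: city, state, and ZIP code as a single string",
        },
    },
}

# City (lazy), optional comma, two-letter state, 5-digit ZIP with optional +4.
_CITY_STATE_ZIP = re.compile(
    r"^\s*(?P<city>.+?)\s*,?\s+(?P<state>[A-Za-z]{2})\s+(?P<zip>\d{5}(?:-\d{4})?)\s*$"
)


def parse_city_state_zip(value: str) -> Optional[Dict[str, str]]:
    """Split a combined Line 6 value into city, state, and ZIP; None if it does not parse."""
    match = _CITY_STATE_ZIP.match(value)
    if match is None:
        return None
    return {
        "city": match["city"].strip(),
        "state": match["state"].upper(),
        "zip": match["zip"],
    }
```

Returning `None` on a failed parse, rather than raising, leaves it to the caller to decide whether an unparseable Line 6 is a hard failure.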
Step 2: Extract from All Documents
Upload each document and run extraction with the appropriate schema. Reducto handles both image files (ID card) and PDFs (utility bill, W-9) with the same API.
Extraction Results
From our sample documents:
- Name: "IMA CARDHOLDER" vs "Ima Cardholder" (case difference)
- City: "ANYTOWN" vs "Andytown" (case + typo)
- Street: "24TH STREET" vs "24th Street" (case + abbreviation)
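The per-document extraction in Step 2 amounts to pairing each file with its schema and calling the same extraction routine for all of them. The sketch below shows only that control flow: `extract_fn` is a hypothetical stand-in for the real upload-and-extract call, and `fake_extract` and the trimmed schemas exist only so the example runs without network access.

```python
from typing import Any, Callable, Dict

Schema = Dict[str, Any]
# Assumed signature for the real extraction call: (file path, schema) -> extracted fields.
ExtractFn = Callable[[str, Schema], Dict[str, Any]]


def extract_all(
    documents: Dict[str, str],
    schemas: Dict[str, Schema],
    extract_fn: ExtractFn,
) -> Dict[str, Dict[str, Any]]:
    """Run `extract_fn` on every document using the schema for its document type."""
    results: Dict[str, Dict[str, Any]] = {}
    for doc_type, path in documents.items():
        if doc_type not in schemas:
            raise KeyError(f"No schema defined for document type {doc_type!r}")
        results[doc_type] = extract_fn(path, schemas[doc_type])
    return results


def fake_extract(path: str, schema: Schema) -> Dict[str, Any]:
    """Placeholder that returns each requested field as None; swap in a real API call."""
    return {field: None for field in schema["properties"]}


documents = {
    "id_card": "id-card.png",
    "utility_bill": "utility-bill.pdf",
    "w9": "w9-form.pdf",
}
# Trimmed-down schemas, just enough to exercise the loop.
schemas = {
    "id_card": {"properties": {"full_name": {}}},
    "utility_bill": {"properties": {"account_holder": {}}},
    "w9": {"properties": {"city_state_zip": {}}},
}
results = extract_all(documents, schemas, fake_extract)
```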
Step 3: Normalize and Compare
Extracted data won't match exactly across documents. Here's what we see:
| Field | ID Card | Utility Bill | W-9 |
|---|---|---|---|
| Name | IMA CARDHOLDER | IMA CARDHOLDER | Ima Cardholder |
| City | ANYTOWN | Andytown | Andytown |
| Street | 2570 24TH STREET | 2570 24th Street | 2570 24th Street |
Normalization Functions
Normalization standardizes these variations:
- Uppercase everything
- Convert abbreviations ("STREET" → "ST")
- Remove punctuation
- Collapse extra whitespace

For example:
- "IMA CARDHOLDER" → "IMA CARDHOLDER"
- "Ima Cardholder" → "IMA CARDHOLDER" (match)
- "2570 24TH STREET" → "2570 24TH ST"
- "2570 24th Street" → "2570 24TH ST" (match)
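The four normalization rules above can be sketched with the standard library alone. The function names and the abbreviation table are assumptions; only the STREET to ST mapping appears in the text, and the other entries are there to show how the table would grow.

```python
import string

# Illustrative abbreviation table; only STREET -> ST is taken from the text.
ABBREVIATIONS = {"STREET": "ST", "AVENUE": "AVE", "ROAD": "RD"}

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def normalize_text(value: str) -> str:
    """Uppercase, remove punctuation, and collapse runs of whitespace."""
    cleaned = value.upper().translate(_PUNCT_TABLE)
    return " ".join(cleaned.split())


def normalize_address(value: str) -> str:
    """Apply normalize_text, then replace whole words found in ABBREVIATIONS."""
    tokens = normalize_text(value).split()
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)
```

Abbreviations are replaced token by token rather than by substring, so a name such as "STREETER" is left alone.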
Why Fuzzy Matching?
Even after normalization, OCR errors and typos happen. "ANYTOWN" vs "ANDYTOWN" is a single-character difference. It's likely the same city, not a fraudulent mismatch. Fuzzy matching with an 85% similarity threshold catches these while rejecting genuine mismatches.
Step 4: Verification Strategy
Our verification uses two tiers of checks:
Critical checks (must pass):
- Name match - Name must match across all three documents
- Address match - Address must match (street, state, ZIP)
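One way the critical checks might be implemented with the 85% threshold from Step 3 is sketched below. It uses `difflib.SequenceMatcher` from the standard library as a stand-in similarity measure; the source does not say which algorithm it uses. The function names, the address dict keys (`street`, `state`, `zip`), and the choice to compare state and ZIP exactly are all assumptions.

```python
from difflib import SequenceMatcher
from itertools import combinations
from typing import Dict, Sequence

SIMILARITY_THRESHOLD = 0.85  # the 85% threshold mentioned in Step 3


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] after uppercasing and collapsing whitespace."""
    a_norm = " ".join(a.upper().split())
    b_norm = " ".join(b.upper().split())
    return SequenceMatcher(None, a_norm, b_norm).ratio()


def fuzzy_match(a: str, b: str, threshold: float = SIMILARITY_THRESHOLD) -> bool:
    """True if the two strings are at least `threshold` similar."""
    return similarity(a, b) >= threshold


def names_match(names: Sequence[str]) -> bool:
    """Critical check: every pair of names must fuzzily match."""
    return all(fuzzy_match(a, b) for a, b in combinations(names, 2))


def addresses_match(a: Dict[str, str], b: Dict[str, str]) -> bool:
    """Critical check: street compared fuzzily, state and ZIP compared exactly (assumed)."""
    return (
        fuzzy_match(a["street"], b["street"])
        and a["state"].upper() == b["state"].upper()
        and a["zip"] == b["zip"]
    )
```

With this measure, "ANYTOWN" and "ANDYTOWN" score 14/15 (about 0.93) and pass, while unrelated names fall well below the threshold.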
