Form schemas let you define exactly where form fields are located in a PDF, what type they are, and how they should be filled. Instead of relying on Edit to detect fields each time, you provide the field definitions upfront. This matters for two reasons: speed and consistency. With a form schema, Edit skips field detection and description generation, processing forms significantly faster. And because the same fields are targeted every time, you get deterministic results across thousands of form fills.

The Workflow

  1. Run Edit once without a form_schema to let Reducto detect all fields
  2. Save the returned form_schema from the response
  3. Use that schema for subsequent fills of the same form type
from reducto import Reducto
import json

client = Reducto()

# First run: let Edit detect fields
result = client.edit.run(
    document_url="https://example.com/w9-blank.pdf",
    edit_instructions="Fill with: Name: Test Corp, EIN: 12-3456789"
)

# Save the detected schema (convert Pydantic objects to dicts)
schema_dicts = [field.model_dump() for field in result.form_schema]
with open("w9_schema.json", "w") as f:
    json.dump(schema_dicts, f)

# Subsequent runs: much faster with saved schema
with open("w9_schema.json") as f:
    saved_schema = json.load(f)

result = client.edit.run(
    document_url="https://example.com/w9-blank.pdf",
    edit_instructions="Fill with: Name: Acme Inc, EIN: 98-7654321",
    form_schema=saved_schema
)
The first call runs field detection and generates descriptions. Subsequent calls with the schema skip those steps entirely.
| Scenario | Pipeline |
| --- | --- |
| Without form_schema | Detection → Context → Descriptions → Fill |
| With form_schema | Fill only |
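
The detect-once, reuse-afterwards workflow can be wrapped in a small caching helper. The sketch below is illustrative only: `load_or_detect_schema` and its `detect` callback are hypothetical names, not part of the Reducto SDK, and the helper assumes the schema has already been converted to plain dicts so it can be stored as JSON.

```python
import json
from pathlib import Path
from typing import Callable


def load_or_detect_schema(path: Path, detect: Callable[[], list]) -> list:
    """Return the schema cached at `path`; call `detect` only on a cache miss."""
    if path.exists():
        # Cache hit: reuse the saved field definitions.
        return json.loads(path.read_text())
    # Cache miss: run detection once and persist the result for later fills.
    schema = detect()
    path.write_text(json.dumps(schema, indent=2))
    return schema
```

In practice `detect` would wrap the first `client.edit.run(...)` call shown earlier and return `[field.model_dump() for field in result.form_schema]`.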

Schema Structure

A form schema is an array of field definitions:
form_schema = [
    {
        "bbox": {
            "left": 0.227,      # Distance from left edge (0-1)
            "top": 0.144,       # Distance from top edge (0-1)
            "width": 0.15,      # Width as fraction of page
            "height": 0.025,    # Height as fraction of page
            "page": 1           # Page number (1-indexed)
        },
        "description": "Bank Routing/ABA Number",
        "type": "text",
        "fill": True,           # Let LLM determine value (default)
        "value": None           # No fixed value
    },
    {
        "bbox": {"left": 0.432, "top": 0.144, "width": 0.2, "height": 0.025, "page": 1},
        "description": "Bank Name",
        "type": "text",
        "value": "Wells Fargo"  # Fixed value bypasses LLM
    },
    {
        "bbox": {"left": 0.227, "top": 0.54, "width": 0.02, "height": 0.02, "page": 1},
        "description": "Domestic wire checkbox",
        "type": "checkbox"
    }
]

Field Properties

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| bbox | object | Yes | Normalized coordinates (0-1 range from top-left) |
| description | string | Yes | Used by LLM to map instructions to this field |
| type | string | Yes | text, checkbox, dropdown, or barcode |
| fill | boolean | No | Whether to fill this field (default: true) |
| value | string | No | Fixed value that bypasses LLM |
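
If you edit schemas by hand, a local check before sending them can catch typos early. The following is a minimal sketch based only on the properties listed above; `validate_field` is a hypothetical helper, and treating all five `bbox` keys as expected is an assumption drawn from the examples, not a documented rule.

```python
ALLOWED_TYPES = {"text", "checkbox", "dropdown", "barcode"}
REQUIRED_KEYS = ("bbox", "description", "type")
BBOX_KEYS = ("left", "top", "width", "height", "page")


def validate_field(field: dict) -> list:
    """Return a list of problems found in one field definition (empty if none)."""
    problems = []
    for key in REQUIRED_KEYS:
        if key not in field:
            problems.append(f"missing required property: {key}")
    if "type" in field and field["type"] not in ALLOWED_TYPES:
        problems.append(f"unknown type: {field['type']!r}")
    bbox = field.get("bbox")
    if isinstance(bbox, dict):
        # Assumption: every bbox carries all five keys, as in the examples.
        for key in BBOX_KEYS:
            if key not in bbox:
                problems.append(f"bbox missing: {key}")
        # Coordinates and sizes are fractions of the page.
        for key in ("left", "top", "width", "height"):
            value = bbox.get(key)
            if isinstance(value, (int, float)) and not 0 <= value <= 1:
                problems.append(f"bbox.{key} outside 0-1: {value}")
        # Pages are 1-indexed.
        page = bbox.get("page")
        if isinstance(page, int) and page < 1:
            problems.append(f"bbox.page must be >= 1, got {page}")
    return problems
```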

Bounding Box

Coordinates are normalized (0-1), measured from the top-left corner:
(0,0) ─────────────────────── (1,0)
  │                              │
  │    ┌─────────┐               │
  │    │  Field  │ left: 0.1     │
  │    └─────────┘ top: 0.2      │
  │                              │
(0,1) ─────────────────────── (1,1)
Page numbers are 1-indexed. The first page is page: 1.
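
To sanity-check a bounding box against a PDF viewer or drawing tool, it can help to convert the normalized values back to absolute page units. This is a rough sketch: `bbox_to_points` is a hypothetical helper, and the page dimensions are something you supply yourself (they are not part of the schema).

```python
def bbox_to_points(bbox: dict, page_width: float, page_height: float) -> tuple:
    """Convert a normalized bbox to (x0, y0, x1, y1) in page units, origin top-left."""
    x0 = bbox["left"] * page_width
    y0 = bbox["top"] * page_height
    x1 = x0 + bbox["width"] * page_width
    y1 = y0 + bbox["height"] * page_height
    return (x0, y0, x1, y1)
```

For a US Letter page you would pass `page_width=612` and `page_height=792` (PDF points).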

Fill Control

The fill and value properties control how each field is handled:
| fill | value | Behavior |
| --- | --- | --- |
| true (default) | null | LLM determines value from instructions |
| true | "something" | Uses this exact value, ignores instructions |
| false | any | Field left empty |
Use value for fields that should always contain the same thing (form version, tax year). Use fill: false for fields that should stay blank (signature boxes).
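
One way to apply these rules to a saved schema is to post-process it before each run. The sketch below is an illustration under assumptions: `apply_fill_rules` is a hypothetical helper, and matching fields by their exact `description` string is a convenience chosen here, not something Edit requires.

```python
import copy


def apply_fill_rules(schema: list, fixed: dict, blank: set) -> list:
    """Return a copy of `schema` with fixed values pinned and blank fields disabled."""
    updated = copy.deepcopy(schema)  # leave the saved schema untouched
    for field in updated:
        description = field.get("description")
        if description in blank:
            # fill: false leaves the field empty regardless of value.
            field["fill"] = False
        elif description in fixed:
            # A fixed value is used as-is and bypasses the LLM.
            field["fill"] = True
            field["value"] = fixed[description]
    return updated
```

For example, `apply_fill_rules(saved_schema, {"Tax year": "2024"}, {"Signature"})` would pin the tax year and keep the signature box blank, assuming those descriptions exist in your schema.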

Widget Types

Text

Standard text input fields. The LLM extracts the relevant value from your instructions based on the field’s description.
{
    "type": "text",
    "description": "Social Security Number in XXX-XX-XXXX format",
    "bbox": {"left": 0.55, "top": 0.72, "width": 0.4, "height": 0.03, "page": 1}
}
Include format expectations in the description. “SSN in XXX-XX-XXXX format” produces better results than just “SSN” because the LLM knows how to format the output.

Checkbox

Boolean fields that get checked or unchecked. The LLM interprets your instructions to determine whether the box should be checked.
{
    "type": "checkbox",
    "description": "US Citizen - Yes",
    "bbox": {"left": 0.15, "top": 0.35, "width": 0.02, "height": 0.02, "page": 1}
}
Checkbox bounding boxes should be small and roughly square. Make descriptions explicit about what checking means: “US Citizen - Yes” is clearer than “Citizenship” when there are Yes/No checkbox pairs.

Dropdown

Selection fields with predefined options. The LLM suggests a value, and Edit selects the matching option from the PDF’s dropdown.
{
    "type": "dropdown",
    "description": "State of incorporation",
    "bbox": {"left": 0.3, "top": 0.4, "width": 0.2, "height": 0.03, "page": 1}
}
The value must exactly match an available option. If your instructions say “CA” but the dropdown only contains “California”, the field is skipped silently. Consider listing options in the description: “State (CA, NY, TX, …)” to help the LLM match correctly.
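
If you already know a dropdown's options, the description can be generated rather than typed by hand. The helper below is a small sketch of that idea; `describe_with_options` and its `max_shown` cutoff are hypothetical, and the option list is something you would supply yourself.

```python
def describe_with_options(label: str, options: list, max_shown: int = 10) -> str:
    """Build a field description that lists the dropdown's available options."""
    shown = ", ".join(options[:max_shown])
    # Signal truncation so the description does not imply the list is complete.
    suffix = ", ..." if len(options) > max_shown else ""
    return f"{label} ({shown}{suffix})"
```

The result can be assigned to the field's `description` before the schema is passed to Edit.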

Barcode

Special fields for barcode data. These are typically detected automatically in forms that have barcode regions and filled with encoded data.
{
    "type": "barcode",
    "description": "Document tracking code",
    "bbox": {"left": 0.7, "top": 0.9, "width": 0.25, "height": 0.05, "page": 1}
}

Troubleshooting

Coordinates are from the top-left (0,0), with Y increasing downward. If you measured from bottom-left, flip the Y values: correct_top = 1 - your_top - height

Start with a single field, verify it works, then add more incrementally.
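
The Y-flip formula above can be applied to a whole bounding box at once. This is a minimal sketch; `flip_bbox_to_top_left` is a hypothetical name, and it assumes the incoming `top` value is the distance from the bottom edge of the page to the bottom edge of the field, already normalized to 0-1.

```python
def flip_bbox_to_top_left(bbox: dict) -> dict:
    """Convert a bbox measured from the bottom-left to the top-left convention."""
    flipped = dict(bbox)  # shallow copy; the input bbox is not modified
    # correct_top = 1 - your_top - height
    flipped["top"] = 1 - bbox["top"] - bbox["height"]
    return flipped
```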
Edit matches schema fields to existing widgets using bounding box overlap. If overlap is less than 50%, a new widget is created.

Run Edit without a schema first to see actual widget positions, then adjust your coordinates to match.
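
To estimate whether a hand-written bounding box will match a detected widget, you can compute the overlap yourself. The sketch below is one plausible measure, not Edit's actual matching logic: this page does not say how overlap is calculated, so measuring the intersection as a fraction of the schema field's own area is an assumption, and `overlap_fraction` is a hypothetical helper.

```python
def overlap_fraction(field: dict, widget: dict) -> float:
    """Fraction of `field`'s area covered by `widget` (both normalized bboxes)."""
    if field["page"] != widget["page"]:
        return 0.0
    # Width and height of the intersection rectangle (negative if disjoint).
    x_overlap = min(field["left"] + field["width"], widget["left"] + widget["width"]) - max(
        field["left"], widget["left"]
    )
    y_overlap = min(field["top"] + field["height"], widget["top"] + widget["height"]) - max(
        field["top"], widget["top"]
    )
    if x_overlap <= 0 or y_overlap <= 0:
        return 0.0
    area = field["width"] * field["height"]
    return (x_overlap * y_overlap) / area if area > 0 else 0.0
```

A result well below 0.5 for the widget you intended to target suggests the coordinates need adjusting.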
  1. Ensure type is "checkbox", not "text"
  2. Bounding boxes should be small and square
  3. Be explicit in instructions: “Check the ‘Yes’ checkbox for US citizenship”
Check for:
  • fill: false on the field
  • Dropdown value not matching available options exactly
  • Instructions don’t mention data for this field
  • Unsupported widget types (signatures, images)