When you call Parse, Reducto returns a job_id that represents the parsed document. You can pass this job ID to subsequent Extract or Split calls using the jobid:// prefix, which skips re-parsing and uses the cached result. This saves both time and credits when you need to run multiple operations on the same document.

The jobid:// protocol

After parsing a document, the response includes a job_id:
from pathlib import Path
from reducto import Reducto

client = Reducto()

upload = client.upload(file=Path("document.pdf"))
parse_result = client.parse.run(input=upload)
print(parse_result.job_id)  # "7600c8c5-a52f-49d2-8a7d-d75d1b51e141"
To reuse this parsed content in Extract or Split, prefix the job ID with jobid://:
# Extract using the parsed document (no re-parsing)
extract_result = client.extract.run(
    input=f"jobid://{parse_result.job_id}",
    instructions={"schema": your_schema}
)
When Reducto sees jobid://, it retrieves the cached parse result instead of processing the document again. Any parsing options you include in the request are ignored since the document was already parsed.
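The prefix is plain string formatting, so a small helper can keep the parse-once convention in one place. The parse_once function below is an illustrative sketch, not part of the SDK:
from pathlib import Path

def parse_once(client, path: Path) -> str:
    # Parse a document and return a jobid:// string for later Extract or Split calls.
    upload = client.upload(file=path)
    parse_result = client.parse.run(input=upload)
    return f"jobid://{parse_result.job_id}"

document = parse_once(client, Path("document.pdf"))
extract_result = client.extract.run(input=document, instructions={"schema": your_schema})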

Common chaining patterns

Parse → Extract

The most common pattern. Parse once, then run one or more extractions with different schemas:
from pathlib import Path
from reducto import Reducto

client = Reducto()

# Step 1: Upload and parse the document
upload = client.upload(file=Path("financial-report.pdf"))
parse_result = client.parse.run(input=upload)
job_id = parse_result.job_id

# Step 2: Extract summary metrics
summary = client.extract.run(
    input=f"jobid://{job_id}",
    instructions={"schema": {
        "type": "object",
        "properties": {
            "total_revenue": {"type": "number"},
            "net_income": {"type": "number"}
        }
    }}
)

# Step 3: Extract detailed line items (same parsed document)
line_items = client.extract.run(
    input=f"jobid://{job_id}",
    instructions={"schema": {
        "type": "object",
        "properties": {
            "expenses": {"type": "array", "items": {"type": "object"}}
        }
    }},
    settings={"array_extract": True}
)
Without chaining, each Extract call would re-parse the document. With chaining, you parse once and pay for parsing credits once.
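If you routinely run several schemas against the same document, a small loop makes the parse-once pattern explicit. The schemas dictionary below is illustrative:
schemas = {
    "summary": {"type": "object", "properties": {"total_revenue": {"type": "number"}}},
    "line_items": {"type": "object", "properties": {"expenses": {"type": "array", "items": {"type": "object"}}}},
}

results = {}
for name, schema in schemas.items():
    # Each call reuses the cached parse result, so only extraction credits are charged.
    results[name] = client.extract.run(
        input=f"jobid://{job_id}",
        instructions={"schema": schema}
    )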

Parse → Split → Extract

For documents with distinct sections that need different extraction schemas:
from pathlib import Path

# Step 1: Upload and parse
upload = client.upload(file=Path("contract.pdf"))
parse_result = client.parse.run(input=upload)
job_id = parse_result.job_id

# Step 2: Split into sections
split_result = client.split.run(
    input=f"jobid://{job_id}",
    split_description=[
        {"name": "Terms", "description": "Terms and conditions section"},
        {"name": "Pricing", "description": "Pricing and payment terms"},
        {"name": "SLA", "description": "Service level agreement"}
    ]
)

# Step 3: Extract from specific sections
for section in split_result.result.splits:
    if section.pages:
        extract_result = client.extract.run(
            input=f"jobid://{job_id}",
            instructions={"schema": get_schema_for_section(section.name)},
            parsing={"settings": {"page_range": {
                "start": section.pages[0],
                "end": section.pages[-1]
            }}}
        )
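The get_schema_for_section call above is not an SDK function; it stands in for whatever mapping you keep from section names to extraction schemas. A minimal sketch with made-up field names:
def get_schema_for_section(name: str) -> dict:
    # Illustrative mapping from split section names to extraction schemas.
    schemas = {
        "Terms": {"type": "object", "properties": {"termination_notice_days": {"type": "number"}}},
        "Pricing": {"type": "object", "properties": {"monthly_fee": {"type": "number"}}},
        "SLA": {"type": "object", "properties": {"uptime_percentage": {"type": "number"}}},
    }
    return schemas.get(name, {"type": "object", "properties": {}})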

Parse → Classify → Extract

When you need to determine document type before choosing an extraction schema:
# Step 1: Parse
parse_result = client.parse.run(input=document_url)
job_id = parse_result.job_id

# Step 2: Classify document type
classification = client.extract.run(
    input=f"jobid://{job_id}",
    instructions={"schema": {
        "type": "object",
        "properties": {
            "document_type": {"type": "string", "enum": ["Invoice", "Receipt", "PO", "Other"]}
        }
    }}
)

# Extract result is a list; access the first item
doc_type = classification.result[0]["document_type"]

# Step 3: Extract with type-specific schema
if doc_type == "Invoice":
    schema = invoice_schema
elif doc_type == "Receipt":
    schema = receipt_schema
else:
    schema = generic_schema

result = client.extract.run(
    input=f"jobid://{job_id}",
    instructions={"schema": schema}
)
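The invoice_schema, receipt_schema, and generic_schema names are placeholders for your own definitions; minimal illustrative versions might look like:
invoice_schema = {
    "type": "object",
    "properties": {"invoice_number": {"type": "string"}, "total_amount": {"type": "number"}}
}
receipt_schema = {
    "type": "object",
    "properties": {"merchant": {"type": "string"}, "total_amount": {"type": "number"}}
}
generic_schema = {"type": "object", "properties": {"summary": {"type": "string"}}}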
This pattern is useful when processing mixed document types from a single upload queue.

Multiple job IDs

Extract also accepts a list of job IDs, which combines the parsed content from multiple documents into a single extraction context:
# Parse multiple documents (documents can be URLs or uploaded file IDs)
job_ids = []
for doc in documents:
    result = client.parse.run(input=doc)
    job_ids.append(result.job_id)

# Extract across all documents
combined_result = client.extract.run(
    input=[f"jobid://{jid}" for jid in job_ids],
    instructions={"schema": aggregation_schema}
)
This behaves like a multi-document pipeline: the extraction sees all documents together and returns a single result. If you need values from each individual document, design your schema to capture them explicitly, as in the sketch below.
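A schema that keeps per-document values separate might look like the following; the field names are illustrative:
aggregation_schema = {
    "type": "object",
    "properties": {
        "documents": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "title": {"type": "string"},
                    "total_amount": {"type": "number"}
                }
            }
        },
        "combined_total": {"type": "number"}
    }
}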

Supported endpoints

Endpoint | Accepts jobid:// | Notes
--- | --- | ---
Parse | Yes | Reprocesses with different settings
Extract | Yes | Single ID or list of IDs
Split | Yes | Single ID only
Edit | No | Requires actual document URL

Credit savings

When you use jobid://, you only pay parse credits once regardless of how many subsequent calls you make:
Without chaining | With chaining
--- | ---
Parse (4 credits) | Parse (4 credits)
Extract #1 (4 + 2 credits) | Extract #1 (2 credits)
Extract #2 (4 + 2 credits) | Extract #2 (2 credits)
Total: 16 credits | Total: 8 credits
The savings scale with document size and number of operations.
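Using the example costs above (4 credits per parse, 2 per extract), the comparison reduces to simple arithmetic; the numbers are illustrative and actual per-operation costs depend on document size and plan:
parse_cost, extract_cost, n_extracts = 4, 2, 2  # illustrative values from the table above

# Without chaining, each Extract re-parses the document internally.
without_chaining = parse_cost + n_extracts * (parse_cost + extract_cost)  # 16 credits
with_chaining = parse_cost + n_extracts * extract_cost                    # 8 credits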

Job ID retention

Parse job IDs are retained for 12 hours by default. If you need to chain calls after this window, you’ll have to re-parse the document. For workflows that span longer periods, consider storing the parsed content or using pipelines, which handle this automatically.
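If a workflow might outlive the retention window, one defensive pattern is to fall back to re-parsing when a chained call fails. This is a sketch; the exact exception raised for an expired or unknown job ID depends on the SDK, so the broad except clause is an assumption to replace with your own error handling:
def extract_with_fallback(client, job_id, document, schema):
    # Try the cached parse result first; re-parse the original document if the job ID has expired.
    try:
        return client.extract.run(input=f"jobid://{job_id}", instructions={"schema": schema})
    except Exception:  # assumption: substitute the SDK's specific error type here
        parse_result = client.parse.run(input=document)
        return client.extract.run(
            input=f"jobid://{parse_result.job_id}",
            instructions={"schema": schema}
        )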