A pipeline bundles multiple Reducto steps into a single workflow. You design it in Studio, deploy it to get a pipeline_id, then call it from your code with one API request. The pipeline handles the orchestration of Parse, Extract, Split, and Edit steps behind the scenes.
Deploy pipeline in Studio

Why pipelines?

Without pipelines, building a multi-step document workflow means writing separate API calls for each step. You call Parse, wait for the result, feed that into Extract, handle errors at each stage, and manage all the configuration in your application code. This works, but it couples your code tightly to Reducto’s API structure, and every configuration change requires a code deployment.

Pipelines solve this by moving the workflow definition out of your code and into Studio. You configure the steps visually, test with real documents, and deploy. Your code then reduces to a single call:
result = client.pipeline.run(
    input=upload.file_id,
    pipeline_id="k9798h9mwt0wmq5qz5e45qxbfx7yj4bq"
)
When you need to adjust extraction logic or add a processing step, you update the pipeline in Studio and redeploy. Your application code stays unchanged.
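Since the pipeline_id is the only coupling point left, it can live in configuration rather than in code. A minimal sketch; the `REDUCTO_PIPELINE_ID` environment variable name is our own convention, not something the SDK defines:

```python
import os

# The env-var name is an assumption of this sketch; the SDK does not read it.
PIPELINE_ID = os.environ.get("REDUCTO_PIPELINE_ID", "your_pipeline_id")

def run_pipeline(client, file_id):
    """Run the deployed pipeline against an already-uploaded file.

    `client` is expected to be a reducto.Reducto instance (duck-typed here
    so the sketch carries no hard dependency on the SDK).
    """
    return client.pipeline.run(input=file_id, pipeline_id=PIPELINE_ID)
```

With this shape, redeploying a new pipeline version or pointing at a different pipeline is a configuration change, not a code change.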

Creating a pipeline in Studio

Build your pipeline in Studio by adding steps and configuring each one. The Studio guides cover each step type in detail:
  • Parse for document conversion
  • Extract for structured data extraction
  • Split for document sectioning
  • Edit for form filling
Once you’re satisfied with the results, click Deploy in the top right. Select Pipeline as the deployment type and optionally provide a version name to track your changes.

Deploy dialog showing Pipeline option with version naming

Studio generates a pipeline_id that you can copy directly into your code. This ID points to your exact configuration, so API calls always match what you tested in Studio.
Changes made in Studio don’t affect production until you deploy. This lets you iterate and test without impacting live systems.

Updating a pipeline

When you need to modify a deployed pipeline, make your changes in Studio and test with sample documents. Then click Deploy, select Pipeline, update the version name, and click Redeploy. The update takes effect immediately, and all API calls using that pipeline_id will use the new configuration. The Config History tab shows all previous versions of your pipeline, letting you track what changed and when:

Config History showing pipeline versions

The Logs tab shows execution logs for monitoring how your pipeline performs in production:

Execution logs for a deployed pipeline

For more details on pipeline management, see Deploy to Production.

Pipeline types

Studio determines the pipeline type based on which steps you add:
| Type | Steps | Use case |
| --- | --- | --- |
| Parse | Parse only | Convert documents to markdown, chunk for RAG |
| Parse → Extract | Parse + Extract | Pull specific fields as JSON |
| Parse → Split → Extract | Parse + Split + Extract(s) | Different schemas per document section |
| Edit | Edit only | Fill forms, modify documents |

Basic usage

from pathlib import Path
from reducto import Reducto

client = Reducto()

# Upload and run pipeline in one flow
upload = client.upload(file=Path("document.pdf"))
result = client.pipeline.run(
    input=upload.file_id,
    pipeline_id="your_pipeline_id"
)

# Access results based on pipeline type
if result.result.extract:
    # For Parse→Split→Extract pipelines, extract is a list
    if isinstance(result.result.extract, list):
        for section in result.result.extract:
            print(f"{section.split_name}: {section.result}")
    else:
        # For Parse→Extract pipelines, extract is an object
        print(result.result.extract.result)
elif result.result.parse:
    for chunk in result.result.parse.result.chunks:
        print(chunk.content)
The Go SDK does not yet support the Pipeline endpoint. From Go, call the REST API directly instead.
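For reference, the raw HTTP request can be sketched as below. The endpoint path, base URL, and auth header follow common Reducto REST conventions but are assumptions of this sketch; check them against the API reference before relying on them. The function only assembles the request pieces, so the same shape can be reproduced with any HTTP client (for example Go's `net/http`):

```python
import json

# Assumed base URL and endpoint path -- verify against the API reference.
API_BASE = "https://platform.reducto.ai"

def build_pipeline_request(api_key, file_id, pipeline_id):
    """Assemble the URL, headers, and JSON body for a pipeline run request.

    No network call is made; this just shows the request shape.
    """
    url = f"{API_BASE}/pipeline"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"input": file_id, "pipeline_id": pipeline_id})
    return url, headers, body
```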

Response structure

Every pipeline returns a PipelineResponse with the same shape. Which fields are populated depends on the pipeline type you configured in Studio.
{
  "job_id": "pipeline-abc123",
  "usage": {"num_pages": 3, "credits": 6.0},
  "result": {
    "parse": {...},
    "extract": {...},
    "split": {...},
    "edit": {...}
  }
}
The parse field is present for Parse, Parse→Extract, and Parse→Split→Extract pipelines. The extract field appears as an object for Parse→Extract pipelines, or as an array for Parse→Split→Extract pipelines where each entry corresponds to a section. The split field only appears when Split is configured. The edit field only appears for Edit pipelines.
{
  "job_id": "abc123",
  "usage": {"num_pages": 3, "credits": 4.0},
  "result": {
    "parse": {
      "job_id": "parse-456",
      "result": {
        "chunks": [
          {"content": "# Title\n\nParagraph text...", "blocks": [...]}
        ]
      },
      "usage": {"num_pages": 3, "credits": 4.0}
    },
    "extract": null,
    "split": null
  }
}
{
  "job_id": "abc123",
  "usage": {"num_pages": 3, "credits": 6.0},
  "result": {
    "parse": {"job_id": "parse-456", "result": {...}, "usage": {...}},
    "extract": {
      "job_id": "extract-789",
      "result": {
        "invoiceNumber": {"value": "INV-001", "citations": [...]},
        "totalAmount": {"value": "$1,500.00", "citations": [...]}
      },
      "usage": {"credits": 2.0}
    },
    "split": null
  }
}
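Because every pipeline type shares this shape, result handling can be routed on whichever fields are populated. A sketch over plain-dict responses (the SDK returns typed objects, but the field layout is the same); the `pipeline_kind` helper and its return labels are our own, not part of the SDK:

```python
def pipeline_kind(result: dict) -> str:
    """Classify a PipelineResponse `result` body by its populated fields."""
    if result.get("edit"):
        return "edit"
    extract = result.get("extract")
    if isinstance(extract, list):
        # Parse -> Split -> Extract: one entry per split section
        return "parse_split_extract"
    if extract:
        # Parse -> Extract: a single extract object
        return "parse_extract"
    if result.get("parse"):
        return "parse"
    raise ValueError("no recognized pipeline fields populated")
```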
When Split is involved, extract becomes an array with one entry per section:
{
  "job_id": "abc123",
  "usage": {"num_pages": 10, "credits": 12.0},
  "result": {
    "parse": {...},
    "split": {
      "result": {
        "splits": [
          {"name": "Summary", "pages": [1, 2]},
          {"name": "Details", "pages": [3, 4, 5]}
        ]
      }
    },
    "extract": [
      {
        "split_name": "Summary",
        "page_range": [1, 2],
        "result": {"totalValue": {"value": "$274,222", "citations": [...]}}
      },
      {
        "split_name": "Details",
        "page_range": [3, 4, 5],
        "result": {"holdings": [...]}
      }
    ]
  }
}
Edit pipelines return a URL to the modified document. Note that for edit pipelines, the input parameter contains the edit instructions rather than a document URL. The document to edit is configured in Studio as part of the pipeline.
{
  "job_id": "abc123",
  "usage": {"num_pages": 0, "credits": null},
  "result": {
    "parse": null,
    "extract": null,
    "split": null,
    "edit": {
      "job_id": "edit-999",
      "result": {"file_url": "https://storage.reducto.ai/edited-doc.pdf"}
    }
  }
}
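A small helper can pull the download URL out of that response. A sketch assuming the dict shape shown above; `edited_file_url` is our own name, not an SDK function:

```python
def edited_file_url(response: dict) -> str:
    """Return the file_url from an Edit pipeline response dict."""
    edit = response["result"].get("edit")
    if not edit:
        raise ValueError("not an Edit pipeline response")
    return edit["result"]["file_url"]
```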

Next steps