The Reducto CLI lets you parse documents, extract structured data, and modify documents directly from your terminal. It’s ideal for batch processing, scripting, and quick document operations.

Installation

Install the Reducto CLI using pip:
pip install reducto-cli

Authentication

Before using the CLI, authenticate by running:
reducto login
This command opens Reducto Studio in your browser, where you can securely authenticate your CLI session.

Quick Examples

# Parse a single file
reducto parse path/to/document.pdf

# Parse an entire folder
reducto parse ./docs

# Extract with a schema (path or inline JSON)
reducto extract ./docs/invoice.pdf -s schemas/invoice.json

# Edit a single file
reducto edit path/to/document.pdf --instructions "Your editing instructions here"
Parsed outputs are written as <filename>.parse.md. Extraction reuses an existing parse when one is available and saves <filename>.extract.json, which contains only the extraction payload.

Supported File Types

The CLI supports the same file types as the Reducto API:
Format | Extensions
PDF | .pdf
Images | .png, .jpg, .jpeg
Office documents | .doc, .docx, .ppt, .pptx
Spreadsheets | .xls, .xlsx
Commands accept either a file or a directory. Directories are scanned recursively, and only supported file types are processed.
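
The recursive scan described above can be approximated in Python to preview which files a command would pick up. This is a minimal sketch, not the CLI's actual implementation: the function name find_supported_files is hypothetical, and matching extensions case-insensitively is an assumption.

```python
from pathlib import Path

# Extensions taken from the supported file types listed above.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".png", ".jpg", ".jpeg",
    ".doc", ".docx", ".ppt", ".pptx",
    ".xls", ".xlsx",
}

def find_supported_files(root):
    """Return supported files under root (a file or a directory), sorted by path."""
    root = Path(root)
    # A single file is checked directly; a directory is walked recursively.
    candidates = [root] if root.is_file() else root.rglob("*")
    return sorted(
        p for p in candidates
        # Assumption: the extension check ignores case (.PDF counts as .pdf).
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

Calling find_supported_files("./docs") before a batch run gives a quick estimate of how many documents would be processed.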

Parse Command

The parse command converts documents into structured markdown output.

Flags

Flag | Description
--agentic | Enables all agentic options for tables, text, and figures. Increases accuracy but also increases latency. Use when document quality or complex layouts require enhanced processing.
--change-tracking | Enables change tracking during parsing. Returns <s> tags around strikethrough text, <u> tags around underlined text, and <change> tags around colored adjacent strikethrough and underlined text. Useful for documents with revision history.
--highlights | Include highlighted text in the parsed output.
--hyperlinks | Include embedded hyperlinks in the parsed output.
--comments | Include document comments in the parsed output.
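
Output produced with --change-tracking can be post-processed by searching for the <s>, <u>, and <change> tags named in the flag description. The sketch below is illustrative only: find_tracked_spans is a hypothetical helper, and it assumes the tags carry no attributes. The sample string is made up for the example and is not real CLI output.

```python
import re

# Tag names come from the --change-tracking description. Assumption: tags are
# written without attributes, e.g. <s>...</s>.
TAG_PATTERN = re.compile(r"<(s|u|change)>(.*?)</\1>", re.DOTALL)

def find_tracked_spans(markdown):
    """Return (tag, text) pairs for each outermost tracked span, in document order."""
    return [(m.group(1), m.group(2)) for m in TAG_PATTERN.finditer(markdown)]

sample = "Fee is <s>$100</s> <u>$120</u> per month."
print(find_tracked_spans(sample))
# prints: [('s', '$100'), ('u', '$120')]
```

Matches do not overlap, so a <change> span that wraps <s> and <u> tags is reported once, with the inner tags left in its text.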

Examples

# Basic parse
reducto parse document.pdf

# Parse with maximum accuracy (slower)
reducto parse document.pdf --agentic

# Parse a contract with change tracking
reducto parse contract.pdf --change-tracking

# Parse with all metadata
reducto parse document.pdf --hyperlinks --comments --highlights

# Combine flags as needed
reducto parse legal_doc.pdf --agentic --change-tracking --comments
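
When scripting batch jobs, it can help to assemble the parse command programmatically. The following sketch only builds the argument list and does not run anything. The helper build_parse_command and its keyword names are hypothetical; the flags themselves are the ones documented above.

```python
import shlex

# Documented parse flags, keyed by hypothetical Python option names.
PARSE_FLAGS = {
    "agentic": "--agentic",
    "change_tracking": "--change-tracking",
    "highlights": "--highlights",
    "hyperlinks": "--hyperlinks",
    "comments": "--comments",
}

def build_parse_command(path, **options):
    """Build the argv list for `reducto parse`; this does not invoke the CLI."""
    unknown = set(options) - set(PARSE_FLAGS)
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")
    argv = ["reducto", "parse", str(path)]
    # Follow the order of PARSE_FLAGS so the generated command is stable.
    argv += [flag for name, flag in PARSE_FLAGS.items() if options.get(name)]
    return argv

cmd = build_parse_command("legal_doc.pdf", agentic=True, change_tracking=True, comments=True)
print(shlex.join(cmd))
# prints: reducto parse legal_doc.pdf --agentic --change-tracking --comments
```

With the CLI installed and authenticated, the list could be passed to subprocess.run(cmd, check=True).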

Extract Command

The extract command pulls structured data from documents according to a JSON Schema you provide. It automates information extraction by mapping complex or unstructured documents into machine-readable JSON.

Common Use Cases

  • Extracting line items, totals, vendor/customer info from invoices and receipts
  • Pulling key fields, tables, or sections from contracts or legal documents
  • Capturing form field values from scanned forms or applications
  • Summarizing structured results from reports, statements, or medical records

Schema Guidelines

  • Schemas must be valid JSON Schema documents
  • The top-level schema must be an object ({"type": "object", ...}); schemas whose top level is a string or an array are not permitted
  • Provide explicit property definitions so the extractor can map fields deterministically
  • Schemas may be supplied as file paths or inline JSON strings
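
The guidelines above can be checked locally before submitting a job. This is a rough pre-flight sketch, not the validation the CLI performs: load_schema is a hypothetical helper, it does not run full JSON Schema validation, and telling inline JSON from a file path by the first character is an assumption.

```python
import json
from pathlib import Path

def load_schema(source):
    """Load a schema from inline JSON or a file path and check its top level."""
    stripped = source.lstrip()
    # Assumption for this sketch: text starting with "{" or "[" is inline JSON,
    # anything else is treated as a file path.
    if stripped[:1] in ("{", "["):
        text = stripped
    else:
        text = Path(source).read_text(encoding="utf-8")
    schema = json.loads(text)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(schema, dict) or schema.get("type") != "object":
        raise ValueError('top-level schema must be {"type": "object", ...}')
    if not schema.get("properties"):
        raise ValueError("schema should define explicit properties")
    return schema

schema = load_schema('{"type": "object", "properties": {"total": {"type": "number"}}}')
print(sorted(schema["properties"]))
# prints: ['total']
```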

Example Schema

{
  "type": "object",
  "properties": {
    "items": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "article_number": {"type": "string"},
          "description": {"type": "string"},
          "quantity": {"type": "number"},
          "unit_price": {"type": "number"},
          "total_price": {"type": "number"}
        },
        "required": [
          "article_number",
          "description",
          "quantity",
          "unit_price",
          "total_price"
        ]
      }
    }
  },
  "required": ["items"]
}
You can reuse parses across multiple extractions: the CLI automatically detects existing .parse.md files, rehydrates the recorded job ID, and uses jobid://<id> references to accelerate extraction jobs.

Edit Command

The edit command modifies documents using natural language instructions. It uploads the document, applies the specified edits, and downloads the resulting file.

Usage

reducto edit path/to/document.pdf --instructions "Your editing instructions here"
reducto edit path/to/document.pdf -i "Your editing instructions here"

Parameters

Parameter | Required | Description
path | Yes | Path to a file or directory. Directories are scanned recursively for supported file types.
--instructions, -i | Yes | Natural language instructions describing the edits to apply.

Output

Edited files are saved alongside the original with the naming pattern <filename>.edited.<extension>. For example:
  • invoice.pdf becomes invoice.edited.pdf
  • report.docx becomes report.edited.docx
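
A script that needs to locate the edited file can derive its path from the naming pattern above. The helper edited_path is hypothetical. For names containing several dots, the sketch assumes only the final extension is moved after ".edited", which the two examples above do not confirm.

```python
from pathlib import Path

def edited_path(original):
    """Return the expected output path, following <filename>.edited.<extension>."""
    original = Path(original)
    # Assumption: only the last suffix counts as the extension.
    return original.with_name(f"{original.stem}.edited{original.suffix}")

print(edited_path("invoice.pdf"))
# prints: invoice.edited.pdf
print(edited_path("report.docx"))
# prints: report.edited.docx
```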

Examples

reducto edit contract.pdf -i "Fill in the client name as 'Acme Corporation' and set the contract date to January 15, 2024"

reducto edit document.pdf -i "Fill out the form with: Name: John Doe, Email: [email protected], Select 'Yes' for newsletter subscription"

Tips for Effective Instructions

For best results with the --instructions flag:
  • Be specific about what content to modify and how
  • Reference specific elements (headers, footers, tables, specific text)
  • Describe the desired outcome clearly
  • For bulk operations on directories, ensure instructions apply uniformly to all file types
