The split.run() method divides documents into sections based on descriptions you provide. You define what sections to look for, and Split identifies which pages belong to each section.

Basic Usage

import Reducto from 'reductoai';
import fs from 'fs';

const client = new Reducto();

// Upload the document to get a file_id
const upload = await client.upload({
  file: fs.createReadStream("document.pdf")
});

// Split the document - split_description is required
const result = await client.split.run({
  input: upload.file_id,
  split_description: [
    { name: "Summary", description: "Executive summary or overview section" },
    { name: "Financial Data", description: "Tables with financial figures" },
    { name: "Notes", description: "Footnotes or additional notes" }
  ]
});

// Access splits
for (const split of result.result.splits) {
  console.log(`Section: ${split.name}`);
  console.log(`Pages: ${split.pages}`);  // Array of page numbers
  console.log(`Confidence: ${split.conf}`);  // "high" or "low"
}
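
Interpolating split.pages into a string prints every page number separated by commas, which gets long for large sections. As a small illustration, the helper below collapses page numbers into compact ranges. formatPageRanges is an illustrative name, not part of the SDK, and the sketch assumes only that pages is an array of numbers, as in the loop above.

```typescript
// Illustrative helper (not part of the SDK): collapse page numbers into
// compact ranges, e.g. [1, 2, 3, 5] -> "1-3, 5".
function formatPageRanges(pages: number[]): string {
  // Work on a sorted copy so the caller's array is left untouched.
  const sorted = pages.slice().sort((a, b) => a - b);
  const parts: string[] = [];
  let i = 0;
  while (i < sorted.length) {
    const start = sorted[i];
    let end = start;
    // Extend the range while the next page is consecutive or a duplicate.
    while (i + 1 < sorted.length && sorted[i + 1] <= end + 1) {
      end = sorted[i + 1];
      i++;
    }
    parts.push(start === end ? `${start}` : `${start}-${end}`);
    i++;
  }
  return parts.join(", ");
}

console.log(formatPageRanges([1, 2, 3, 5])); // 1-3, 5
```

Inside the loop above, this could replace the raw interpolation, for example `Pages: ${formatPageRanges(split.pages)}`.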

Method Signatures

Synchronous Split

split.run(params: {
  input: string | Upload;
  split_description: Array<SplitCategory>;
  parsing?: ParseOptions;
  settings?: {
    table_cutoff?: "truncate" | "preserve";
  };
  split_rules?: string;
}, options?: RequestOptions): Promise<SplitResponse>

Asynchronous Split

split.runJob(params: {
  input: string | Upload;
  split_description: Array<SplitCategory>;
  async?: ConfigV3AsyncConfig;
  parsing?: ParseOptions;
  settings?: {
    table_cutoff?: "truncate" | "preserve";
  };
  split_rules?: string;
}, options?: RequestOptions): Promise<SplitRunJobResponse>

Split Description

The split_description parameter is required. Each entry defines a section to find:
const splitDescription = [
  {
    name: "Cover Page",
    description: "Title page with company logo and report title"
  },
  {
    name: "Table of Contents",
    description: "Page listing all sections with page numbers"
  },
  {
    name: "Financial Statements",
    description: "Balance sheet, income statement, and cash flow tables"
  }
];

const result = await client.split.run({
  input: upload.file_id,
  split_description: splitDescription
});
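
Because split_description is required and each entry carries a name and a description, a local pre-flight check can catch obvious mistakes before a request is sent. This is a sketch only: SplitCategoryInput and validateSplitDescription are hypothetical names, and the rules it enforces (at least one entry, non-empty fields, unique names) are assumptions about sensible input, not documented API constraints.

```typescript
// Hypothetical local type mirroring the entries shown above; not an SDK type.
interface SplitCategoryInput {
  name: string;
  description: string;
  partition_key?: string;
}

// Returns human-readable problems; an empty array means no issues were found.
// ASSUMPTION: these checks are illustrative, not rules enforced by the API.
function validateSplitDescription(entries: SplitCategoryInput[]): string[] {
  const problems: string[] = [];
  if (entries.length === 0) {
    problems.push("split_description must contain at least one entry");
  }
  const seenNames: string[] = [];
  entries.forEach((entry, i) => {
    const name = entry.name.trim();
    if (name === "") {
      problems.push(`entry ${i}: name is empty`);
    }
    if (entry.description.trim() === "") {
      problems.push(`entry ${i}: description is empty`);
    }
    if (name !== "" && seenNames.indexOf(name) !== -1) {
      problems.push(`entry ${i}: duplicate name "${name}"`);
    }
    seenNames.push(name);
  });
  return problems;
}

const descriptionProblems = validateSplitDescription([
  { name: "Cover Page", description: "Title page with company logo and report title" },
  { name: "Cover Page", description: "" },
]);
console.log(descriptionProblems.length); // 2
```

Returning a list of problems rather than throwing on the first one lets the caller report every issue at once.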

With Partition Key

Use partition_key when a section type occurs more than once in a document and you want its pages grouped by an identifier, such as an invoice number:
const result = await client.split.run({
  input: upload.file_id,
  split_description: [
    {
      name: "Invoice",
      description: "Individual invoice with line items and total",
      partition_key: "invoice_number"  // Group pages by invoice number
    }
  ]
});
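
To show how partitioned results might be consumed, the sketch below flattens splits into one row per partition. The shape of a partition is an assumption: this page only says that partitions is an array or null, so the name and pages fields on PartitionSketch are hypothetical and should be checked against a real response. flattenPartitions and the sample invoice numbers are likewise made up for illustration.

```typescript
// ASSUMPTION: each partition has a name (the partition_key value) and pages.
// The documented shape is only "array | null"; verify against a real response.
interface PartitionSketch {
  name: string;
  pages: number[];
}

interface PartitionedSplit {
  name: string;
  pages: number[];
  partitions: PartitionSketch[] | null;
}

interface FlatRow {
  section: string;
  partition: string | null; // null when the split has no partitions
  pages: number[];
}

// Emit one row per partition, or one row per split when there are none.
function flattenPartitions(splits: PartitionedSplit[]): FlatRow[] {
  const rows: FlatRow[] = [];
  for (const split of splits) {
    if (split.partitions && split.partitions.length > 0) {
      for (const part of split.partitions) {
        rows.push({ section: split.name, partition: part.name, pages: part.pages });
      }
    } else {
      rows.push({ section: split.name, partition: null, pages: split.pages });
    }
  }
  return rows;
}

// Made-up sample data standing in for result.result.splits.
const invoiceSplits: PartitionedSplit[] = [
  {
    name: "Invoice",
    pages: [1, 2, 3],
    partitions: [
      { name: "INV-001", pages: [1, 2] },
      { name: "INV-002", pages: [3] },
    ],
  },
];
const invoiceRows = flattenPartitions(invoiceSplits);
console.log(invoiceRows.length); // 2
```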

Configuration Examples

Split Rules

The split_rules parameter is a natural-language prompt that controls how pages are classified, for example whether a page may belong to more than one section:
const result = await client.split.run({
  input: upload.file_id,
  split_description: [
    { name: "Summary", description: "Executive summary" },
    { name: "Details", description: "Detailed content" }
  ],
  split_rules: "Pages can belong to multiple sections if they contain content from both."
});
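
When a rule like the one above lets a page belong to several sections, it can be useful to list the pages that were actually assigned more than once. The following sketch does that on plain data. SectionPages and findSharedPages are illustrative names, and the code assumes only the name and pages fields of each split.

```typescript
// Illustrative shape using only the name and pages fields of a split.
interface SectionPages {
  name: string;
  pages: number[];
}

// Return a lookup of page number -> section names, limited to pages that
// appear in more than one section.
function findSharedPages(splits: SectionPages[]): Record<number, string[]> {
  const byPage: Record<number, string[]> = {};
  for (const split of splits) {
    for (const page of split.pages) {
      const names = byPage[page] ?? [];
      // Guard against the same section listing a page twice.
      if (names.indexOf(split.name) === -1) names.push(split.name);
      byPage[page] = names;
    }
  }
  const shared: Record<number, string[]> = {};
  for (const key of Object.keys(byPage)) {
    const page = Number(key);
    if (byPage[page].length > 1) shared[page] = byPage[page];
  }
  return shared;
}

// Made-up sample data standing in for result.result.splits.
const overlappingSplits: SectionPages[] = [
  { name: "Summary", pages: [1, 2] },
  { name: "Details", pages: [2, 3, 4] },
];
const sharedPages = findSharedPages(overlappingSplits);
console.log(JSON.stringify(sharedPages)); // {"2":["Summary","Details"]}
```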

Table Settings

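The settings.table_cutoff option accepts "truncate" (the default) or "preserve", which keeps all table rows:
const result = await client.split.run({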
  input: upload.file_id,
  split_description: [
    { name: "Tables", description: "Data tables" }
  ],
  settings: {
    table_cutoff: "preserve"  // Keep all table rows (default: "truncate")
  }
});

Response Structure

const result = await client.split.run({ ... });

// Access splits
for (const split of result.result.splits) {
  console.log(split.name);       // string: Section name you defined
  console.log(split.pages);      // number[]: Page numbers (1-indexed)
  console.log(split.conf);       // string: "high" or "low"
  console.log(split.partitions); // array | null: Sub-sections when partition_key is used
}
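
Since conf is either "high" or "low", one common follow-up is to separate splits that can be used directly from those worth a manual check. The sketch below shows one way to do that. SplitSummary and groupByConfidence are illustrative names rather than SDK exports, and treating anything other than "high" as needing review is a design assumption.

```typescript
// Illustrative shape based on the fields listed above; not an SDK type.
interface SplitSummary {
  name: string;
  pages: number[];
  conf: string; // "high" or "low"
}

interface ConfidenceGroups {
  confident: SplitSummary[];
  needsReview: SplitSummary[];
}

// ASSUMPTION: any value other than "high" is routed to manual review.
function groupByConfidence(splits: SplitSummary[]): ConfidenceGroups {
  const groups: ConfidenceGroups = { confident: [], needsReview: [] };
  for (const split of splits) {
    if (split.conf === "high") {
      groups.confident.push(split);
    } else {
      groups.needsReview.push(split);
    }
  }
  return groups;
}

// Made-up sample data standing in for result.result.splits.
const reviewedSplits: SplitSummary[] = [
  { name: "Summary", pages: [1], conf: "high" },
  { name: "Financial Data", pages: [2, 3], conf: "low" },
  { name: "Notes", pages: [4], conf: "high" },
];
const confidenceGroups = groupByConfidence(reviewedSplits);
console.log(confidenceGroups.needsReview.map((s) => s.name)); // [ 'Financial Data' ]
```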
