The JavaScript SDK is inherently async (Promise-based). This guide covers concurrent document processing and async job management for long-running operations.

Concurrent Processing

Process multiple documents simultaneously with Promise.all:

Parse Multiple Documents

import Reducto from 'reductoai';
import fs from 'fs';

const client = new Reducto();
const files = ["doc1.pdf", "doc2.pdf", "doc3.pdf"];

// Upload all files concurrently
const uploads = await Promise.all(
  files.map(f => client.upload({ file: fs.createReadStream(f) }))
);

// Parse all documents concurrently
const results = await Promise.all(
  uploads.map(upload => client.parse.run({ input: upload.file_id }))
);

console.log(`Processed ${results.length} documents`);
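
Promise.all preserves input order, but it rejects as soon as any single request fails, discarding the other results; see Error Handling below for a fault-tolerant variant using Promise.allSettled.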

Extract from Multiple Documents

const schema = {
  type: "object",
  properties: {
    invoice_number: { type: "string" },
    total: { type: "number" }
  }
};

const uploads = await Promise.all(
  files.map(f => client.upload({ file: fs.createReadStream(f) }))
);

const results = await Promise.all(
  uploads.map(upload => 
    client.extract.run({
      input: upload.file_id,
      instructions: { schema }
    })
  )
);

Split Multiple Documents

const splitDesc = [
  { name: "Summary", description: "Executive summary" },
  { name: "Details", description: "Detailed content" }
];

const results = await Promise.all(
  uploads.map(upload =>
    client.split.run({
      input: upload.file_id,
      split_description: splitDesc
    })
  )
);

Fill Multiple Forms

const forms = [
  { file: "form1.pdf", instructions: "Fill name with 'Alice'" },
  { file: "form2.pdf", instructions: "Fill name with 'Bob'" },
  { file: "form3.pdf", instructions: "Fill name with 'Charlie'" }
];

async function fillForm({ file, instructions }) {
  const upload = await client.upload({ file: fs.createReadStream(file) });
  return client.edit.run({
    document_url: upload.file_id,
    edit_instructions: instructions
  });
}

const results = await Promise.all(forms.map(fillForm));

Rate Limiting

Control concurrency to avoid overwhelming the API:

async function processWithLimit(files, maxConcurrent = 5) {
  const results = [];
  
  for (let i = 0; i < files.length; i += maxConcurrent) {
    const batch = files.slice(i, i + maxConcurrent);
    const batchResults = await Promise.all(
      batch.map(async (file) => {
        const upload = await client.upload({ file: fs.createReadStream(file) });
        return client.parse.run({ input: upload.file_id });
      })
    );
    results.push(...batchResults);
    console.log(`Processed ${results.length}/${files.length}`);
  }
  
  return results;
}

// Process 100 files, 5 at a time
const results = await processWithLimit(fileList, 5);
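
One trade-off of the batch loop: each batch waits for its slowest file before the next batch starts. A small worker-pool variant keeps maxConcurrent requests in flight continuously. The processWithPool helper below is a sketch, not part of the SDK:

async function processWithPool(files, maxConcurrent = 5) {
  const results = new Array(files.length);
  let next = 0;

  // Each worker claims the next unprocessed index. The claim (next++)
  // is synchronous, so no two workers ever take the same file.
  async function worker() {
    while (next < files.length) {
      const i = next++;
      const upload = await client.upload({ file: fs.createReadStream(files[i]) });
      results[i] = await client.parse.run({ input: upload.file_id });
    }
  }

  await Promise.all(Array.from({ length: maxConcurrent }, () => worker()));
  return results;
}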

Async Jobs (runJob)

For documents that take longer to process, the runJob methods start a job and return immediately instead of holding a request open, avoiding timeouts:

Start Async Job

// Start job without waiting for completion
const job = await client.parse.runJob({ input: upload.file_id });
console.log(`Job started: ${job.job_id}`);

Poll for Completion

async function waitForJob(jobId) {
  while (true) {
    const job = await client.job.get(jobId);
    
    if (job.status === "Completed") {
      return job.result;
    } else if (job.status === "Failed") {
      throw new Error(`Job failed: ${job.reason}`);
    }
    
    console.log(`Status: ${job.status}, Progress: ${job.progress || 0}%`);
    await new Promise(r => setTimeout(r, 2000)); // Wait 2 seconds
  }
}

const job = await client.parse.runJob({ input: upload.file_id });
const result = await waitForJob(job.job_id);
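
The loop above polls forever if a job stalls. A defensive variant adds a deadline and mild backoff; the defaults here are illustrative assumptions, not SDK behavior:

async function waitForJobWithDeadline(jobId, { timeoutMs = 600_000, delayMs = 2000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  let delay = delayMs;

  while (Date.now() < deadline) {
    const job = await client.job.get(jobId);
    if (job.status === "Completed") return job.result;
    if (job.status === "Failed") throw new Error(`Job failed: ${job.reason}`);

    await new Promise(r => setTimeout(r, delay));
    delay = Math.min(delay * 1.5, 15_000); // back off, capped at 15s between polls
  }

  throw new Error(`Timed out waiting for job ${jobId}`);
}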

All Async Methods

Each endpoint has a runJob variant:

// Parse
const parseJob = await client.parse.runJob({ input: upload.file_id });

// Extract
const extractJob = await client.extract.runJob({
  input: upload.file_id,
  instructions: { schema }
});

// Split
const splitJob = await client.split.runJob({
  input: upload.file_id,
  split_description: splitDesc
});

// Edit
const editJob = await client.edit.runJob({
  document_url: upload.file_id,
  edit_instructions: "Fill the form"
});

// Pipeline
const pipelineJob = await client.pipeline.runJob({
  input: upload.file_id,
  pipeline_id: "your_pipeline_id"
});
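
runJob also combines naturally with the concurrent patterns above: start every job up front, then poll each one. A short sketch reusing the waitForJob helper from earlier:

// Start all jobs immediately; each call returns a job ID without waiting
const jobs = await Promise.all(
  uploads.map(upload => client.parse.runJob({ input: upload.file_id }))
);

// Poll all jobs concurrently until every one completes
const results = await Promise.all(jobs.map(job => waitForJob(job.job_id)));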

Error Handling

Use Promise.allSettled to handle failures gracefully:

async function batchWithErrorHandling(files) {
  const results = await Promise.allSettled(
    files.map(async (file) => {
      const upload = await client.upload({ file: fs.createReadStream(file) });
      return client.parse.run({ input: upload.file_id });
    })
  );
  
  results.forEach((result, i) => {
    if (result.status === "fulfilled") {
      console.log(`${files[i]}: ${result.value.usage.num_pages} pages`);
    } else {
      console.error(`${files[i]}: Failed - ${result.reason}`);
    }
  });
  
  return results;
}
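
Promise.allSettled reports failures but does not retry them. One possible follow-up, sketched here as a hypothetical batchWithRetry helper, re-runs the rejected files a single time and splices the retry outcomes back into their original positions:

async function batchWithRetry(files) {
  const results = await batchWithErrorHandling(files);

  // Collect the indices of files whose first attempt was rejected
  const failedIdx = results
    .map((r, i) => (r.status === "rejected" ? i : -1))
    .filter(i => i !== -1);
  if (failedIdx.length === 0) return results;

  console.log(`Retrying ${failedIdx.length} failed file(s)...`);
  const retries = await batchWithErrorHandling(failedIdx.map(i => files[i]));

  // Overwrite the failed entries with their retry outcomes
  retries.forEach((r, j) => { results[failedIdx[j]] = r; });
  return results;
}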

When to Use Async Jobs

Use runJob When

Processing large documents (50+ pages), running long extract operations, or integrating with webhooks.

Use run When

Processing small documents, needing results immediately, or writing simple scripts where waiting inline is acceptable.
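
When a batch mixes both kinds of documents, you can choose the method per file. The 10 MB size threshold below is purely illustrative, not an SDK recommendation:

async function parseAuto(file) {
  const upload = await client.upload({ file: fs.createReadStream(file) });

  // Small files: wait inline. Large files: hand off to an async job.
  if (fs.statSync(file).size < 10 * 1024 * 1024) {
    return client.parse.run({ input: upload.file_id });
  }
  const job = await client.parse.runJob({ input: upload.file_id });
  return waitForJob(job.job_id);
}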
