The SDK provides run() and run_job() methods that map to different API endpoints:
| SDK Method | API Endpoint | Returns |
|---|---|---|
| client.parse.run() | POST /parse | Full result (blocks until complete) |
| client.parse.run_job() | POST /parse_async | Job ID (returns immediately) |
The same pattern applies to the other endpoints: /extract vs /extract_async, /split vs /split_async, and /pipeline vs /pipeline_async.
The Go SDK is currently in alpha and has limited async support. Go users should use the REST API directly for async operations. See the cURL examples below.
run() vs run_job()
| Method | Behavior | Best for |
|---|---|---|
| run() | Calls sync endpoint, blocks until complete | Interactive applications, smaller documents |
| run_job() | Calls async endpoint, returns job ID | Large documents, high volume, background processing |
Synchronous: run()
The run() method handles the job lifecycle internally. If the document takes too long to process, the request may time out. For documents over 50 pages or for complex processing, consider using run_job() instead.
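One way to guard against that is to race the blocking call against a timeout and fall back to an async job. A stdlib-only sketch, where parse_sync and submit_async are stand-ins for the real client.parse.run() / client.parse.run_job() calls:

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError

def parse_sync(doc):
    # Stand-in for client.parse.run(); blocks until parsing completes.
    return {"doc": doc, "status": "Completed"}

def submit_async(doc):
    # Stand-in for client.parse.run_job(); returns a job ID immediately.
    return {"job_id": "job_123"}

def parse_with_timeout(doc, timeout_s=30.0):
    """Try the blocking call; fall back to submitting an async job if it runs long."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(parse_sync, doc)
        try:
            return ("result", future.result(timeout=timeout_s))
        except TimeoutError:
            return ("job", submit_async(doc))

kind, payload = parse_with_timeout("report.pdf")
```

This is a client-side convenience pattern, not something the SDK does for you.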
Asynchronous: run_job()
The run_job() method has no limit on concurrent submissions. You can queue thousands of documents and process them in parallel without managing connections or timeouts.
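That fan-out pattern can be sketched with the standard library alone; submit_job below stands in for the real client.parse.run_job() call:

```python
import itertools

_counter = itertools.count(1)

def submit_job(doc):
    # Stand-in for client.parse.run_job(); returns immediately with a job ID.
    return f"job_{next(_counter)}"

documents = [f"doc_{i}.pdf" for i in range(1000)]

# Submit everything up front; no connection stays open while jobs run.
job_ids = {submit_job(doc): doc for doc in documents}
```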
Job lifecycle
When you submit a job via the async endpoint, it moves through these states:

| Status | Meaning |
|---|---|
| Pending | Job is queued, waiting for a worker |
| InProgress | A worker is actively processing the document |
| Completing | Processing finished, results being saved |
| Completed | Results are ready to retrieve |
| Failed | Processing failed (check error message) |
A job spends most of its time in Pending (waiting for capacity) or InProgress (actual processing).
Polling for results
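A minimal polling loop might look like the following; get_job_status stands in for the real status call (GET /job/{job_id}), and the backoff values are illustrative:

```python
import time
from itertools import count

# Stand-in for GET /job/{job_id}: pretend the job needs three checks to finish.
_calls = count(1)

def get_job_status(job_id):
    n = next(_calls)
    return {"status": {1: "Pending", 2: "InProgress"}.get(n, "Completed")}

def wait_for_job(job_id, interval_s=0.01, max_interval_s=30.0):
    """Poll until the job reaches a terminal state, backing off between checks."""
    while True:
        job = get_job_status(job_id)
        if job["status"] in ("Completed", "Failed"):
            return job
        time.sleep(interval_s)
        interval_s = min(interval_s * 2, max_interval_s)  # exponential backoff

result = wait_for_job("job_123")
```

Backing off between checks keeps you well under any rate limits while still picking up results promptly.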
The simplest way to get results from an async job is to poll the job status.
Priority processing
By default, synchronous (run()) jobs are prioritized over asynchronous (run_job()) jobs. This ensures interactive requests get fast responses while background jobs process when capacity is available.
You can request priority processing for async jobs if your account has priority budget available.
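To make the ordering rule concrete, here is a conceptual illustration (this is not the service's scheduler, and the request parameter for priority is not shown here):

```python
import heapq

# Lower number = served first: sync, then priority async, then regular async.
PRIORITY = {"sync": 0, "async_priority": 1, "async": 2}

queue = []
for seq, (kind, name) in enumerate([
    ("async", "batch_doc_1"),
    ("sync", "interactive_doc"),
    ("async_priority", "urgent_batch_doc"),
]):
    heapq.heappush(queue, (PRIORITY[kind], seq, name))

# Jobs come off the queue in priority order regardless of submission order.
order = [heapq.heappop(queue)[2] for _ in range(len(queue))]
```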
Async endpoints
Every Reducto endpoint has a corresponding async variant:

| Sync Endpoint | Async Endpoint | SDK Method |
|---|---|---|
| POST /parse | POST /parse_async | client.parse.run_job() |
| POST /extract | POST /extract_async | client.extract.run_job() |
| POST /split | POST /split_async | client.split.run_job() |
| POST /pipeline | POST /pipeline_async | client.pipeline.run_job() |
Using metadata
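A stdlib-only sketch of tagging jobs at submission time and routing completed results by their metadata; submit_job and the shape of the metadata argument are stand-ins, not the confirmed SDK API:

```python
jobs = {}

def submit_job(doc, metadata):
    # Stand-in for client.parse.run_job(..., metadata=...) — kwarg name assumed.
    job_id = f"job_{len(jobs) + 1}"
    jobs[job_id] = {"doc": doc, "metadata": metadata}
    return job_id

submit_job("invoice_071.pdf", {"source": "invoices", "customer_id": "cust_42"})
submit_job("contract_9.pdf", {"source": "contracts", "customer_id": "cust_42"})

def route(job_id):
    """Pick a downstream handler based on the metadata stored with the job."""
    return {"invoices": "billing_queue", "contracts": "legal_queue"}[
        jobs[job_id]["metadata"]["source"]
    ]

destinations = [route(job_id) for job_id in jobs]
```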
Include metadata with your job submission to help identify and route results.
When to use async
Use run() when:
- Processing single documents interactively
- Document size is small (under 20 pages)
- You need results immediately in the same request
- Testing and development
Use run_job() / async endpoints when:
- Processing many documents in parallel
- Documents are large or complex
- You want fire-and-forget with webhook notification
- Building batch processing pipelines
- Processing in background workers
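The checklists above can be condensed into a small helper; the 20-page threshold comes from this page, everything else is illustrative:

```python
def use_async(pages, batch_size=1, interactive=False):
    """True when run_job() / the async endpoints are the better fit."""
    # Large, batched, or non-interactive work goes async;
    # small interactive single documents stay on run().
    return pages > 20 or batch_size > 1 or not interactive

small_doc = use_async(pages=5, batch_size=1, interactive=True)  # run() territory
big_batch = use_async(pages=10, batch_size=500)                 # async territory
```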
Job Retention
Default behavior: job results are retained for 12 hours. After this window, you'll need to reprocess the document. For longer retention, enable persist_results to keep results indefinitely:
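As a sketch, the flag would ride along in the request body; the field name comes from this page, but its exact placement and the document_url field are assumptions to confirm against the API reference:

```json
{
  "document_url": "https://example.com/report.pdf",
  "persist_results": true
}
```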
persist_results requires opting in to Reducto Studio. Contact support to enable this feature for your organization.
API Reference
See the full API documentation for async endpoints:
- Parse Async - POST /parse_async
- Extract Async - POST /extract_async
- Split Async - POST /split_async
- Pipeline Async - POST /pipeline_async
- Get Job - GET /job/{job_id}
Related
Svix Webhooks
Get notified when jobs complete instead of polling.
Batch Processing
Process many documents in parallel with run_job().
Chaining Endpoints
Reuse parsed documents across multiple calls.
Pipeline Basics
Bundle workflows into a single API call.