For files larger than 100MB, use the presigned URL method. It uploads your file directly to cloud storage, bypassing the 100MB limit of the standard Upload endpoint.
Method                       Max Size   When to Use
Direct upload                100MB      Most files
Presigned URL (this page)    5GB        Large PDFs, high-res scans, large spreadsheets
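
The 100MB cutoff gives a simple rule for choosing a method programmatically; a minimal sketch (the threshold comes from the table above, and the helper name is ours, not part of the SDK):

import os

def needs_presigned_upload(path: str) -> bool:
    # Files over the standard endpoint's 100MB limit go through a presigned URL.
    return os.path.getsize(path) > 100 * 1024 * 1024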

How It Works

  1. Request a presigned URL from Reducto (no file attached)
  2. Upload your file directly to cloud storage using the presigned URL
  3. Use the file_id with Parse, Split, or Extract endpoints

Step 1: Request a Presigned URL

Call the upload endpoint without attaching a file:
import os
import requests

# POST with no file attached: Reducto returns a file_id and a presigned upload URL.
response = requests.post(
    "https://platform.reducto.ai/upload",
    headers={"Authorization": f"Bearer {os.environ.get('REDUCTO_API_KEY')}"}
)
response.raise_for_status()

data = response.json()
file_id = data["file_id"]              # pass this to Parse, Split, or Extract later
presigned_url = data["presigned_url"]  # use this only for the upload in Step 2

print(f"File ID: {file_id}")
print(f"Presigned URL: {presigned_url[:80]}...")
Response:
{
  "file_id": "reducto://50c07046-3bac-4844-8c4b-d1428ed9c8f4",
  "presigned_url": "https://prod-storage.s3.amazonaws.com/50c07046-3bac-4844-8c4b-d1428ed9c8f4?X-Amz-Algorithm=AWS4-HMAC-SHA256&..."
}
Save the file_id now. You'll need it in Step 3. The presigned URL is only for uploading; you can't use it to process the document.
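
Since the two values are easy to mix up, a cheap sanity check can catch the mistake early; a one-line sketch:

# File IDs start with "reducto://"; the presigned URL starts with "https://".
assert file_id.startswith("reducto://"), "pass the file_id, not the presigned URL"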

Step 2: Upload to Presigned URL

Upload your file using a PUT request to the presigned URL:
import requests

# PUT the raw bytes to the presigned URL; passing the open file streams it from disk.
with open("large_document.pdf", "rb") as f:
    response = requests.put(presigned_url, data=f)

if response.status_code == 200:
    print("Upload successful!")
No Content-Type header needed. When uploading to presigned URLs, you don't need to set a Content-Type header; the file will be accepted as-is.

Step 3: Process with Parse, Split, or Extract

Use the file_id from Step 1 (not the presigned URL) with any Reducto endpoint:
from reducto import Reducto

client = Reducto()

# Use the file_id from Step 1
result = client.parse.run(input=file_id)

print(f"Processed {result.usage.num_pages} pages")

Complete Example

Here’s the full workflow in one script:
import os
import requests
from reducto import Reducto

# Step 1: Get presigned URL
response = requests.post(
    "https://platform.reducto.ai/upload",
    headers={"Authorization": f"Bearer {os.environ.get('REDUCTO_API_KEY')}"}
)
data = response.json()
file_id = data["file_id"]
presigned_url = data["presigned_url"]

# Step 2: Upload to presigned URL
with open("large_document.pdf", "rb") as f:
    upload = requests.put(presigned_url, data=f)
upload.raise_for_status()  # fail fast if the upload was rejected

# Step 3: Process with Reducto
client = Reducto()
result = client.parse.run(input=file_id)

print(f"Successfully processed {result.usage.num_pages} pages")
for chunk in result.result.chunks:
    print(chunk.content[:200])
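
If you upload large files regularly, wrapping Steps 1 and 2 in a reusable helper keeps call sites simple; a minimal sketch (upload_large_file is our name, not part of the SDK):

import os
import requests

def upload_large_file(path: str) -> str:
    # Step 1: request a presigned URL (no file attached).
    response = requests.post(
        "https://platform.reducto.ai/upload",
        headers={"Authorization": f"Bearer {os.environ['REDUCTO_API_KEY']}"},
    )
    response.raise_for_status()
    data = response.json()

    # Step 2: PUT the file bytes to the presigned URL.
    with open(path, "rb") as f:
        upload = requests.put(data["presigned_url"], data=f)
    upload.raise_for_status()

    # The reducto:// ID is what Parse, Split, and Extract consume.
    return data["file_id"]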

Troubleshooting

Cause: The presigned URL has expired.
Fix: Presigned URLs expire after a short time (typically 1 hour). Request a new presigned URL and try again.
Cause: You might be passing the presigned_url instead of the file_id.
Fix: Always use the file_id (starts with reducto://) with Parse, Split, or Extract, not the presigned URL.
Cause: Large files on slow connections can time out.
Fix:
  • Use a wired connection if possible
  • Consider chunked/multipart upload for files >1GB
  • Implement retry logic with exponential backoff (see the sketch below)
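
A minimal retry sketch for the upload step, assuming failures are transient network errors (the attempt count and delays are illustrative):

import time
import requests

def put_with_retries(presigned_url: str, path: str, attempts: int = 4) -> None:
    for attempt in range(attempts):
        try:
            # Reopen the file each attempt so the retry starts from byte 0.
            with open(path, "rb") as f:
                response = requests.put(presigned_url, data=f, timeout=(10, 600))
            response.raise_for_status()  # a 403 here usually means the URL expired; request a new one
            return
        except (requests.ConnectionError, requests.Timeout):
            if attempt == attempts - 1:
                raise
            time.sleep(2 ** attempt)  # back off: 1s, 2s, 4s between attempts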
Cause: Using incompatible upload methods or headers.
Fix:
  • Don't include a Content-Type header; presigned URLs don't require it
  • For cURL, use -T filename instead of --data-binary @filename
  • In Go, use bytes.NewReader() to ensure proper Content-Length handling