For files larger than 100MB, use the presigned URL method. It uploads your file directly to cloud storage, bypassing the 100MB limit of the standard Upload endpoint.
Method                       Max Size   When to Use
Direct upload                100MB      Most files
Presigned URL (this page)    5GB        Large PDFs, high-res scans, large spreadsheets
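
The 100MB cutoff gives a simple rule for choosing a method programmatically; a minimal sketch (the threshold comes from the table above, and the helper name is ours, not part of the SDK):

import os

def needs_presigned_upload(path: str) -> bool:
    # Files over the standard endpoint's 100MB limit go through a presigned URL.
    return os.path.getsize(path) > 100 * 1024 * 1024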

How It Works

  1. Request a presigned URL from Reducto (no file attached)
  2. Upload your file directly to cloud storage using the presigned URL
  3. Use the file_id with Parse, Split, or Extract endpoints

Step 1: Request a Presigned URL

Call the upload endpoint without attaching a file:
import os
import requests

# POST with no file attached: Reducto returns a file_id and a presigned upload URL.
response = requests.post(
    "https://platform.reducto.ai/upload",
    headers={"Authorization": f"Bearer {os.environ.get('REDUCTO_API_KEY')}"}
)
response.raise_for_status()

data = response.json()
file_id = data["file_id"]              # pass this to Parse, Split, or Extract later
presigned_url = data["presigned_url"]  # use this only for the upload in Step 2

print(f"File ID: {file_id}")
print(f"Presigned URL: {presigned_url[:80]}...")
Response:
{
  "file_id": "reducto://50c07046-3bac-4844-8c4b-d1428ed9c8f4",
  "presigned_url": "https://prod-storage.s3.amazonaws.com/50c07046-3bac-4844-8c4b-d1428ed9c8f4?X-Amz-Algorithm=AWS4-HMAC-SHA256&..."
}
Save the file_id now. You'll need it in Step 3. The presigned URL is only for uploading; you can't use it to process the document.
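
Since the two values are easy to mix up, a cheap sanity check can catch the mistake early; a one-line sketch:

# File IDs start with "reducto://"; the presigned URL starts with "https://".
assert file_id.startswith("reducto://"), "pass the file_id, not the presigned URL"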

Step 2: Upload to Presigned URL

Upload your file using a PUT request to the presigned URL:
import requests

# PUT the raw bytes to the presigned URL; passing the open file streams it from disk.
with open("large_document.pdf", "rb") as f:
    response = requests.put(presigned_url, data=f)

if response.status_code == 200:
    print("Upload successful!")
No Content-Type header needed. When uploading to presigned URLs, you don't need to set a Content-Type header; the file will be accepted as-is.

Step 3: Process with Parse, Split, or Extract

Use the file_id from Step 1 (not the presigned URL) with any Reducto endpoint:
from reducto import Reducto

client = Reducto()

# Use the file_id from Step 1
result = client.parse.run(input=file_id)

print(f"Processed {result.usage.num_pages} pages")

Complete Example

Here’s the full workflow in one script:
import os
import requests
from reducto import Reducto

# Step 1: Get presigned URL
response = requests.post(
    "https://platform.reducto.ai/upload",
    headers={"Authorization": f"Bearer {os.environ.get('REDUCTO_API_KEY')}"}
)
data = response.json()
file_id = data["file_id"]
presigned_url = data["presigned_url"]

# Step 2: Upload to presigned URL
with open("large_document.pdf", "rb") as f:
    upload = requests.put(presigned_url, data=f)
upload.raise_for_status()  # fail fast if the upload was rejected

# Step 3: Process with Reducto
client = Reducto()
result = client.parse.run(input=file_id)

print(f"Successfully processed {result.usage.num_pages} pages")
for chunk in result.result.chunks:
    print(chunk.content[:200])
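
If you upload large files regularly, wrapping Steps 1 and 2 in a reusable helper keeps call sites simple; a minimal sketch (upload_large_file is our name, not part of the SDK):

import os
import requests

def upload_large_file(path: str) -> str:
    # Step 1: request a presigned URL (no file attached).
    response = requests.post(
        "https://platform.reducto.ai/upload",
        headers={"Authorization": f"Bearer {os.environ['REDUCTO_API_KEY']}"},
    )
    response.raise_for_status()
    data = response.json()

    # Step 2: PUT the file bytes to the presigned URL.
    with open(path, "rb") as f:
        upload = requests.put(data["presigned_url"], data=f)
    upload.raise_for_status()

    # The reducto:// ID is what Parse, Split, and Extract consume.
    return data["file_id"]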

Troubleshooting

Cause: The presigned URL has expired.
Fix: Presigned URLs expire after a short time (typically 1 hour). Request a new presigned URL and try again.
Cause: You might be passing the presigned_url instead of the file_id.
Fix: Always use the file_id (starts with reducto://) with Parse, Split, or Extract, not the presigned URL.
Cause: Large files on slow connections can time out.
Fix:
  • Use a wired connection if possible
  • Consider chunked/multipart upload for files >1GB
  • Implement retry logic with exponential backoff (see the sketch below)
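
A minimal retry sketch for the upload step, assuming failures are transient network errors (the attempt count and delays are illustrative):

import time
import requests

def put_with_retries(presigned_url: str, path: str, attempts: int = 4) -> None:
    for attempt in range(attempts):
        try:
            # Reopen the file each attempt so the retry starts from byte 0.
            with open(path, "rb") as f:
                response = requests.put(presigned_url, data=f, timeout=(10, 600))
            response.raise_for_status()  # a 403 here usually means the URL expired; request a new one
            return
        except (requests.ConnectionError, requests.Timeout):
            if attempt == attempts - 1:
                raise
            time.sleep(2 ** attempt)  # back off: 1s, 2s, 4s between attempts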
Cause: Using incompatible upload methods or headers.
Fix:
  • Don't include a Content-Type header; presigned URLs don't require it
  • For cURL, use -T filename instead of --data-binary @filename
  • In Go, use bytes.NewReader() to ensure proper Content-Length handling