Reducto enforces rate limits to protect infrastructure and ensure fair access for all users. These limits affect how you can call the API, not the total volume you can process.

Current Limits

| Limit Type | Value | Description |
| --- | --- | --- |
| Concurrent requests | 200 | Maximum simultaneous requests to sync endpoints (/parse, /extract, /split, /edit) |
| Requests per second | 500 | Maximum rate of new request submissions |

What This Means

Concurrent requests (200): If you call the /parse endpoint (synchronous), you can have up to 200 parses running simultaneously. Each request occupies a slot until it completes and returns a response.

Requests per second (500): This limit only caps how fast you can submit new jobs, not how many can run in parallel, so 500 requests/second is sufficient even for heavy users.
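If you drive sync endpoints from your own worker pool, cap the pool size below the concurrency limit rather than letting it grow unbounded. Here is a minimal sketch using the Python SDK; the sync Reducto client import (mirroring the AsyncReducto import shown later), the pool size, and the file IDs are illustrative assumptions:
from concurrent.futures import ThreadPoolExecutor

from reducto import Reducto

client = Reducto()
file_ids = ["file_1", "file_2"]  # hypothetical IDs from prior uploads

def parse_one(file_id: str):
    # Each sync call holds one of the 200 concurrency slots until it returns
    return client.parse.run(input=file_id)

# Keep max_workers below 200 to leave headroom for other traffic on the key
with ThreadPoolExecutor(max_workers=150) as pool:
    results = list(pool.map(parse_one, file_ids))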

Sync vs Async Behavior

The limits apply differently to synchronous and asynchronous endpoints:
| Endpoint Type | Concurrency Limit Applies? | Rate Limit Applies? |
| --- | --- | --- |
| /parse (sync) | ✅ Yes, counts toward 200 | ✅ Yes |
| /parse_async | ❌ No, returns immediately | ✅ Yes |
Async endpoints (/parse_async, /extract_async, etc.) return immediately with a job_id. The actual processing happens in a queue, so you can submit a much larger number of jobs without hitting the concurrency limit.
# Sync: Each call blocks until complete (counts toward 200 concurrent)
result = client.parse.run(input=upload.file_id)

# Async: Returns immediately with job_id (no concurrency limit)
job = client.parse.run_job(input=upload.file_id)
# Later: retrieve results
result = client.job.get(job.job_id)

Handling Rate Limit Errors

When you exceed rate limits, the API returns a 429 Too Many Requests error. The SDKs retry automatically with exponential backoff; a manual equivalent for direct REST callers is sketched after the list. If you’re hitting limits frequently:
  1. Switch to async endpoints for batch processing
  2. Add delays between requests (even 100ms helps)
  3. Use separate API keys if you need isolated rate limits for different applications
  4. Contact support if you need higher limits for your use case
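If you call the REST API directly instead of through an SDK, you can reproduce the SDKs’ retry behavior with a small wrapper. A minimal sketch with exponential backoff and jitter: make_request stands in for whatever HTTP call you make, and honoring Retry-After assumes the server sets that header (the exponential fallback works either way):
import random
import time

def call_with_backoff(make_request, max_retries: int = 5):
    # make_request is any callable returning a response object with
    # status_code and headers attributes, e.g. a requests.Response
    for attempt in range(max_retries):
        response = make_request()
        if response.status_code != 429:
            return response
        # Honor Retry-After if present; otherwise back off exponentially
        # (1s, 2s, 4s, ...) with up to 1s of jitter
        delay = float(response.headers.get("Retry-After", 2 ** attempt))
        time.sleep(delay + random.random())
    raise RuntimeError("still rate limited after retries")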

Scaling Strategies

For High Volume Processing

Use async endpoints with webhooks:
import asyncio
from reducto import AsyncReducto

async def process_batch(files: list[str]):
    client = AsyncReducto()
    
    # Submit all jobs (no concurrency limit)
    jobs = await asyncio.gather(*[
        client.parse.run_job(
            input=f,
            async_={"webhook": {"mode": "svix"}}
        )
        for f in files
    ])
    
    return [job.job_id for job in jobs]

# Results delivered via webhook when complete
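If you can’t expose a webhook endpoint, polling with client.job.get (shown earlier) works too. A minimal sketch; the status values ("Completed", "Failed") and the result field are assumptions to verify against the API reference:
import time

from reducto import Reducto

client = Reducto()

def wait_for_jobs(job_ids: list[str], poll_interval: float = 5.0):
    # Poll each submitted job until it finishes; field names are assumptions
    results = {}
    pending = set(job_ids)
    while pending:
        for job_id in list(pending):
            job = client.job.get(job_id)
            if job.status == "Completed":  # assumed status value
                results[job_id] = job.result  # assumed result field
                pending.discard(job_id)
            elif job.status == "Failed":  # assumed status value
                raise RuntimeError(f"Job {job_id} failed")
        if pending:
            time.sleep(poll_interval)
    return results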

For Interactive Applications

Use sync endpoints, which are prioritized over async jobs:
# For user-facing requests, a sync call gives the fastest response;
# sync requests are already prioritized over async
result = client.parse.run(input=upload.file_id)

Best Practices

  1. Batch with async: For processing many documents, use async endpoints. Submit all jobs upfront, then collect results via webhooks or polling.
  2. Don’t parallelize sync calls excessively: If you spawn 500 threads each making sync requests, you’ll hit the 200 concurrent limit. Use async instead.
  3. Implement backoff: If you get a 429, wait before retrying. The SDKs handle this automatically.
  4. Monitor usage: Check your request patterns in Reducto Studio to identify bottlenecks.