Current Limits
| Limit Type | Value | Description |
|---|---|---|
| Concurrent requests | 200 | Maximum simultaneous requests to sync endpoints (/parse, /extract, /split, /edit) |
| Requests per second | 500 | Maximum rate of new request submissions |
What This Means
Concurrent requests (200): If you call the /parse endpoint (synchronous), you can have up to 200 parses running simultaneously. Each request occupies a slot until it completes and returns a response.
Requests per second (500): This limit caps only how fast you can submit new requests, not how many can run in parallel, so 500 requests/second is rarely a bottleneck even for heavy users.
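If your own code fans out many synchronous calls, cap the number in flight so you stay under the 200-slot limit. Below is a minimal sketch using a bounded thread pool; the base URL, auth header, and request payload are placeholder assumptions, not the documented API.

```python
import concurrent.futures

import requests

# Placeholder values -- substitute your real endpoint and credentials.
BASE_URL = "https://api.example.com"   # assumed base URL
API_KEY = "YOUR_API_KEY"
MAX_IN_FLIGHT = 150  # stay comfortably under the 200 concurrent-request limit

def parse_sync(document_url: str) -> dict:
    """Call the synchronous /parse endpoint for one document (payload shape assumed)."""
    resp = requests.post(
        f"{BASE_URL}/parse",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"document_url": document_url},
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()

def parse_many(document_urls: list[str]) -> list[dict]:
    # The thread pool size bounds how many sync requests are in flight at once,
    # so each request holds a concurrency slot only while it is actually running.
    with concurrent.futures.ThreadPoolExecutor(max_workers=MAX_IN_FLIGHT) as pool:
        return list(pool.map(parse_sync, document_urls))
```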
Sync vs Async Behavior
The limits apply differently to synchronous and asynchronous endpoints:
| Endpoint Type | Concurrency Limit Applies? | Rate Limit Applies? |
|---|---|---|
| /parse (sync) | ✅ Yes, counts toward 200 | ✅ Yes |
| /parse_async | ❌ No, returns immediately | ✅ Yes |
Async endpoints (/parse_async, /extract_async, etc.) return immediately with a job_id. The actual processing happens in a queue, so you can submit a much larger number of jobs without hitting the concurrency limit.
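As a rough illustration of that flow, the sketch below submits a job and then polls until it finishes. The polling path, status values, and auth scheme are assumptions for illustration; check the API reference for the real response shapes.

```python
import time

import requests

BASE_URL = "https://api.example.com"   # placeholder base URL
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}  # assumed auth scheme

def submit_async(document_url: str) -> str:
    """Submit a job to /parse_async; it returns immediately with a job_id."""
    resp = requests.post(
        f"{BASE_URL}/parse_async",
        headers=HEADERS,
        json={"document_url": document_url},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["job_id"]

def wait_for_result(job_id: str, poll_interval: float = 2.0) -> dict:
    """Poll a hypothetical job-status endpoint until processing finishes."""
    while True:
        resp = requests.get(f"{BASE_URL}/job/{job_id}", headers=HEADERS, timeout=30)
        resp.raise_for_status()
        body = resp.json()
        if body.get("status") in ("completed", "failed"):  # status values assumed
            return body
        time.sleep(poll_interval)
```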
Handling Rate Limit Errors
When you exceed rate limits, the API returns a 429 Too Many Requests error. The SDKs automatically retry with exponential backoff.
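If you call the HTTP API directly instead of through an SDK, you can reproduce that retry behavior yourself. A minimal sketch of exponential backoff with jitter (endpoint and payload are placeholders):

```python
import random
import time

import requests

def post_with_backoff(url: str, headers: dict, payload: dict, max_retries: int = 5) -> requests.Response:
    """Retry on 429 Too Many Requests with exponential backoff plus jitter."""
    for attempt in range(max_retries):
        resp = requests.post(url, headers=headers, json=payload, timeout=300)
        if resp.status_code != 429:
            resp.raise_for_status()
            return resp
        # Honor Retry-After if the server sends it; otherwise back off exponentially.
        delay = float(resp.headers.get("Retry-After", 2 ** attempt))
        time.sleep(delay + random.uniform(0, 0.5))
    raise RuntimeError("Still rate limited after retries")
```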
If you’re hitting limits frequently:
- Switch to async endpoints for batch processing
- Add delays between requests (even 100ms helps); see the pacing sketch after this list
- Use separate API keys if you need isolated rate limits for different applications
- Contact support if you need higher limits for your use case
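One simple way to add those delays is a small pacer that spaces out submissions; this is plain Python, independent of any SDK:

```python
import time

def paced(items, min_interval: float = 0.1):
    """Yield items no faster than one per min_interval seconds (100 ms here),
    keeping submission rates well under the 500 requests/second cap."""
    last = 0.0
    for item in items:
        wait = min_interval - (time.monotonic() - last)
        if wait > 0:
            time.sleep(wait)
        last = time.monotonic()
        yield item

# Usage (submit_async is the hypothetical helper sketched earlier):
# for doc_url in paced(document_urls):
#     submit_async(doc_url)
```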
Scaling Strategies
For High Volume Processing
Use async endpoints with webhooks: submit all jobs up front and receive results via a callback, so nothing counts against the concurrency limit. A sketch follows.
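In this sketch, each submission includes a callback URL and your server receives results as they finish. The webhook_url field name and callback mechanics are assumptions here; see the webhooks documentation for the actual contract.

```python
import requests

BASE_URL = "https://api.example.com"   # placeholder base URL
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}  # assumed auth scheme

def submit_with_webhook(document_url: str, callback_url: str) -> str:
    """Submit an async job and ask for the result to be POSTed to callback_url.
    The 'webhook_url' field name is an assumption for illustration."""
    resp = requests.post(
        f"{BASE_URL}/parse_async",
        headers=HEADERS,
        json={"document_url": document_url, "webhook_url": callback_url},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["job_id"]

# Your server then exposes an endpoint (e.g. /reducto-callback) that receives
# each finished result and stores it -- no polling or open connections needed.
```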
For Interactive Applications
Use sync endpoints with priority: the result comes back in the same request, which keeps latency low for user-facing flows. A sketch follows.
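A sketch of an interactive call; the priority field shown is illustrative only, so check the API reference for the actual parameter name.

```python
import requests

BASE_URL = "https://api.example.com"   # placeholder base URL
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}  # assumed auth scheme

def parse_interactive(document_url: str) -> dict:
    """Synchronous parse for a user-facing request: the response contains the
    result directly, so the user never waits on a queue. The 'priority' field
    is illustrative only -- check the API reference for the real parameter."""
    resp = requests.post(
        f"{BASE_URL}/parse",
        headers=HEADERS,
        json={"document_url": document_url, "priority": True},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()
```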
Best Practices
- Batch with async: For processing many documents, use async endpoints. Submit all jobs upfront, then collect results via webhooks or polling.
- Don’t parallelize sync calls excessively: If you spawn 500 threads each making sync requests, you’ll hit the 200 concurrent limit. Use async instead.
- Implement backoff: If you get a 429, wait before retrying. The SDKs handle this automatically.
- Monitor usage: Check your request patterns in Reducto Studio to identify bottlenecks.