# Rate Limits

> API request limits and optimization strategies

Reducto enforces rate limits to protect infrastructure and ensure fair access for all users. These limits govern how fast and how concurrently you can call the API, not the total volume you can process.

## Current Limits

| Limit Type              | Value | Description                                                                               |
| ----------------------- | ----- | ----------------------------------------------------------------------------------------- |
| **Concurrent requests** | 200   | Maximum simultaneous requests to sync endpoints (`/parse`, `/extract`, `/split`, `/edit`) |
| **Requests per second** | 500   | Maximum rate of new request submissions                                                   |

### What This Means

**Concurrent requests (200):** If you call the `/parse` endpoint (synchronous), you can have up to 200 parses running simultaneously. Each request occupies a slot until it completes and returns a response.

**Requests per second (500):** This caps how fast you can submit new requests, not how many can run in parallel. In practice it is rarely the binding limit, even for heavy users.
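A rough calculation shows how the two limits interact for sync traffic. The sketch below applies Little's law (throughput = concurrency / latency) to the limits in the table; the latencies are hypothetical illustration values, not measured Reducto figures.

```python
# Limits taken from the table above.
MAX_CONCURRENT = 200
MAX_REQUESTS_PER_SECOND = 500


def max_sync_throughput(avg_latency_seconds: float) -> float:
    """Upper bound on sync requests/second for a given average latency.

    Little's law gives throughput = concurrency / latency; the result is
    additionally capped by the submission rate limit.
    """
    if avg_latency_seconds <= 0:
        raise ValueError("avg_latency_seconds must be positive")
    return min(MAX_CONCURRENT / avg_latency_seconds, MAX_REQUESTS_PER_SECOND)


# Hypothetical latencies, for illustration only.
for latency in (0.2, 2.0, 10.0):
    rate = max_sync_throughput(latency)
    print(f"{latency:>5.1f}s per request -> at most {rate:.0f} req/s")
```

Under these assumptions, 10-second parses can sustain at most 20 sync requests per second across 200 slots, so the concurrency limit binds long before the 500 requests/second cap does.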

## Sync vs Async Behavior

The limits apply differently to synchronous and asynchronous endpoints:

| Endpoint Type   | Concurrency Limit Applies? | Rate Limit Applies? |
| --------------- | -------------------------- | ------------------- |
| `/parse` (sync) | ✅ Yes, counts toward 200   | ✅ Yes               |
| `/parse_async`  | ❌ No, returns immediately  | ✅ Yes               |

**Async endpoints** (`/parse_async`, `/extract_async`, etc.) return immediately with a `job_id`. The actual processing happens in a queue, so you can submit a much larger number of jobs without hitting the concurrency limit.

<CodeGroup>
  ```python Python theme={null}
  # Sync: Each call blocks until complete (counts toward 200 concurrent)
  result = client.parse.run(input=upload.file_id)

  # Async: Returns immediately with job_id (no concurrency limit)
  job = client.parse.run_job(input=upload.file_id)
  # Later: retrieve results
  result = client.job.get(job.job_id)
  ```

  ```javascript Node.js theme={null}
  // Sync: Each call blocks until complete (counts toward 200 concurrent)
  const result = await client.parse.run({ input: 'document.pdf' });

  // Async: Returns immediately with job_id (no concurrency limit)
  const job = await client.parse.runJob({ input: 'document.pdf' });
  // Later: retrieve results
  const jobResult = await client.job.retrieve(job.job_id);
  ```

  ```bash cURL theme={null}
  # Sync: Blocks until complete
  curl -X POST "https://platform.reducto.ai/parse" \
    -H "Authorization: Bearer $REDUCTO_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{"input": "https://example.com/document.pdf"}'

  # Async: Returns immediately
  curl -X POST "https://platform.reducto.ai/parse_async" \
    -H "Authorization: Bearer $REDUCTO_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{"input": "https://example.com/document.pdf"}'
  # Returns: {"job_id": "abc123"}
  ```
</CodeGroup>
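If you collect async results by polling rather than webhooks, a small helper keeps the loop bounded. This is a minimal sketch, not part of the Reducto SDK: `poll_until`, `fetch`, and `is_done` are hypothetical names, and the interval and timeout defaults are arbitrary.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def poll_until(
    fetch: Callable[[], T],
    is_done: Callable[[T], bool],
    interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Call `fetch` every `interval` seconds until `is_done(result)` is true.

    Raises TimeoutError if the deadline passes first. `fetch` and `is_done`
    are supplied by the caller; nothing here assumes a response schema.
    """
    deadline = clock() + timeout
    while True:
        result = fetch()
        if is_done(result):
            return result
        if clock() >= deadline:
            raise TimeoutError(f"job not finished after {timeout} seconds")
        sleep(interval)
```

Here `fetch` would wrap the job lookup shown above (for example `lambda: client.job.get(job.job_id)`), and `is_done` would inspect whichever field of the job response signals completion. Each poll is itself a request, so keep the interval generous when tracking many jobs.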

## Handling Rate Limit Errors

When you exceed rate limits, the API returns a `429 Too Many Requests` error. The SDKs automatically retry with exponential backoff.

If you're hitting limits frequently:

1. **Switch to async endpoints** for batch processing
2. **Add delays** between requests (even 100ms helps)
3. **Use separate API keys** if you need isolated rate limits for different applications
4. **Contact support** if you need higher limits for your use case
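The SDKs already retry for you, so manual backoff only matters when you call the HTTP API directly. The following is a minimal sketch of exponential backoff with jitter under that assumption. `RateLimited` and `call_with_backoff` are hypothetical names, not SDK symbols; your own request function would raise `RateLimited` when it sees a `429` status.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical exception: raise this when a response has status 429."""


def call_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on RateLimited with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # The delay ceiling doubles each attempt (0.5s, 1s, 2s, ...)
            # and is capped at max_delay.
            ceiling = min(base_delay * 2 ** attempt, max_delay)
            # Full jitter: sleep a random amount up to the ceiling so that
            # many clients do not retry in lockstep.
            sleep(random.uniform(0, ceiling))
    raise ValueError("max_attempts must be at least 1")
```

The default values are arbitrary starting points; tune them to how long your requests typically take.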

## Scaling Strategies

### For High Volume Processing

Use async endpoints with webhooks:

<CodeGroup>
  ```python Python theme={null}
  import asyncio
  from reducto import AsyncReducto

  async def process_batch(files: list[str]):
      client = AsyncReducto()
      
      # Submit all jobs (no concurrency limit)
      jobs = await asyncio.gather(*[
          client.parse.run_job(
              input=f,
              async_={"webhook": {"mode": "svix"}}
          )
          for f in files
      ])
      
      return [job.job_id for job in jobs]

  # Results delivered via webhook when complete
  ```

  ```javascript Node.js theme={null}
  async function processBatch(files) {
    const jobs = await Promise.all(
      files.map(f => 
        client.parse.runJob({
          input: f,
          async: { webhook: { mode: 'svix' } }
        })
      )
    );
    
    return jobs.map(job => job.job_id);
  }

  // Results delivered via webhook when complete
  ```
</CodeGroup>
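Async submissions skip the concurrency limit but still count toward the 500 requests/second rate limit, so a very large batch submitted all at once may be throttled. One possible approach, sketched below, is to space submissions evenly. `submit_paced` and `submit` are hypothetical names, and the default of 400 per second is an arbitrary margin under the limit.

```python
import asyncio
from typing import Awaitable, Callable, Sequence


async def submit_paced(
    items: Sequence[str],
    submit: Callable[[str], Awaitable[str]],
    max_per_second: int = 400,
) -> list[str]:
    """Start one submission every 1/max_per_second seconds.

    Returns the values produced by `submit`, in the same order as `items`.
    """
    if max_per_second <= 0:
        raise ValueError("max_per_second must be positive")
    interval = 1.0 / max_per_second

    async def submit_at(index: int, item: str) -> str:
        # Stagger start times so requests leave at an even pace.
        await asyncio.sleep(index * interval)
        return await submit(item)

    # gather() preserves input order regardless of completion order.
    return list(
        await asyncio.gather(
            *(submit_at(i, item) for i, item in enumerate(items))
        )
    )
```

In the batch example above, `submit` could be a small coroutine that calls `client.parse.run_job(...)` and returns `job.job_id`. Pacing is approximate because it depends on event-loop timing, so keep some headroom below the limit.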

### For Interactive Applications

Use sync endpoints. Sync requests are already prioritized over async ones, so no extra configuration is needed:

<CodeGroup>
  ```python Python theme={null}
  # For user-facing requests, a sync call returns the result in the same response
  result = client.parse.run(
      input=upload.file_id,
      # Sync requests are already prioritized over async
  )
  ```

  ```javascript Node.js theme={null}
  // For user-facing requests, a sync call returns the result in the same response
  const result = await client.parse.run({
    input: upload.file_id,
    // Sync requests are already prioritized over async
  });
  ```
</CodeGroup>

## Best Practices

1. **Batch with async**: For processing many documents, use async endpoints. Submit all jobs upfront, then collect results via webhooks or polling.

2. **Don't parallelize sync calls excessively**: If you spawn 500 threads each making sync requests, you'll hit the 200 concurrent limit. Cap your worker pool below 200, or use async instead.

3. **Implement backoff**: If you get a 429, wait before retrying. The SDKs handle this automatically.

4. **Monitor usage**: Check your request patterns in [Reducto Studio](https://studio.reducto.ai/) to identify bottlenecks.
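If you do need parallel sync calls, a bounded worker pool keeps you under the concurrency limit. This is a minimal sketch using only the standard library. `run_bounded` is a hypothetical helper, and the pool size of 50 is an arbitrary illustration that leaves headroom for other callers.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")
R = TypeVar("R")

# Arbitrary illustration value, well under the 200 concurrent limit.
MAX_WORKERS = 50


def run_bounded(
    func: Callable[[T], R],
    inputs: Iterable[T],
    max_workers: int = MAX_WORKERS,
) -> list[R]:
    """Apply `func` to each input using at most `max_workers` threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in input order, and the pool never runs
        # more than max_workers calls at the same time.
        return list(pool.map(func, inputs))
```

Here `func` would wrap a single sync call such as `client.parse.run(...)`. If several processes share one API key, their pool sizes add up, so size each pool accordingly.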

***

## Related

<CardGroup cols={2}>
  <Card title="Async Processing" icon="clock" href="/workflows/async-overview">
    Use async endpoints to submit jobs without the concurrency limit.
  </Card>

  <Card title="Batch Processing" icon="layer-group" href="/workflows/batch-processing">
    Process many documents efficiently.
  </Card>

  <Card title="Credit Usage" icon="coins" href="/reference/credit-usage">
    Understand how credits are consumed.
  </Card>

  <Card title="Error Codes" icon="triangle-exclamation" href="/reference/error-codes">
    Handle API errors including rate limits.
  </Card>
</CardGroup>
