Reducto provides async SDKs for processing large batches of documents, autoscaling to handle even the largest jobs. This pattern works across all endpoints: /parse, /extract, and /split.
Make sure you have a Reducto SDK set up; see the Quickstart for instructions.
If you'd rather queue long documents and be notified when they're done (via polling or webhooks), check out run_job() for a fire-and-forget model. run_job also places no limit on how many requests you can send concurrently.
```python
import asyncio
from pathlib import Path

from reducto import AsyncReducto
from tqdm.asyncio import tqdm

client = AsyncReducto()

MAX_CONCURRENCY = 1000
FILES_TO_PARSE = list(Path("docs").glob("*.pdf"))


async def main():
    # Cap the number of in-flight requests.
    sem = asyncio.Semaphore(MAX_CONCURRENCY)

    async def parse_document(path: Path):
        async with sem:
            upload = await client.upload(file=path)
            result = await client.parse.run(document_url=upload)

            # Write each result next to its source file.
            output_path = path.with_suffix(".reducto.json")
            output_path.write_text(result.model_dump_json())

    await tqdm.gather(
        *[parse_document(path) for path in FILES_TO_PARSE],
        desc="Parsing documents",
    )


if __name__ == "__main__":
    asyncio.run(main())
```
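The semaphore is what keeps a thousand-file batch from launching every request at once. Here is a minimal, self-contained sketch of the same bounding pattern, using `asyncio.sleep` as a stand-in for the upload and parse calls (no Reducto SDK required); the function and variable names are illustrative, not part of the SDK:

```python
import asyncio

MAX_CONCURRENCY = 3  # small limit so the effect is easy to observe


async def run_batch(num_tasks: int) -> int:
    """Run num_tasks fake jobs through a semaphore; return peak concurrency."""
    sem = asyncio.Semaphore(MAX_CONCURRENCY)
    active = 0
    peak = 0

    async def fake_parse(i: int):
        nonlocal active, peak
        # At most MAX_CONCURRENCY coroutines get past this line at a time;
        # the rest wait here until a slot frees up.
        async with sem:
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(0.01)  # stand-in for upload + parse
            active -= 1

    # Launch everything at once; the semaphore does the throttling.
    await asyncio.gather(*(fake_parse(i) for i in range(num_tasks)))
    return peak


peak = asyncio.run(run_batch(20))
print(f"peak concurrency: {peak}")
```

All 20 coroutines are created up front, but the observed peak never exceeds `MAX_CONCURRENCY`; swapping the sleep for real SDK calls gives exactly the structure of the example above.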