# API Quickstart

> Parse your first document with Reducto in 5 minutes.

This guide walks you through your first Reducto API call. You will parse a document and get back structured JSON ready for LLMs, downstream extraction, or any other processing step in your pipeline.

***

## Fastest path for coding agents

If you are using Claude Code, Codex, Cursor, or another coding agent, start here. This path avoids Studio clicks and extra docs navigation.

1. Set `REDUCTO_API_KEY`.
2. Choose one interface:
   * Local file or folder: use the [Reducto CLI](/cli).
   * Agent tool calling: use the [Reducto MCP server](/mcp-server).
   * Application code: use the Python, Node.js, Go, or cURL examples below.
3. Parse the sample PDF first, then replace the URL or file path with your document.

```bash CLI
pip install reducto-cli
reducto login
curl -L -o fidelity-example.pdf https://cdn.reducto.ai/samples/fidelity-example.pdf
reducto parse ./fidelity-example.pdf
```

```python Python
from reducto import Reducto

client = Reducto()
result = client.parse.run(input="https://cdn.reducto.ai/samples/fidelity-example.pdf")
print(result.job_id)
print(result.result.chunks[0].content[:1000])
```

```javascript Node.js
import Reducto from "reductoai";

const client = new Reducto();
const result = await client.parse.run({
  input: "https://cdn.reducto.ai/samples/fidelity-example.pdf",
});
console.log(result.job_id);
console.log(result.result.chunks[0].content.slice(0, 1000));
```

```bash cURL
curl -X POST "https://platform.reducto.ai/parse" \
  -H "Authorization: Bearer $REDUCTO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"input":"https://cdn.reducto.ai/samples/fidelity-example.pdf"}'
```

For MCP, install once with `uvx mcp-server-reducto --login`, then ask the agent to call
`parse_document(document_url="https://cdn.reducto.ai/samples/fidelity-example.pdf")`.

***

## What we're going to parse

We'll use a financial statement PDF that contains multiple tables, headers, account summaries, and formatted text. This is the kind of complex document that's difficult to process manually but straightforward with Reducto.

Finance Statement


[View the sample PDF in Studio](https://studio.reducto.ai/share/md726aw3w7mfs46659ttkqry0s7se3pd?processor=kh7c9e30evkfb5a4h80dq4xke17sfwck\&fileId=js7e4hrtnh2tsyjqdbz114ceyn7sf1v1) or [download it directly](https://cdn.reducto.ai/samples/fidelity-example.pdf) to follow along.

**What we want to extract:**

* The portfolio value table with beginning and ending values
* Account information including account numbers and types
* Income summary broken down by tax category
* Top holdings with values and percentages

By the end of this guide, you'll have all of this data in structured JSON that you can use in your application.

For structured field extraction (e.g., extracting specific account numbers or values into typed fields), see the [/extract endpoint](/extract/overview) after completing this quickstart.

***

## Prerequisites

Go to [studio.reducto.ai](https://studio.reducto.ai/) and sign up for a free account.

In the Studio sidebar, click **API Keys**, then **Create new API key**. Give it a name and copy the key.

Reducto Studio sidebar showing API Keys option


Set the key as the `REDUCTO_API_KEY` environment variable. This allows the SDK to authenticate automatically without hardcoding the key in your code.

```bash
export REDUCTO_API_KEY="your_api_key_here"
```

```powershell
$env:REDUCTO_API_KEY="your_api_key_here"
```

You can also copy the snippet below for your AI coding agent to connect to Reducto via the [MCP Server](/mcp-server).

````markdown
## Add Reducto MCP Server

### 1. Authenticate (one-time)

```bash
uvx mcp-server-reducto --login
```

This opens your browser to approve access. Your API key is saved to `~/.reducto/config.yaml`.

### 2. Add to your MCP client

**Claude Code:**

```bash
claude mcp add reducto -- uvx mcp-server-reducto
```

**Claude Desktop**: edit `~/Library/Application Support/Claude/claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "reducto": {
      "command": "uvx",
      "args": ["mcp-server-reducto"]
    }
  }
}
```

**Cursor**: edit `.cursor/mcp.json`:

```json
{
  "mcpServers": {
    "reducto": {
      "command": "uvx",
      "args": ["mcp-server-reducto"]
    }
  }
}
```

**VS Code**: edit `.vscode/mcp.json`:

```json
{
  "servers": {
    "reducto": {
      "command": "uvx",
      "args": ["mcp-server-reducto"]
    }
  }
}
```

### 3. Use it

The server provides these tools:

| Tool | What it does |
|------|-------------|
| `upload_file` | Upload a local file or URL to Reducto (returns `reducto://` URL) |
| `parse_document` | Parse a document into structured text, tables, figures |
| `extract_data` | Extract structured JSON from a document using a schema |
| `split_document` | Segment a document into labeled sections |
| `classify_document` | Categorize a document type |
| `edit_document` | Fill forms or modify a PDF/DOCX |

**Local files:** Use `upload_file` first, e.g. `upload_file("./report.pdf")`, then pass the returned `reducto://` URL to other tools.

**Chain operations:** `parse_document` returns a `job_id`. Pass `jobid://` to `extract_data` or `split_document` to skip re-parsing.
````
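Before installing the SDK, it can help to confirm that `REDUCTO_API_KEY` is actually visible to your process. The following is a minimal sketch; `get_reducto_api_key` is a hypothetical helper written for this guide, not part of the Reducto SDK.

```python
import os


def get_reducto_api_key(env=None):
    """Return the API key from the environment, or raise a clear error.

    `env` defaults to os.environ; passing a dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("REDUCTO_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "REDUCTO_API_KEY is not set; export it before creating the client."
        )
    return key
```

Calling `get_reducto_api_key()` at startup fails fast with a readable message instead of surfacing later as an authentication error.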

***

## Install the SDK

Choose your language and install the Reducto SDK:

```bash
pip install reductoai
```

Requires Python 3.8+.

```bash
npm install reductoai
```

```bash
go get github.com/reductoai/reducto-go-sdk
```

***

## Parse the document

Now let's write the code to parse our financial statement. We'll go through each part step by step.

First, we import the Reducto client. When you create a `Reducto()` client without passing an API key, it automatically reads from the `REDUCTO_API_KEY` environment variable you set earlier.

```python
from reducto import Reducto

# The client reads REDUCTO_API_KEY from your environment
client = Reducto()
```

Before parsing, you need to upload the document to Reducto's servers. The `upload()` method accepts a file path (a `pathlib.Path` in the example below) and returns a reference that you'll use in the next step. You can download the sample PDF from [here](https://cdn.reducto.ai/samples/fidelity-example.pdf).

```python
from pathlib import Path

# Upload the PDF file to Reducto
upload = client.upload(file=Path("fidelity-example.pdf"))
print(f"Uploaded: {upload.file_id}")
```

You can also pass a URL directly to the parse method if your document is already hosted somewhere accessible, like an S3 bucket:

```python
result = client.parse.run(input="https://cdn.reducto.ai/samples/fidelity-example.pdf")
```

Now we call the `parse.run()` method with the uploaded file reference. This sends the document through Reducto's processing pipeline, which runs OCR, detects layout, extracts tables, and structures everything into chunks.
```python
# Parse the uploaded document
result = client.parse.run(input=upload.file_id)

# Check what we got back
print(f"Job ID: {result.job_id}")
print(f"Pages processed: {result.usage.num_pages}")
print(f"Credits used: {result.usage.credits}")
print(f"Number of chunks: {len(result.result.chunks)}")
```

The response contains `chunks`, which are logical sections of the document. Each chunk has a `content` field with the full text and a `blocks` field with individual elements like tables, headers, and paragraphs.

```python
# Loop through each chunk
for i, chunk in enumerate(result.result.chunks):
    print(f"\n=== Chunk {i + 1} ===")
    print(chunk.content[:500])  # First 500 characters

    # Look at individual blocks within this chunk
    for block in chunk.blocks:
        print(f"  [{block.type}] on page {block.bbox.page}")

        # Table format follows formatting.table_output_format (default: dynamic)
        if block.type == "Table":
            print(f"    Table content: {block.content[:200]}...")
```

Each block has a `type` that tells you what kind of content it is: `Title`, `Section Header`, `Text`, `Table`, `Figure`, `Key Value`, and others. The `bbox` field contains the bounding box coordinates so you know exactly where on the page this content came from.

**Complete code:**

```python
from pathlib import Path
from reducto import Reducto

client = Reducto()

upload = client.upload(file=Path("fidelity-example.pdf"))
result = client.parse.run(input=upload.file_id)

print(f"Processed {result.usage.num_pages} pages")
for chunk in result.result.chunks:
    print(chunk.content)
    for block in chunk.blocks:
        if block.type == "Table":
            print(f"Found table on page {block.bbox.page}")
```

All Node.js examples use `await` and must be run inside an `async` function, or in a file with top-level await enabled (ES modules with Node.js 14.8+).

Import the Reducto client and the `fs` module for reading files. The client automatically uses the `REDUCTO_API_KEY` environment variable for authentication.
```javascript
import Reducto from 'reductoai';
import fs from 'fs';

// The client reads REDUCTO_API_KEY from your environment
const client = new Reducto();
```

Use `createReadStream` to upload the file to Reducto. This returns a reference you'll use when calling the parse endpoint. You can download the sample PDF from [here](https://cdn.reducto.ai/samples/fidelity-example.pdf).

```javascript
// Upload the PDF file to Reducto
const upload = await client.upload({
  file: fs.createReadStream("fidelity-example.pdf")
});
console.log(`Uploaded: ${upload.file_id}`);
```

Call `parse.run()` with the uploaded file reference. Reducto processes the document and returns structured content.

```javascript
// Parse the uploaded document
const result = await client.parse.run({ input: upload.file_id });

console.log(`Job ID: ${result.job_id}`);
console.log(`Pages processed: ${result.usage.num_pages}`);
console.log(`Credits used: ${result.usage.credits}`);
console.log(`Number of chunks: ${result.result.chunks.length}`);
```

Loop through the chunks and blocks to access the extracted text, tables, and other elements.
```javascript
// Loop through each chunk
for (let i = 0; i < result.result.chunks.length; i++) {
  const chunk = result.result.chunks[i];
  console.log(`\n=== Chunk ${i + 1} ===`);
  console.log(chunk.content.substring(0, 500));

  // Look at individual blocks within this chunk
  for (const block of chunk.blocks) {
    console.log(`  [${block.type}] on page ${block.bbox.page}`);

    if (block.type === "Table") {
      console.log(`    Table content: ${block.content.substring(0, 200)}...`);
    }
  }
}
```

**Complete code:**

```javascript
import Reducto from 'reductoai';
import fs from 'fs';

const client = new Reducto();

async function main() {
  const upload = await client.upload({
    file: fs.createReadStream("fidelity-example.pdf")
  });
  const result = await client.parse.run({ input: upload.file_id });

  console.log(`Processed ${result.usage.num_pages} pages`);
  for (const chunk of result.result.chunks) {
    console.log(chunk.content);
    for (const block of chunk.blocks) {
      if (block.type === "Table") {
        console.log(`Found table on page ${block.bbox.page}`);
      }
    }
  }
}

main();
```

The Go SDK is currently in alpha (`v0.1.0-alpha.1`). The API may change in future releases.

Import the Reducto client and the option package for configuration. The Go SDK requires you to pass the API key explicitly using `option.WithAPIKey()`.

```go
package main

import (
    "context"
    "fmt"
    "io"
    "os"

    reducto "github.com/reductoai/reducto-go-sdk"
    "github.com/reductoai/reducto-go-sdk/option"
    "github.com/reductoai/reducto-go-sdk/shared"
)

func main() {
    // Initialize client with API key from environment
    client := reducto.NewClient(option.WithAPIKey(os.Getenv("REDUCTO_API_KEY")))
}
```

Open the file and upload it to Reducto. The upload returns a file ID that you'll use for parsing. You can download the sample PDF from [here](https://cdn.reducto.ai/samples/fidelity-example.pdf).
```go
file, err := os.Open("fidelity-example.pdf")
if err != nil {
    fmt.Printf("Error opening file: %v\n", err)
    return
}
defer file.Close()

upload, err := client.Upload(context.Background(), reducto.UploadParams{
    File: reducto.F[io.Reader](file),
})
if err != nil {
    fmt.Printf("Upload error: %v\n", err)
    return
}
fmt.Printf("Uploaded: %s\n", upload.FileID)
```

Call `Parse.Run()` with the file ID. The Go SDK requires you to wrap the file ID with `shared.UnionString()` and then with `reducto.F[...]()` because the SDK uses strongly-typed union parameters.

```go
result, err := client.Parse.Run(context.Background(), reducto.ParseRunParams{
    ParseConfig: reducto.ParseConfigParam{
        // The file ID must be wrapped in shared.UnionString() and reducto.F[...]()
        DocumentURL: reducto.F[reducto.ParseConfigDocumentURLUnionParam](
            shared.UnionString(upload.FileID),
        ),
    },
})
if err != nil {
    fmt.Printf("Parse error: %v\n", err)
    return
}

fmt.Printf("Job ID: %s\n", result.JobID)
fmt.Printf("Pages: %d\n", result.Usage.NumPages)
// Note: To view in Studio, construct the URL: https://studio.reducto.ai/job/{job_id}
```

The result contains chunks with extracted content. The `Chunks` field is typed as `interface{}`, so you need to type assert it to `[]shared.ParseResponseResultFullResultChunk` before you can iterate over it. When checking block types, use the SDK constants instead of string comparisons.
```go
if result.Result.Type == shared.ParseResponseResultTypeFull {
    // Type assert Chunks from interface{} to the actual type
    chunks, ok := result.Result.Chunks.([]shared.ParseResponseResultFullResultChunk)
    if ok {
        for _, chunk := range chunks {
            fmt.Println(chunk.Content)
            for _, block := range chunk.Blocks {
                // Use SDK constants for block type comparisons
                if block.Type == shared.ParseResponseResultFullResultChunksBlocksTypeTable {
                    fmt.Printf("Found table on page %d\n", block.Bbox.Page)
                }
            }
        }
    }
}
```

**Complete code:**

```go
package main

import (
    "context"
    "fmt"
    "io"
    "os"

    reducto "github.com/reductoai/reducto-go-sdk"
    "github.com/reductoai/reducto-go-sdk/option"
    "github.com/reductoai/reducto-go-sdk/shared"
)

func main() {
    client := reducto.NewClient(option.WithAPIKey(os.Getenv("REDUCTO_API_KEY")))

    file, _ := os.Open("fidelity-example.pdf")
    defer file.Close()

    upload, _ := client.Upload(context.Background(), reducto.UploadParams{
        File: reducto.F[io.Reader](file),
    })

    result, _ := client.Parse.Run(context.Background(), reducto.ParseRunParams{
        ParseConfig: reducto.ParseConfigParam{
            DocumentURL: reducto.F[reducto.ParseConfigDocumentURLUnionParam](
                shared.UnionString(upload.FileID),
            ),
        },
    })

    fmt.Printf("Processed %d pages\n", result.Usage.NumPages)

    if result.Result.Type == shared.ParseResponseResultTypeFull {
        chunks, _ := result.Result.Chunks.([]shared.ParseResponseResultFullResultChunk)
        for _, chunk := range chunks {
            fmt.Println(chunk.Content)
        }
    }
}
```

If you prefer not to use an SDK, you can call the API directly with cURL or any HTTP client.
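As one illustration of the "any HTTP client" route, here is a sketch that uses only Python's standard library. It assumes the document is already hosted at a URL, so no upload is needed. `build_parse_request` is a hypothetical helper; the endpoint, headers, and JSON body are the same ones used by the cURL calls in this guide.

```python
import json
import urllib.request

PARSE_URL = "https://platform.reducto.ai/parse"


def build_parse_request(document_url, api_key):
    """Build (but do not send) a POST request for the /parse endpoint."""
    body = json.dumps({"input": document_url}).encode("utf-8")
    return urllib.request.Request(
        PARSE_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

To send it, pass the request to `urllib.request.urlopen()` and decode the response body with `json.load()`. Keeping request construction separate from sending makes the payload easy to inspect before spending credits. The cURL version of the same flow follows.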
First, upload the file to get a file reference:

```bash
curl -X POST "https://platform.reducto.ai/upload" \
  -H "Authorization: Bearer $REDUCTO_API_KEY" \
  -F "file=@fidelity-example.pdf"
```

This returns a JSON response with a `file_id`:

```json
{"file_id": "reducto://abc123def456.pdf"}
```

Use the `file_id` from the previous step as the `input` parameter:

```bash
curl -X POST "https://platform.reducto.ai/parse" \
  -H "Authorization: Bearer $REDUCTO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"input": "reducto://abc123def456.pdf"}'
```

You can also skip the upload step if your document is already hosted at a public URL:

```bash
curl -X POST "https://platform.reducto.ai/parse" \
  -H "Authorization: Bearer $REDUCTO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"input": "https://cdn.reducto.ai/samples/fidelity-example.pdf"}'
```

***

## Understanding the response

Here's what we got back from parsing our financial statement:

```json
{
  "job_id": "5df31070-8d98-4caa-9a5b-c5c511a03f71",
  "duration": 11.35,
  "usage": {
    "num_pages": 3,
    "credits": 4.0
  },
  "result": {
    "chunks": [
      {
        "content": "# *** SAMPLE STATEMENT ***\nFor informational purposes only\n\nFidelity\nINVESTMENTS\n\n## Your Portfolio Value:\n\n$274,222.20\n\n| | This Period | Year-to-Date |\n|-|-|-|\n| Beginning Portfolio Value | $253,221.83 | $232,643.16 |\n| Additions | 59,269.64 | 121,433.55 |...",
        "blocks": [
          {
            "type": "Title",
            "content": "*** SAMPLE STATEMENT ***\nFor informational purposes only",
            "bbox": {"page": 1, "left": 0.351, "top": 0.029, "width": 0.296, "height": 0.057},
            "confidence": "high"
          },
          {
            "type": "Section Header",
            "content": "Your Portfolio Value:",
            "bbox": {"page": 1, "left": 0.517, "top": 0.163, "width": 0.153, "height": 0.015},
            "confidence": "high"
          },
          {
            "type": "Table",
            "content": "| | This Period | Year-to-Date |\n|-|-|-|\n| Beginning Portfolio Value | $253,221.83 | $232,643.16 |\n| Additions | 59,269.64 | 121,433.55 |\n| Subtractions | -45,430.74 | -98,912.58 |\n| Transaction Costs, Fees & Charges | -139.77 | -625.87 |\n| Change in Investment Value* | 7,161.47 | 19,058.07 |\n| Ending Portfolio Value** | $274,222.20 | $274,222.20 |",
            "bbox": {"page": 1, "left": 0.516, "top": 0.261, "width": 0.444, "height": 0.158},
            "confidence": "high"
          }
        ]
      }
    ]
  },
  "studio_link": "https://studio.reducto.ai/job/5df31070-8d98-4caa-9a5b-c5c511a03f71"
}
```

**Key fields:**

| Field | What it contains |
| ------------------ | ---------------------------------------------------------------------------------------------- |
| `job_id` | Unique identifier for this job. Use it to retrieve results later or debug in Studio. |
| `usage.num_pages` | Number of pages that were processed. |
| `usage.credits` | Credits consumed by this request. |
| `chunks` | Logical sections of the document, optimized for feeding into LLMs. |
| `chunks[].content` | The full text content of this chunk. |
| `chunks[].blocks` | Individual elements (tables, headers, text) with their types and positions. |
| `blocks[].type` | What kind of element this is: `Title`, `Table`, `Section Header`, `Text`, `Figure`, etc. |
| `blocks[].bbox` | Bounding box with normalized coordinates (0-1) showing where this element appears on the page. |
| `studio_link` | Direct link to view this job in Reducto Studio for visual debugging. |

***

## Customizing the output

The default settings work well for most documents, but you can customize the parsing behavior for specific use cases.
You can pass configuration options as `TypedDict` imports from `reducto.types` or as plain dictionaries. Both work identically.

```python
from reducto.types import EnhanceParam, FormattingParam, SettingsParam

result = client.parse.run(
    input=upload.file_id,
    enhance=EnhanceParam(
        # Use AI to clean up OCR errors in scanned documents
        agentic=[{"scope": "text"}],
        # Generate descriptions for charts and images
        summarize_figures=True
    ),
    formatting=FormattingParam(
        # Get tables as HTML, md, json, or csv
        table_output_format="md"
    ),
    settings=SettingsParam(
        # Only process pages 1-5
        page_range={"start": 1, "end": 5}
    )
)
```

```javascript
const result = await client.parse.run({
  input: upload.file_id,
  enhance: {
    agentic: [{scope: "text"}],
    summarize_figures: true
  },
  formatting: {
    table_output_format: "md"
  },
  settings: {
    page_range: {start: 1, end: 5}
  }
});
```

```go
result, err := client.Parse.Run(context.Background(), reducto.ParseRunParams{
    ParseConfig: reducto.ParseConfigParam{
        DocumentURL: reducto.F[reducto.ParseConfigDocumentURLUnionParam](
            shared.UnionString(upload.FileID),
        ),
        // Output formatting - use SDK constants for table format
        AdvancedOptions: reducto.F(shared.AdvancedProcessingOptionsParam{
            TableOutputFormat: reducto.F(shared.AdvancedProcessingOptionsTableOutputFormatMd),
        }),
        // Chunking options
        Options: reducto.F(shared.BaseProcessingOptionsParam{
            Chunking: reducto.F(shared.BaseProcessingOptionsChunkingParam{
                ChunkMode: reducto.F(shared.BaseProcessingOptionsChunkingChunkModeVariable),
            }),
        }),
    },
})
```

The Go SDK uses different parameter names than Python and Node.js:

| Python/Node.js | Go SDK |
| -------------------------------- | ----------------------------------- |
| `formatting.table_output_format` | `AdvancedOptions.TableOutputFormat` |
| `settings.page_range` | `AdvancedOptions.PageRange` |
| `retrieval.chunking.chunk_mode` | `Options.Chunking.ChunkMode` |

Use SDK constants like `AdvancedProcessingOptionsTableOutputFormatMd` instead of strings, and wrap values with `reducto.F()`.

```bash
curl -X POST "https://platform.reducto.ai/parse" \
  -H "Authorization: Bearer $REDUCTO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "reducto://abc123def456.pdf",
    "enhance": {
      "agentic": [{"scope": "text"}],
      "summarize_figures": true
    },
    "formatting": {
      "table_output_format": "md"
    },
    "settings": {
      "page_range": {"start": 1, "end": 5}
    }
  }'
```

**What these options do:**

* **`enhance.agentic`**: Runs AI-powered cleanup on the specified scope. Use `"text"` for OCR correction on scanned documents, or `"table"` to improve table structure detection.
* **`enhance.summarize_figures`**: Generates natural language descriptions of charts, graphs, and images. Useful for RAG pipelines where you need to search figure content.
* **`formatting.table_output_format`**: Controls how tables are returned. Options are `html`, `md` (markdown), `json`, `csv`, `dynamic` (default, returns markdown for simple tables and HTML for complex ones), or `jsonbbox`.
* **`settings.page_range`**: Limits processing to specific pages. Useful for large documents where you only need certain sections.

For the full list of options, see the [Parse configuration reference](/configs/overview).

***

## What's next

Now that you can parse documents, explore the other Reducto endpoints:

* Define a JSON schema and extract specific fields from your documents.
* Divide long documents into sections based on content type.
* Fill PDF forms and modify DOCX documents programmatically.
* Process documents asynchronously with webhooks for high-volume workloads.

***

## Troubleshooting

If requests fail with an authentication error, your API key is missing or invalid. Check that the `REDUCTO_API_KEY` environment variable is set correctly and that the key hasn't expired in Studio.

Some complex tables need extra help.
Enable `enhance.agentic` with `[{"scope": "table"}]` for AI-powered table reconstruction, or try setting `formatting.table_output_format` to `"html"` or `"json"` for more structured output.

For scanned documents or low-quality PDFs, enable the agentic text enhancement: `enhance.agentic: [{"scope": "text"}]`. If the document is password-protected, pass the password in `settings.document_password`. The problem may also be due to bad metadata polluting the output; in that case, reach out to Reducto support.

Every response includes a `studio_link` that opens the job in Reducto Studio. Use it to visually inspect what was extracted and debug any issues.
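The troubleshooting settings above can be collected into plain dictionaries, which this guide says the Python SDK accepts in place of the `TypedDict` imports. This is a minimal sketch: `troubleshooting_options` is a hypothetical helper, and combining the `"text"` and `"table"` scopes in a single `agentic` list is an assumption rather than documented behavior.

```python
def troubleshooting_options(fix_tables=False, fix_scanned_text=False, password=None):
    """Collect troubleshooting settings as keyword arguments for parse.run()."""
    options = {}
    scopes = []

    if fix_scanned_text:
        # Agentic text enhancement for scanned or low-quality documents
        scopes.append({"scope": "text"})
    if fix_tables:
        # Agentic table reconstruction plus a more structured table format
        scopes.append({"scope": "table"})
        options["formatting"] = {"table_output_format": "html"}
    if scopes:
        options["enhance"] = {"agentic": scopes}
    if password is not None:
        # Password for protected documents
        options["settings"] = {"document_password": password}

    return options
```

A call might then look like `client.parse.run(input=upload.file_id, **troubleshooting_options(fix_tables=True))`, with the `studio_link` in the response used to check whether the output improved.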