# Uploading Large Files

> Upload files up to 5GB using presigned URLs

For files larger than 100MB, use the presigned URL method. This uploads directly to cloud storage, bypassing the 100MB limit of the standard [Upload endpoint](/upload).

| Method                    | Max Size | When to Use                                    |
| ------------------------- | -------- | ---------------------------------------------- |
| [Direct upload](/upload)  | 100MB    | Most files                                     |
| Presigned URL (this page) | 5GB      | Large PDFs, high-res scans, large spreadsheets |
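
The table can be turned into a simple size check before uploading. This is a minimal sketch: `choose_upload_method` and `choose_for_path` are hypothetical helpers, not part of the Reducto SDK, and the byte values assume decimal units, since this page does not say whether 100MB means 10^8 bytes or 100 MiB.

```python
import os

# Assumed byte values for the documented limits (decimal units are an assumption).
DIRECT_UPLOAD_LIMIT = 100 * 1000**2   # 100MB
PRESIGNED_UPLOAD_LIMIT = 5 * 1000**3  # 5GB


def choose_upload_method(size_bytes: int) -> str:
    """Return which upload method fits a file of the given size."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes <= DIRECT_UPLOAD_LIMIT:
        return "direct"
    if size_bytes <= PRESIGNED_UPLOAD_LIMIT:
        return "presigned"
    raise ValueError("file exceeds the 5GB presigned upload limit")


def choose_for_path(path: str) -> str:
    """Convenience wrapper that reads the file size from disk."""
    return choose_upload_method(os.path.getsize(path))
```

Calling `choose_for_path("large_document.pdf")` would then tell a script whether to follow the steps on this page or use the direct upload endpoint.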

***

## How It Works

```mermaid theme={null}
sequenceDiagram
    participant You
    participant Reducto API
    participant Cloud Storage
    
    You->>Reducto API: 1. Request presigned URL
    Reducto API-->>You: file_id + presigned_url
    You->>Cloud Storage: 2. Upload file to presigned URL
    Cloud Storage-->>You: 200 OK
    You->>Reducto API: 3. Use file_id with Parse/Split/Extract
```

1. **Request a presigned URL** from Reducto (no file attached)
2. **Upload your file** directly to cloud storage using the presigned URL
3. **Use the file\_id** with Parse, Split, or Extract endpoints

***

## Step 1: Request a Presigned URL

Call the upload endpoint *without* attaching a file:

<CodeGroup>
  ```python Python theme={null}
  import os
  import requests

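  response = requests.post(
      "https://platform.reducto.ai/upload",
      # os.environ[...] raises KeyError if the key is unset, instead of sending "Bearer None"
      headers={"Authorization": f"Bearer {os.environ['REDUCTO_API_KEY']}"}
  )
  response.raise_for_status()  # Fail fast on auth or server errors

  data = response.json()
  file_id = data["file_id"]              # Pass this to Parse/Split/Extract in Step 3
  presigned_url = data["presigned_url"]  # Upload target for Step 2 only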

  print(f"File ID: {file_id}")
  print(f"Presigned URL: {presigned_url[:80]}...")
  ```

  ```javascript Node.js theme={null}
  const response = await fetch('https://platform.reducto.ai/upload', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.REDUCTO_API_KEY}`,
    },
  });

  const data = await response.json();
  const fileId = data.file_id;
  const presignedUrl = data.presigned_url;

  console.log(`File ID: ${fileId}`);
  console.log(`Presigned URL: ${presignedUrl.slice(0, 80)}...`);
  ```

  ```go Go theme={null}
  import (
      "encoding/json"
      "fmt"
      "log"
      "net/http"
      "os"
  )

  req, err := http.NewRequest("POST", "https://platform.reducto.ai/upload", nil)
  if err != nil {
      log.Fatal(err)
  }
  req.Header.Set("Authorization", "Bearer "+os.Getenv("REDUCTO_API_KEY"))

  // Check the error before touching resp: resp is nil when the request fails
  resp, err := http.DefaultClient.Do(req)
  if err != nil {
      log.Fatal(err)
  }
  defer resp.Body.Close()

  var data struct {
      FileID       string `json:"file_id"`
      PresignedURL string `json:"presigned_url"`
  }
  if err := json.NewDecoder(resp.Body).Decode(&data); err != nil {
      log.Fatal(err)
  }

  fmt.Printf("File ID: %s\n", data.FileID)
  ```

  ```bash cURL theme={null}
  curl -X POST https://platform.reducto.ai/upload \
    -H "Authorization: Bearer $REDUCTO_API_KEY"
  ```
</CodeGroup>

**Response:**

```json theme={null}
{
  "file_id": "reducto://50c07046-3bac-4844-8c4b-d1428ed9c8f4",
  "presigned_url": "https://prod-storage.s3.amazonaws.com/50c07046-3bac-4844-8c4b-d1428ed9c8f4?X-Amz-Algorithm=AWS4-HMAC-SHA256&..."
}
```

<Warning>
  **Save the `file_id` now.** You'll need it in Step 3. The presigned URL is only for uploading — you can't use it to process the document.
</Warning>
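
A small check right after Step 1 can catch a malformed response before a multi-gigabyte upload starts. This is a sketch under stated assumptions: `parse_upload_response` and `UploadTarget` are hypothetical names, and the checks rely only on the two fields and the `reducto://` prefix shown on this page. The `https://` check is an assumption based on the sample response.

```python
from typing import NamedTuple


class UploadTarget(NamedTuple):
    file_id: str        # Pass to Parse, Split, or Extract
    presigned_url: str  # Only valid as the PUT destination


def parse_upload_response(data: dict) -> UploadTarget:
    """Validate the Step 1 JSON body and return its two fields."""
    file_id = data.get("file_id")
    presigned_url = data.get("presigned_url")
    if not isinstance(file_id, str) or not file_id.startswith("reducto://"):
        raise ValueError(f"unexpected file_id: {file_id!r}")
    # Assumption: presigned URLs are always served over HTTPS, as in the sample.
    if not isinstance(presigned_url, str) or not presigned_url.startswith("https://"):
        raise ValueError(f"unexpected presigned_url: {presigned_url!r}")
    return UploadTarget(file_id, presigned_url)
```

Keeping the two values in one named tuple makes it harder to pass the presigned URL where the `file_id` is expected.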

***

## Step 2: Upload to Presigned URL

Upload your file using a PUT request to the presigned URL:

<CodeGroup>
  ```python Python theme={null}
  import requests

  with open("large_document.pdf", "rb") as f:
      response = requests.put(presigned_url, data=f)

  if response.status_code == 200:
      print("Upload successful!")
  ```

  ```javascript Node.js theme={null}
  import fs from 'fs';

  const fileBuffer = fs.readFileSync('large_document.pdf');

  const response = await fetch(presignedUrl, {
    method: 'PUT',
    body: fileBuffer,
  });

  if (response.ok) {
    console.log('Upload successful!');
  }
  ```

  ```go Go theme={null}
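  import (
      "bytes"
      "fmt"
      "log"
      "net/http"
      "os"
  )

  // os.ReadFile replaces the deprecated ioutil.ReadFile; it loads the whole file into memory
  fileBytes, err := os.ReadFile("large_document.pdf")
  if err != nil {
      log.Fatal(err)
  }

  // data.PresignedURL comes from the Step 1 response
  req, err := http.NewRequest("PUT", data.PresignedURL, bytes.NewReader(fileBytes))
  if err != nil {
      log.Fatal(err)
  }

  resp, err := http.DefaultClient.Do(req)
  if err != nil {
      log.Fatal(err)
  }
  defer resp.Body.Close()
  if resp.StatusCode == http.StatusOK {
      fmt.Println("Upload successful!")
  }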
  ```

  ```bash cURL theme={null}
  curl -X PUT "$PRESIGNED_URL" -T large_document.pdf
  ```
</CodeGroup>

<Tip>
  **No Content-Type header needed.** The presigned URL accepts the file as-is, so you can omit the header.
</Tip>

***

## Step 3: Process with Parse, Split, or Extract

Use the `file_id` from Step 1 (not the presigned URL) with any Reducto endpoint:

<CodeGroup>
  ```python Python theme={null}
  from reducto import Reducto

  client = Reducto()

  # Use the file_id from Step 1
  result = client.parse.run(input=file_id)

  print(f"Processed {result.usage.num_pages} pages")
  ```

  ```javascript Node.js theme={null}
  import Reducto from 'reductoai';

  const client = new Reducto();

  // Use the fileId from Step 1
  const result = await client.parse.run({ input: fileId });

  console.log(`Processed ${result.usage.num_pages} pages`);
  ```

  ```go Go theme={null}
  import (
      "context"
      "fmt"
      "log"
      "os"

      reducto "github.com/reductoai/reducto-go-sdk"
      "github.com/reductoai/reducto-go-sdk/option"
      "github.com/reductoai/reducto-go-sdk/shared"
  )

  client := reducto.NewClient(option.WithAPIKey(os.Getenv("REDUCTO_API_KEY")))

  // Use the FileID from Step 1
  result, err := client.Parse.Run(context.Background(), reducto.ParseRunParams{
      ParseConfig: reducto.ParseConfigParam{
          DocumentURL: reducto.F[reducto.ParseConfigDocumentURLUnionParam](
              shared.UnionString(data.FileID),  // file_id from Step 1
          ),
      },
  })
  if err != nil {
      log.Fatal(err)
  }

  fmt.Printf("Processed %d pages\n", result.Usage.NumPages)
  ```

  ```bash cURL theme={null}
  # Use the FILE_ID from Step 1
  curl -X POST https://platform.reducto.ai/parse \
    -H "Authorization: Bearer $REDUCTO_API_KEY" \
    -H "Content-Type: application/json" \
    -d "{\"input\": \"$FILE_ID\"}"
  ```
</CodeGroup>

***

## Complete Example

Here's the full workflow in one script:

<CodeGroup>
  ```python Python theme={null}
  import os
  import requests
  from reducto import Reducto

  # Step 1: Get presigned URL
  response = requests.post(
      "https://platform.reducto.ai/upload",
      headers={"Authorization": f"Bearer {os.environ['REDUCTO_API_KEY']}"}
  )
  response.raise_for_status()
  data = response.json()
  file_id = data["file_id"]
  presigned_url = data["presigned_url"]

  # Step 2: Upload to presigned URL
  with open("large_document.pdf", "rb") as f:
      upload = requests.put(presigned_url, data=f)
  upload.raise_for_status()  # Don't process if the upload failed

  # Step 3: Process with Reducto
  client = Reducto()
  result = client.parse.run(input=file_id)

  print(f"Successfully processed {result.usage.num_pages} pages")
  for chunk in result.result.chunks:
      print(chunk.content[:200])
  ```

  ```javascript Node.js theme={null}
  import Reducto from 'reductoai';
  import fs from 'fs';

  // Step 1: Get presigned URL
  const uploadResponse = await fetch('https://platform.reducto.ai/upload', {
    method: 'POST',
    headers: { 'Authorization': `Bearer ${process.env.REDUCTO_API_KEY}` },
  });
  const { file_id: fileId, presigned_url: presignedUrl } = await uploadResponse.json();

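  // Step 2: Upload to presigned URL
  const putResponse = await fetch(presignedUrl, {
    method: 'PUT',
    body: fs.readFileSync('large_document.pdf'),
  });
  if (!putResponse.ok) {
    throw new Error(`Upload failed with status ${putResponse.status}`);
  }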

  // Step 3: Process with Reducto
  const client = new Reducto();
  const result = await client.parse.run({ input: fileId });

  console.log(`Successfully processed ${result.usage.num_pages} pages`);
  ```

  ```bash cURL theme={null}
  #!/bin/bash

  # Step 1: Get presigned URL
  UPLOAD_RESPONSE=$(curl -s -X POST https://platform.reducto.ai/upload \
    -H "Authorization: Bearer $REDUCTO_API_KEY")

  FILE_ID=$(echo "$UPLOAD_RESPONSE" | jq -r '.file_id')
  PRESIGNED_URL=$(echo "$UPLOAD_RESPONSE" | jq -r '.presigned_url')

  # Step 2: Upload to presigned URL
  curl -X PUT "$PRESIGNED_URL" -T large_document.pdf

  # Step 3: Process with Reducto
  curl -X POST https://platform.reducto.ai/parse \
    -H "Authorization: Bearer $REDUCTO_API_KEY" \
    -H "Content-Type: application/json" \
    -d "{\"input\": \"$FILE_ID\"}"
  ```
</CodeGroup>

***

## Troubleshooting

<AccordionGroup>
  <Accordion title="403 Forbidden on presigned URL">
    **Cause:** The presigned URL has expired.

    **Fix:** Presigned URLs expire after a short time (typically 1 hour). Request a new presigned URL and try again.
  </Accordion>

  <Accordion title="Upload succeeds but Parse fails">
    **Cause:** You might be passing the `presigned_url` instead of the `file_id`.

    **Fix:** Always use the `file_id` (starts with `reducto://`) with Parse, Split, or Extract — not the presigned URL.
  </Accordion>

  <Accordion title="Timeout during upload">
    **Cause:** Large files on slow connections can time out.

    **Fix:**

    * Use a wired connection if possible
    * Consider chunked/multipart upload for files >1GB
    * Implement retry logic with exponential backoff
  </Accordion>

  <Accordion title="Unexpected upload errors">
    **Cause:** Using incompatible upload methods or headers.

    **Fix:**

    * Don't include a Content-Type header — presigned URLs don't require it
    * For cURL, use `-T filename` instead of `--data-binary @filename`
    * In Go, use `bytes.NewReader()` to ensure proper Content-Length handling
  </Accordion>
</AccordionGroup>
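
The retry advice for upload timeouts can be sketched as a small wrapper. This is an illustration rather than part of the Reducto SDK: `retry_with_backoff` is a hypothetical helper, and the attempt count, base delay, and 10% jitter are arbitrary choices.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation until it succeeds or max_attempts is reached.

    After the n-th failure (counting from zero) this waits
    base_delay * 2**n seconds plus up to 10% random jitter.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:  # Broad on purpose for the sketch; narrow this in real code
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the last error
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
    raise AssertionError("unreachable")
```

With `requests`, the operation passed in would reopen `large_document.pdf` on each attempt so the body is read from the start, call `requests.put(presigned_url, data=f)`, and then call `raise_for_status()`. Retrying does not help with a 403 from an expired URL; request a new presigned URL in that case.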

***

## Related

<CardGroup cols={2}>
  <Card title="Direct Upload" icon="file-arrow-up" href="/upload">
    For files under 100MB — simpler, one-step upload.
  </Card>

  <Card title="Parse" icon="file-lines" href="/parse">
    Extract text, tables, and figures from uploaded documents.
  </Card>

  <Card title="Batch Processing" icon="layer-group" href="/workflows/batch-processing">
    Process many large files in parallel.
  </Card>

  <Card title="Async Processing" icon="clock" href="/workflows/async-overview">
    Use webhooks for long-running jobs.
  </Card>
</CardGroup>
