The upload method uploads files to Reducto’s servers and returns a file reference that you can use with other endpoints.

Basic Usage

from pathlib import Path
from reducto import Reducto

client = Reducto()

# Upload a file
upload = client.upload(file=Path("document.pdf"))

# Use the file_id in other operations
print(upload.file_id)  # "reducto://abc123def456.pdf"

Method Signature

def upload(
    file: Path | bytes | tuple[str, bytes, str] | BinaryIO,
    extension: str | None = None
) -> Upload

Parameters

  • file (Path | bytes | tuple | BinaryIO, required): The file to upload. Can be a Path object, a file-like object, bytes, or a tuple of (filename, contents, media_type).
  • extension (str | None, optional): Override the file extension (e.g., ".pdf"). Useful when uploading bytes without filename context.

Returns

Upload with the following fields:
  • file_id (str): The file reference to use with other endpoints (format: reducto://...)
  • presigned_url (str | None): A presigned URL for the uploaded file (if applicable)
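
To illustrate the file_id format described above, the following sketch checks whether a string looks like a Reducto file reference before it is passed to another endpoint. The helper name is_reducto_file_id is hypothetical and not part of the SDK; it assumes only the reducto:// prefix shown on this page.

```python
REDUCTO_SCHEME = "reducto://"  # prefix shown in the Returns section above


def is_reducto_file_id(value: str) -> bool:
    """Return True if value starts with reducto:// and has a non-empty remainder."""
    return value.startswith(REDUCTO_SCHEME) and len(value) > len(REDUCTO_SCHEME)


print(is_reducto_file_id("reducto://abc123def456.pdf"))   # True
print(is_reducto_file_id("https://example.com/doc.pdf"))  # False
```

A check like this can help distinguish uploaded file references from plain URLs when both are accepted as input.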

Upload Options

From File Path

The most common way to upload. Use pathlib.Path:
from pathlib import Path

upload = client.upload(file=Path("invoice.pdf"))

From File Object

with open("document.pdf", "rb") as f:
    upload = client.upload(file=f)

From Bytes

When uploading raw bytes, use a tuple to provide filename and content type:
with open("document.pdf", "rb") as f:
    file_bytes = f.read()

upload = client.upload(
    file=("document.pdf", file_bytes, "application/pdf")
)
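
When the bytes come from memory rather than a known file, the media type in the tuple can be guessed from the filename. A minimal sketch using the standard-library mimetypes module; the helper name as_upload_tuple is hypothetical, and the fallback to application/octet-stream is an assumption rather than documented SDK behavior.

```python
import mimetypes
from typing import Tuple


def as_upload_tuple(filename: str, data: bytes) -> Tuple[str, bytes, str]:
    """Build a (filename, contents, media_type) tuple for in-memory bytes."""
    media_type, _encoding = mimetypes.guess_type(filename)
    # Assumption: fall back to a generic binary type when the extension is unknown.
    return (filename, data, media_type or "application/octet-stream")


file_tuple = as_upload_tuple("document.pdf", b"%PDF-1.7 ...")
print(file_tuple[2])
# upload = client.upload(file=file_tuple)  # client: a Reducto() instance
```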

With Extension Override

Use extension when the file type cannot be inferred:
upload = client.upload(
    file=Path("data_file"),
    extension=".pdf"
)
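
One way to decide which extension to pass is to inspect the first bytes of the file. The sketch below recognizes a few well-known file signatures and returns an override only when the path has no suffix. The helper names sniff_extension and extension_override are hypothetical, and the signature table is deliberately tiny.

```python
from pathlib import Path
from typing import Optional

# Well-known leading byte signatures ("magic numbers") for a few formats.
SIGNATURES = {
    b"%PDF-": ".pdf",
    b"\x89PNG\r\n\x1a\n": ".png",
    b"\xff\xd8\xff": ".jpg",
}


def sniff_extension(header: bytes) -> Optional[str]:
    """Guess an extension from the first bytes of a file, or None if unknown."""
    for magic, ext in SIGNATURES.items():
        if header.startswith(magic):
            return ext
    return None


def extension_override(path: Path) -> Optional[str]:
    """Return an override only when the path itself has no suffix."""
    if path.suffix:
        return None  # the name already carries an extension
    with path.open("rb") as f:
        return sniff_extension(f.read(8))


print(sniff_extension(b"%PDF-1.7"))  # .pdf
# upload = client.upload(file=path, extension=extension_override(path))
```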

From URL (Presigned URLs)

For large files, generate a presigned URL and pass it directly to the parse/extract endpoints instead of calling upload:
# For files > 100MB, use presigned URLs
# See: /upload/large-files for details

# You can pass presigned URLs directly to parse/extract
result = client.parse.run(input="https://bucket.s3.amazonaws.com/doc.pdf?X-Amz-...")

Async Upload

The async client uses the same interface:
import asyncio
from pathlib import Path
from reducto import AsyncReducto

async def main():
    client = AsyncReducto()
    
    # If you pass a PathLike instance, the file contents will be read asynchronously automatically
    upload = await client.upload(file=Path("document.pdf"))
    return upload

result = asyncio.run(main())
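
Because the async interface returns awaitables, several uploads can run concurrently. The following is a sketch, not an SDK feature: upload_concurrently and its limit parameter are hypothetical, and the upload callable is injected so the example runs without network access.

```python
import asyncio
from pathlib import Path
from typing import Awaitable, Callable, List


async def upload_concurrently(
    upload_fn: Callable[[Path], Awaitable[str]],
    paths: List[Path],
    limit: int = 4,
) -> List[str]:
    """Run upload_fn over paths with at most `limit` uploads in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def one(path: Path) -> str:
        async with semaphore:
            return await upload_fn(path)

    # gather preserves the order of the input paths in its result list.
    return await asyncio.gather(*(one(p) for p in paths))


async def fake_upload(path: Path) -> str:
    """Stand-in for a real upload; returns a made-up file reference."""
    await asyncio.sleep(0)
    return f"reducto://{path.name}"


ids = asyncio.run(upload_concurrently(fake_upload, [Path("a.pdf"), Path("b.pdf")]))
print(ids)
```

With a real client, upload_fn could be a small coroutine that awaits client.upload(file=path) and returns its file_id.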

File Size Limits

  • Direct upload: Up to 100MB
  • Presigned URLs: Up to 5GB
For files larger than 100MB, use presigned URLs instead of the upload endpoint.
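
The limits above can be turned into a simple routing check before any upload starts. This is a sketch under stated assumptions: the page does not say whether "100MB" and "5GB" are decimal or binary units, so the constants below assume binary units, and upload_route is a hypothetical helper name.

```python
DIRECT_UPLOAD_LIMIT = 100 * 1024 * 1024    # assumption: "100MB" read as 100 MiB
PRESIGNED_LIMIT = 5 * 1024 * 1024 * 1024   # assumption: "5GB" read as 5 GiB


def upload_route(size_bytes: int) -> str:
    """Return "direct" or "presigned" for a file of the given size."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes <= DIRECT_UPLOAD_LIMIT:
        return "direct"
    if size_bytes <= PRESIGNED_LIMIT:
        return "presigned"
    raise ValueError("file exceeds the presigned URL size limit")


print(upload_route(50 * 1024 * 1024))    # direct
print(upload_route(200 * 1024 * 1024))   # presigned
```

In practice the size would typically come from Path("document.pdf").stat().st_size.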

Large File Upload Guide

Complete guide to uploading large files using presigned URLs.

Supported File Types

The SDK accepts any file type that Reducto supports. Common formats include:
  • Documents: PDF, DOCX, DOC, RTF
  • Images: PNG, JPEG, TIFF, BMP
  • Spreadsheets: XLSX, XLS, CSV
  • Presentations: PPTX, PPT
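
A local pre-check against the common formats listed above can catch obvious mistakes before an upload. The sketch below is illustrative only: is_common_format is a hypothetical helper, the .jpg and .tif spellings are assumed variants of JPEG and TIFF, and the set does not cover every supported format.

```python
from pathlib import Path

# Extensions derived from the common formats listed above. The .jpg and .tif
# spellings are assumed variants; the set is not exhaustive.
COMMON_EXTENSIONS = {
    ".pdf", ".docx", ".doc", ".rtf",
    ".png", ".jpeg", ".jpg", ".tiff", ".tif", ".bmp",
    ".xlsx", ".xls", ".csv",
    ".pptx", ".ppt",
}


def is_common_format(path: Path) -> bool:
    """Return True if the path's suffix is one of the common formats."""
    return path.suffix.lower() in COMMON_EXTENSIONS


print(is_common_format(Path("scan.TIFF")))  # True
print(is_common_format(Path("notes.xyz")))  # False
```

A False result does not mean the file is unsupported, only that it is outside the short list on this page.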

Supported File Formats

Complete list of supported file formats.

Error Handling

from pathlib import Path
import reducto
from reducto import Reducto

client = Reducto()

try:
    upload = client.upload(file=Path("document.pdf"))
except reducto.APIConnectionError as e:
    print(f"Connection failed: {e}")
except reducto.APIStatusError as e:
    print(f"Upload failed: {e.status_code} - {e.response}")
Common errors:
  • File not found: The file path doesn’t exist
  • File too large: File exceeds 100MB limit (use presigned URLs)
  • Invalid file type: File format not supported
  • Network error: Connection issues during upload
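
Network errors are often transient, so one option is to retry the upload with exponential backoff. The sketch below is a generic pattern, not SDK behavior: with_retries and its parameters are hypothetical, and the exception types to retry on are passed in by the caller.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def with_retries(
    action: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call action, retrying on the given exception types with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")
```

With the SDK, a call might look like with_retries(lambda: client.upload(file=Path("document.pdf")), retry_on=(reducto.APIConnectionError,)). The SDK may already retry some failures on its own, so it is worth checking before layering another retry on top.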

Examples

Upload and Parse

from pathlib import Path
from reducto import Reducto

client = Reducto()

# Upload
upload = client.upload(file=Path("invoice.pdf"))

# Parse immediately
result = client.parse.run(input=upload.file_id)

Batch Upload

from pathlib import Path
from reducto import Reducto

client = Reducto()
files = ["doc1.pdf", "doc2.pdf", "doc3.pdf"]

uploads = []
for file_path in files:
    upload = client.upload(file=Path(file_path))
    uploads.append(upload)
    print(f"Uploaded {file_path}: {upload.file_id}")
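
The loop above stops at the first failed upload. A variant that records failures per file and keeps going might look like the sketch below. The helper upload_all is hypothetical, and the upload callable is injected so the example does not depend on a live client.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable, Tuple


def upload_all(
    upload_fn: Callable[[Path], str],
    paths: Iterable[Path],
) -> Tuple[Dict[Path, str], Dict[Path, Exception]]:
    """Upload each path, returning (file IDs by path, errors by path)."""
    succeeded: Dict[Path, str] = {}
    failed: Dict[Path, Exception] = {}
    for path in paths:
        try:
            succeeded[path] = upload_fn(path)
        except Exception as exc:  # in real code, catch the SDK's error types
            failed[path] = exc
    return succeeded, failed
```

With a real client, upload_fn could be lambda p: client.upload(file=p).file_id, and the failed mapping can be logged or retried afterwards.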

Upload with Extension Override

from pathlib import Path
from reducto import Reducto

client = Reducto()

# Upload with explicit extension
upload = client.upload(
    file=Path("document_without_extension"),
    extension=".pdf"
)

print(upload.file_id)  # "reducto://..."

Best Practices

Use Path Objects

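Pass pathlib.Path objects rather than plain string paths. The method signature above lists Path, not str, among the accepted types, and Path also handles path separators consistently across platforms.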

Handle Large Files

For files > 100MB, use presigned URLs instead of direct upload.

Reuse File IDs

File IDs can be reused across multiple operations without re-uploading.
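
One way to avoid re-uploading identical content is to cache file IDs by a hash of the bytes. This is a sketch under a notable assumption: this page does not say how long a file ID stays valid, so the cache below assumes IDs remain usable for the lifetime of the process. The class name FileIdCache is hypothetical.

```python
import hashlib
from typing import Callable, Dict


class FileIdCache:
    """Remember file IDs by content hash so identical bytes upload only once."""

    def __init__(self, upload_fn: Callable[[str, bytes], str]) -> None:
        self._upload_fn = upload_fn
        self._ids: Dict[str, str] = {}

    def get(self, filename: str, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self._ids:
            # First time these bytes are seen: upload and remember the ID.
            self._ids[digest] = self._upload_fn(filename, data)
        return self._ids[digest]
```

With a real client, upload_fn could wrap client.upload(file=(filename, data, media_type)) and return the resulting file_id.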

Error Handling

Always wrap uploads in try/except blocks for production code.
