Hybrid VPC deployment provides a balance between data sovereignty and operational simplicity. Your data stays in your cloud account while Reducto manages all compute infrastructure.

Overview

In a Hybrid VPC deployment:
  • Data stays in your cloud account: All documents, intermediate artifacts, and results are stored in your storage
  • Compute runs on Reducto’s infrastructure: GPU processing and model inference are handled by Reducto
  • Stateless by design: Objects expire on a lifecycle you configure, so no data persists beyond the retention window you set
  • Multiple storage providers: AWS S3, Azure Blob Storage, Box, and Google Cloud Storage are supported

AWS S3

Cross-account IAM role with ExternalId protection. Optional PrivateLink for private-only API access.

Azure Blob Storage

Cross-tenant service principal access with RBAC. Standard Azure security model.

Box

Box enterprise app with Client Credentials Grant. Ideal for organizations already using Box for document management.

Google Cloud Storage

Cross-project service account access. Standard GCP IAM model.

Key benefits

| Benefit | Description |
| --- | --- |
| Data sovereignty | Storage remains in your cloud account |
| No GPU management | Offload model inference to Reducto’s optimized GPU cluster |
| Cost efficiency | Avoid provisioning and maintaining GPU capacity |
| Fast auto-scaling | Scale to zero when idle, scale up on demand |
| Reduced DevOps burden | Faster iteration, no infrastructure maintenance |

Architecture

Data flow

  1. You upload documents to your storage (or use Reducto’s /upload endpoint)
  2. You call the Reducto API with a reference to your document
  3. Reducto uses your configured credentials to access the document
  4. Processing occurs on Reducto’s compute infrastructure
  5. Results and artifacts are written back to your storage
  6. Objects expire automatically based on your lifecycle configuration
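
Step 6 depends on a lifecycle rule that you define on your own storage. For AWS S3, such a rule could look roughly like the sketch below. The `reducto/` prefix and the one-day retention are illustrative assumptions, not values prescribed by Reducto; the dictionary follows the standard S3 lifecycle configuration shape.

```python
import json


def expiration_rule(prefix: str, days: int) -> dict:
    """Build an S3 lifecycle configuration that expires objects under `prefix`."""
    if days < 1:
        raise ValueError("S3 expiration is expressed in whole days (minimum 1)")
    return {
        "Rules": [
            {
                # Rule ID is arbitrary; derived from the prefix for readability.
                "ID": f"expire-{prefix.strip('/') or 'all'}",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
        ]
    }


# Hypothetical values: adjust the prefix and retention to your own setup.
lifecycle = expiration_rule(prefix="reducto/", days=1)
print(json.dumps(lifecycle, indent=2))
```

If you manage the bucket with boto3, a dictionary of this shape can be passed as `LifecycleConfiguration` to `put_bucket_lifecycle_configuration`. Azure Blob Storage and Google Cloud Storage have their own lifecycle management formats, which are not shown here.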

Choosing a Storage Provider

  • Azure Blob Storage: Best choice if your organization uses Azure. Uses a cross-tenant service principal with RBAC role assignments. A Terraform configuration is provided for automated setup.
  • Box: Best choice if your organization already manages documents in Box. Uses Box enterprise app authentication (Client Credentials Grant). No Terraform provider is available; setup is done through the Box Admin Console.
  • Google Cloud Storage: Best choice if your organization uses GCP. Uses cross-project service account access with IAM bindings. Contact Reducto for setup guidance.

Document Handoff

There are multiple ways to provide documents to Reducto APIs, regardless of which storage provider you use. One option is Reducto’s /upload endpoint, which uploads documents directly and automatically stores the files in your configured storage:
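```python
from pathlib import Path
from reducto import Reducto

client = Reducto(api_key="your-api-key")

# Upload a local file; it is stored in your configured storage.
upload_response = client.upload(file=Path("contract.pdf"))

# Parse the uploaded document by referencing the upload.
result = client.parse.run(document_url=upload_response.url)
```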
For PrivateLink connections (AWS only), specify the region-specific hybrid endpoint as base_url:
  • US: https://hybrid.platform.reducto.ai
  • EU: https://hybrid.eu.platform.reducto.ai
  • AU: https://hybrid.au.platform.reducto.ai
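
One way to keep the endpoint choice in a single place is a small lookup keyed by deployment area, sketched below. The helper name `hybrid_base_url` is hypothetical, and passing `base_url` to the `Reducto` constructor is an assumption based on the sentence above; check the SDK reference for the exact parameter.

```python
# Region-specific hybrid endpoints, copied from the list above.
HYBRID_BASE_URLS = {
    "US": "https://hybrid.platform.reducto.ai",
    "EU": "https://hybrid.eu.platform.reducto.ai",
    "AU": "https://hybrid.au.platform.reducto.ai",
}


def hybrid_base_url(area: str) -> str:
    """Return the hybrid endpoint for a deployment area (case-insensitive)."""
    key = area.strip().upper()
    try:
        return HYBRID_BASE_URLS[key]
    except KeyError:
        known = ", ".join(sorted(HYBRID_BASE_URLS))
        raise ValueError(f"unknown deployment area {area!r}; expected one of {known}") from None


# Assumed usage with the SDK (not executed here):
# client = Reducto(api_key="your-api-key", base_url=hybrid_base_url("EU"))
print(hybrid_base_url("eu"))
```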

Integration Contract

After setting up your storage infrastructure, provide the following values to Reducto:
| Value | Description |
| --- | --- |
| bucket_name | S3 bucket name |
| region | AWS region (e.g., us-east-1) |
| role_arn | IAM role ARN for Reducto to assume |
| external_id | ExternalId for secure role assumption |
| privatelink_endpoint_id | VPC Endpoint ID (if using PrivateLink) |
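
Before handing these values over, it can help to collect and sanity-check them in one place. The sketch below is one way to do that; the class name, the validation rules, and every example value are illustrative assumptions rather than a format Reducto requires.

```python
import re
from dataclasses import dataclass
from typing import Optional

# IAM role ARNs look like arn:aws:iam::<12-digit account id>:role/<name>.
ROLE_ARN_RE = re.compile(r"^arn:aws[a-z-]*:iam::\d{12}:role/.+$")


@dataclass(frozen=True)
class AwsIntegrationContract:
    """Values from the table above (hypothetical container, AWS S3 only)."""

    bucket_name: str
    region: str
    role_arn: str
    external_id: str
    privatelink_endpoint_id: Optional[str] = None  # only if using PrivateLink

    def __post_init__(self) -> None:
        if not ROLE_ARN_RE.match(self.role_arn):
            raise ValueError(f"role_arn does not look like an IAM role ARN: {self.role_arn!r}")
        if not self.external_id:
            raise ValueError("external_id must not be empty")
        endpoint = self.privatelink_endpoint_id
        if endpoint is not None and not endpoint.startswith("vpce-"):
            raise ValueError(f"VPC endpoint IDs start with 'vpce-': {endpoint!r}")


# Example values are placeholders, not real resources.
contract = AwsIntegrationContract(
    bucket_name="acme-reducto-hybrid",
    region="us-east-1",
    role_arn="arn:aws:iam::123456789012:role/reducto-hybrid-access",
    external_id="example-external-id",
)
print(contract)
```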

Multi-Region Setup

For organizations needing storage in multiple regions for latency or compliance requirements, see the provider-specific setup guides linked above. Each provider supports region-specific configurations that Reducto routes automatically based on the deployment area (US, EU, AU).

Security

All storage integrations follow least-privilege principles:
  • AWS: ExternalId prevents confused deputy attacks; IAM policy limits access to S3 operations only
  • Azure: RBAC role assignment scoped to the specific storage account/container
  • Box: App access restricted to the configured folder; enterprise admin approval required
  • All providers: Automatic data cleanup via configurable lifecycle policies
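
To illustrate the AWS bullet above: the ExternalId check lives in the trust policy of the role you create, as a condition on `sts:AssumeRole`. The sketch below builds such a policy document. The principal ARN and ExternalId are placeholders; the real principal must come from Reducto, and the actual setup guide may structure the role differently.

```python
import json


def trust_policy(trusted_principal_arn: str, external_id: str) -> dict:
    """Build an IAM trust policy that allows AssumeRole only with a matching ExternalId."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": trusted_principal_arn},
                "Action": "sts:AssumeRole",
                # Without the matching ExternalId the assume-role call is denied,
                # which is what mitigates the confused deputy problem.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


# Placeholder principal and ExternalId, for illustration only.
policy = trust_policy("arn:aws:iam::111122223333:root", "example-external-id")
print(json.dumps(policy, indent=2))
```

The permissions policy attached to the same role would then grant only the S3 actions needed on the specific bucket, in line with the least-privilege principle described above.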