This guide covers setting up Hybrid VPC with Azure Blob Storage as your storage backend. Reducto uses a cross-tenant service principal to read and write documents in your Azure Storage account.

Prerequisites

  • Azure subscription with permissions to create Storage accounts and role assignments
  • Terraform 1.2+ with the AzureRM provider (for automated setup)
  • Values from Reducto (provided during onboarding):
    • Reducto’s Azure AD application ID (for cross-tenant access)
    • Organization ID for configuration

Architecture

Setup

Components provisioned

| Component | Purpose | Required |
|---|---|---|
| Storage Account | Azure Blob Storage with lifecycle policies | Yes |
| Blob Container | Container for documents and artifacts | Yes |
| RBAC Role Assignment | Cross-tenant access for Reducto service principal | Yes |
| Lifecycle Management Rule | Automatic blob expiry (default: 1 day) | Recommended |

Integration Values

After setup, provide these values to Reducto:

| Value | Description | Where to find |
|---|---|---|
| storage_account_name | Storage account name | Azure Portal → Storage Account |
| container_name | Blob container name | Storage Account → Containers |
| connection_string | Storage connection string | Storage Account → Access keys |
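
Before sending the values in the table above to Reducto, it can help to sanity-check them locally. The following is a minimal sketch, not part of Reducto's tooling: the function names and the example values are hypothetical, and the checks encode only Azure's general naming rules for storage accounts and containers plus the usual `Key=Value;` layout of a connection string.

```python
import re

# Azure naming rules (not Reducto-specific):
# storage accounts are 3-24 characters, lowercase letters and digits only.
ACCOUNT_RE = re.compile(r"[a-z0-9]{3,24}")
# containers are 3-63 characters: lowercase letters, digits, and single
# hyphens, starting and ending with a letter or digit.
CONTAINER_RE = re.compile(r"[a-z0-9](?:-?[a-z0-9])+")


def parse_connection_string(conn: str) -> dict:
    """Split 'Key=Value;Key=Value' pairs; only the first '=' separates key from value."""
    pairs = (part.partition("=") for part in conn.split(";") if part)
    return {key: value for key, _, value in pairs}


def check_integration_values(values: dict) -> list:
    """Return a list of problems; an empty list means the values look well-formed."""
    problems = []
    for key in ("storage_account_name", "container_name", "connection_string"):
        if not values.get(key):
            problems.append(f"missing {key}")

    account = values.get("storage_account_name") or ""
    container = values.get("container_name") or ""
    conn = values.get("connection_string") or ""

    if account and not ACCOUNT_RE.fullmatch(account):
        problems.append("storage_account_name: expected 3-24 lowercase letters or digits")
    if container and not (3 <= len(container) <= 63 and CONTAINER_RE.fullmatch(container)):
        problems.append("container_name: expected 3-63 lowercase letters, digits, or single hyphens")
    if conn and account and parse_connection_string(conn).get("AccountName") != account:
        problems.append("connection_string: AccountName does not match storage_account_name")
    return problems


# Hypothetical placeholder values, for illustration only.
example = {
    "storage_account_name": "examplestorage01",
    "container_name": "reducto-documents",
    "connection_string": (
        "DefaultEndpointsProtocol=https;AccountName=examplestorage01;"
        "AccountKey=<redacted>;EndpointSuffix=core.windows.net"
    ),
}
print(check_integration_values(example))  # → []
```

The check is purely syntactic. It does not confirm that the account or container exists, or that the key in the connection string is valid.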

Data Lifecycle

Configure a lifecycle management policy so that blobs are deleted automatically. The rule below deletes block blobs one day after they were last modified:
{
  "rules": [
    {
      "name": "auto-expire",
      "type": "Lifecycle",
      "definition": {
        "actions": {
          "baseBlob": {
            "delete": {
              "daysAfterModificationGreaterThan": 1
            }
          }
        },
        "filters": {
          "blobTypes": ["blockBlob"]
        }
      }
    }
  ]
}
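
If the one-day default does not suit your retention requirements, the same policy can be generated with a different expiry. The sketch below is an illustration under assumptions: `build_expiry_policy` is a hypothetical helper, and the optional `prefixMatch` filter (a standard lifecycle filter field whose entries must begin with the container name) is not part of the policy shown above.

```python
import json


def build_expiry_policy(days: int = 1, prefix: str = "") -> dict:
    """Build a lifecycle policy that deletes block blobs `days` after last modification."""
    if not isinstance(days, int) or days < 0:
        raise ValueError("days must be a non-negative integer")

    filters = {"blobTypes": ["blockBlob"]}
    if prefix:
        # Optional: limit the rule to blobs under "<container>/<path>".
        filters["prefixMatch"] = [prefix]

    return {
        "rules": [
            {
                "name": "auto-expire",
                "type": "Lifecycle",
                "definition": {
                    "actions": {
                        "baseBlob": {
                            "delete": {"daysAfterModificationGreaterThan": days}
                        }
                    },
                    "filters": filters,
                },
            }
        ]
    }


# Write a 7-day variant to policy.json (file name chosen for illustration).
with open("policy.json", "w") as fh:
    json.dump(build_expiry_policy(days=7), fh, indent=2)
```

One way to apply the resulting file is `az storage account management-policy create --account-name <account> --resource-group <resource-group> --policy @policy.json`; the Terraform setup may manage the same rule for you, in which case change it there instead.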

Security

  • Cross-tenant access: Reducto’s service principal is granted only Storage Blob Data Contributor on the specific container
  • No shared keys required: RBAC-based access is more secure than shared key authentication
  • Network restrictions: Optionally restrict access to specific IP ranges or virtual networks
  • Automatic cleanup: Lifecycle management policies ensure no long-term data persistence
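
To illustrate what "only on the specific container" means in practice, the sketch below builds a container-level RBAC scope and the matching Azure CLI arguments. It is an assumption-laden example rather than Reducto's setup procedure: the helper names and all IDs are hypothetical placeholders, and the object ID must be that of the service principal created in your tenant for Reducto's application.

```python
def container_scope(subscription_id: str, resource_group: str,
                    account: str, container: str) -> str:
    """Return the Azure resource ID of a single blob container."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{account}"
        f"/blobServices/default/containers/{container}"
    )


def role_assignment_command(principal_object_id: str, scope: str) -> list:
    """Return `az` arguments that grant Storage Blob Data Contributor at `scope`."""
    return [
        "az", "role", "assignment", "create",
        "--assignee-object-id", principal_object_id,
        "--assignee-principal-type", "ServicePrincipal",
        "--role", "Storage Blob Data Contributor",
        "--scope", scope,
    ]


# Hypothetical placeholder values, for illustration only.
scope = container_scope(
    subscription_id="00000000-0000-0000-0000-000000000000",
    resource_group="example-rg",
    account="examplestorage01",
    container="reducto-documents",
)
print(" ".join(role_assignment_command("11111111-1111-1111-1111-111111111111", scope)))
```

Scoping the assignment to the container, rather than to the storage account or resource group, keeps the service principal from reading or writing any other container in the account.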