# Hybrid VPC — Azure Blob Storage

> Set up Hybrid VPC with Azure Blob Storage for data sovereignty

This guide covers setting up Hybrid VPC with Azure Blob Storage as your storage backend. Reducto uses a cross-tenant service principal to read and write documents in your Azure Storage account.

## Prerequisites

* **Azure subscription** with permissions to create Storage accounts and role assignments
* **Terraform 1.2+** with the AzureRM provider (for automated setup)
* **Values from Reducto** (provided during onboarding):
  * Reducto's Microsoft Entra ID (formerly Azure AD) application ID and the object ID of its service principal (for cross-tenant access)
  * Organization ID for configuration

## Architecture

```mermaid theme={null}
flowchart LR
    subgraph customer["Customer Azure Subscription"]
        direction TB
        blob["Azure Blob Storage<br/>(documents, artifacts)"]
        rbac["RBAC Role Assignment<br/>(Storage Blob Data Contributor)"]
        rbac --> blob
    end

    subgraph reducto["Reducto Infrastructure"]
        direction TB
        sp["Service Principal<br/>(multi-tenant app)"]
        workers["Compute Workers"]
        sp --> workers
    end

    workers <--> blob
```

## Setup

<Tabs>
  <Tab title="Terraform (recommended)">
    Reducto provides a Terraform example for Azure setup in the [hybrid infrastructure repository](https://github.com/reductoai-collab/reducto-hybrid-infra).

    <Steps>
      <Step title="Clone the infrastructure repository">
        ```bash theme={null}
        git clone https://github.com/reductoai-collab/reducto-hybrid-infra.git
        cd reducto-hybrid-infra/examples/azure
        ```
      </Step>

      <Step title="Create terraform.tfvars">
        ```hcl theme={null}
        name_prefix         = "reducto"
        location            = "eastus"
        resource_group_name = "reducto-hybrid-rg"

        # Provided by Reducto during onboarding
        reducto_service_principal_object_id = "<object-id-from-reducto>"

        tags = {
          Environment = "production"
          Project     = "reducto-hybrid"
        }
        ```
      </Step>

      <Step title="Initialize and apply">
        ```bash theme={null}
        terraform init
        terraform plan
        terraform apply
        ```
      </Step>

      <Step title="Share outputs with Reducto">
        ```bash theme={null}
        terraform output integration_values
        ```

        Provide the output values (storage account name, container name, connection string) to your Reducto team over a secure channel, since the connection string is a credential.
      </Step>
    </Steps>
  </Tab>

  <Tab title="Azure Portal (manual)">
    <Steps>
      <Step title="Create a Storage Account">
        1. Go to **Azure Portal** → **Storage Accounts** → **Create**
        2. Configure:
           * **Performance**: Standard
           * **Redundancy**: LRS (or your preferred level)
           * **Hierarchical namespace**: enable only if you need it
        3. Under **Networking**, set **Public network access** to your preference
        4. After the account is created, open **Data management** → **Lifecycle management** and add a rule (recommended: 1-day expiry)
      </Step>

      <Step title="Create a Blob Container">
        1. Open the new Storage Account
        2. Go to **Containers** → **+ Container**
        3. Name: `reducto-documents`
        4. **Public access level**: Private
      </Step>

      <Step title="Grant Reducto access">
        1. Go to **Storage Account** → **Access Control (IAM)** → **Add role assignment**
        2. Role: **Storage Blob Data Contributor**
        3. Assign access to: **User, group, or service principal**
        4. Select the Reducto service principal (provided during onboarding)
      </Step>

      <Step title="Generate connection string">
        1. Go to **Storage Account** → **Access keys**
        2. Copy the **Connection string**
        3. Share it with your Reducto team over a secure channel
      </Step>
    </Steps>
  </Tab>
</Tabs>
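
If you prefer scripting to clicking through the portal, the manual steps can be approximated with the Azure CLI. This is a minimal sketch, not Reducto's supported tooling: the resource names, the `SUBSCRIPTION_ID` and `REDUCTO_SP_OBJECT_ID` placeholders, and the `run` dry-run helper are assumptions for illustration. By default the script only prints the commands so you can review them; set `DRY_RUN=0` to execute them after `az login`.

```bash
#!/usr/bin/env bash
set -eu

# Placeholder values (assumptions): replace with your own before running.
SUBSCRIPTION_ID="${SUBSCRIPTION_ID:-00000000-0000-0000-0000-000000000000}"
RESOURCE_GROUP="${RESOURCE_GROUP:-reducto-hybrid-rg}"
LOCATION="${LOCATION:-eastus}"
STORAGE_ACCOUNT="${STORAGE_ACCOUNT:-reductohybrid001}"   # must be globally unique
CONTAINER="${CONTAINER:-reducto-documents}"
REDUCTO_SP_OBJECT_ID="${REDUCTO_SP_OBJECT_ID:-<object-id-from-reducto>}"

# Print each command unless DRY_RUN=0 (printed form drops shell quoting).
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

provision() {
  # Scope the role assignment to the storage account, as in the portal steps.
  scope="/subscriptions/${SUBSCRIPTION_ID}/resourceGroups/${RESOURCE_GROUP}/providers/Microsoft.Storage/storageAccounts/${STORAGE_ACCOUNT}"

  run az group create --name "$RESOURCE_GROUP" --location "$LOCATION"
  run az storage account create --name "$STORAGE_ACCOUNT" \
    --resource-group "$RESOURCE_GROUP" --location "$LOCATION" \
    --sku Standard_LRS --kind StorageV2
  # --auth-mode login uses your own identity, which needs a blob data role.
  run az storage container create --name "$CONTAINER" \
    --account-name "$STORAGE_ACCOUNT" --auth-mode login
  run az role assignment create \
    --assignee-object-id "$REDUCTO_SP_OBJECT_ID" \
    --assignee-principal-type ServicePrincipal \
    --role "Storage Blob Data Contributor" --scope "$scope"
}

provision
```

To limit Reducto's access to a single container, the scope can be extended with `/blobServices/default/containers/<container-name>`.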

### Components provisioned

| Component                 | Purpose                                           | Required    |
| ------------------------- | ------------------------------------------------- | ----------- |
| Storage Account           | Azure Blob Storage with lifecycle policies        | Yes         |
| Blob Container            | Container for documents and artifacts             | Yes         |
| RBAC Role Assignment      | Cross-tenant access for Reducto service principal | Yes         |
| Lifecycle Management Rule | Automatic blob expiry (default: 1 day)            | Recommended |

## Integration Values

After setup, provide these values to Reducto:

| Value                  | Description               | Where to find                  |
| ---------------------- | ------------------------- | ------------------------------ |
| `storage_account_name` | Storage account name      | Azure Portal → Storage Account |
| `container_name`       | Blob container name       | Storage Account → Containers   |
| `connection_string`    | Storage connection string | Storage Account → Access keys  |
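
Before sending these values, it can help to sanity-check the names against Azure's naming rules: storage account names are 3 to 24 lowercase letters and digits, and container names are 3 to 63 characters of lowercase letters, digits, and single hyphens, starting and ending with a letter or digit. The helper functions below are a hypothetical sketch for that check, not part of Reducto's tooling, and the sample names are placeholders.

```bash
#!/usr/bin/env bash
set -eu

# Storage account names: 3-24 characters, lowercase letters and digits only.
is_valid_storage_account_name() {
  printf '%s' "$1" | grep -Eq '^[a-z0-9]{3,24}$'
}

# Container names: 3-63 characters, lowercase letters, digits, and hyphens;
# every hyphen must sit between two letters or digits.
is_valid_container_name() {
  [ "${#1}" -ge 3 ] && [ "${#1}" -le 63 ] &&
    printf '%s' "$1" | grep -Eq '^[a-z0-9]+(-[a-z0-9]+)*$'
}

# Report whether both names pass; returns non-zero if either is invalid.
check_integration_names() {
  if is_valid_storage_account_name "$1" && is_valid_container_name "$2"; then
    echo "ok: storage_account_name=$1 container_name=$2"
  else
    echo "invalid: check storage_account_name and container_name" >&2
    return 1
  fi
}

check_integration_names "reductohybrid001" "reducto-documents"
```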

## Data Lifecycle

Configure a lifecycle management rule to delete blobs automatically. The example below deletes block blobs one day after they were last modified:

```json theme={null}
{
  "rules": [
    {
      "name": "auto-expire",
      "type": "Lifecycle",
      "definition": {
        "actions": {
          "baseBlob": {
            "delete": {
              "daysAfterModificationGreaterThan": 1
            }
          }
        },
        "filters": {
          "blobTypes": ["blockBlob"]
        }
      }
    }
  ]
}
```
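
One way to apply this policy outside the portal is the Azure CLI's `az storage account management-policy create` command. The sketch below writes the policy to a file with a configurable expiry and prints the command to apply it. The `EXPIRY_DAYS` and `POLICY_FILE` variables and the `<storage-account>` and `<resource-group>` placeholders are illustrative assumptions.

```bash
#!/usr/bin/env bash
set -eu

EXPIRY_DAYS="${EXPIRY_DAYS:-1}"                      # days after last modification
POLICY_FILE="${POLICY_FILE:-lifecycle-policy.json}"  # hypothetical file name

# Write the rule shown above, with "enabled" made explicit and the expiry
# taken from EXPIRY_DAYS.
write_policy() {
  cat > "$POLICY_FILE" <<EOF
{
  "rules": [
    {
      "name": "auto-expire",
      "enabled": true,
      "type": "Lifecycle",
      "definition": {
        "actions": {
          "baseBlob": {
            "delete": { "daysAfterModificationGreaterThan": ${EXPIRY_DAYS} }
          }
        },
        "filters": { "blobTypes": ["blockBlob"] }
      }
    }
  ]
}
EOF
}

write_policy
echo "Wrote ${POLICY_FILE}. After 'az login', apply it with:"
echo "  az storage account management-policy create --account-name <storage-account> --resource-group <resource-group> --policy @${POLICY_FILE}"
```

Azure evaluates lifecycle rules periodically rather than instantly, so a blob may persist somewhat longer than the configured number of days.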

## Security

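* **Cross-tenant access**: Reducto's service principal is granted only the `Storage Blob Data Contributor` role, scoped to the storage account or, for tighter isolation, to the specific container
* **No shared keys required**: RBAC-based access avoids distributing account keys and is more secure than shared key authentication. A connection string copied from **Access keys** embeds an account key, so treat it as a secret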
* **Network restrictions**: Optionally restrict access to specific IP ranges or virtual networks
* **Automatic cleanup**: Lifecycle management rules delete blobs after the configured expiry, so documents are not retained long term
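
The optional network restriction could look like the following Azure CLI sketch. The IP range is a documentation placeholder (an assumption): the ranges Reducto's workers connect from are not listed in this guide, so confirm them with your Reducto team before denying other traffic. The account and resource group names are also placeholders. As written, the script only prints the commands; set `DRY_RUN=0` to run them.

```bash
#!/usr/bin/env bash
set -eu

RESOURCE_GROUP="${RESOURCE_GROUP:-reducto-hybrid-rg}"
STORAGE_ACCOUNT="${STORAGE_ACCOUNT:-reductohybrid001}"   # placeholder name
ALLOWED_CIDR="${ALLOWED_CIDR:-203.0.113.0/24}"           # placeholder range (TEST-NET-3)

# Print each command unless DRY_RUN=0, so the change can be reviewed first.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

restrict_network() {
  # Allow the trusted range first, then deny all other traffic by default.
  run az storage account network-rule add \
    --resource-group "$RESOURCE_GROUP" --account-name "$STORAGE_ACCOUNT" \
    --ip-address "$ALLOWED_CIDR"
  run az storage account update \
    --resource-group "$RESOURCE_GROUP" --name "$STORAGE_ACCOUNT" \
    --default-action Deny
}

restrict_network
```

Adding the allow rule before switching the default action to `Deny` avoids a window in which the trusted range is locked out.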
