Connection architecture
Each Reducto pod (HTTP server or worker) maintains its own SQLAlchemy connection pool to PostgreSQL. Connections are not shared across pods.

| Pod type | Processes per pod | Pool size per process | Max connections per pod |
|---|---|---|---|
| HTTP | 8 gunicorn workers (default) | pool_size + max_overflow | 8 × (4 + 8) = 96 |
| Worker | 1 | pool_size + max_overflow | 4 + 8 = 12 |

The number of gunicorn workers per HTTP pod is set by the HTTP_WORKERS environment variable (default: 8).
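
The per-pod arithmetic in the table above can be sketched in a few lines of Python. This is only an illustration: the names `PoolSettings`, `max_connections_per_pod`, and `estimate_peak_connections` are hypothetical helpers, not part of Reducto, and the pod counts in the usage line are made-up inputs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolSettings:
    """Per-process pool limits; defaults mirror the documented values."""
    pool_size: int = 4      # DB_POOL_SIZE
    max_overflow: int = 8   # DB_MAX_OVERFLOW

    @property
    def max_per_process(self) -> int:
        # A process can hold at most pool_size + max_overflow connections.
        return self.pool_size + self.max_overflow


def max_connections_per_pod(processes: int, pool: PoolSettings) -> int:
    """Upper bound on connections opened by one pod."""
    return processes * pool.max_per_process


def estimate_peak_connections(
    http_pods: int,
    worker_pods: int,
    http_workers: int = 8,              # HTTP_WORKERS
    pool: PoolSettings = PoolSettings(),
) -> int:
    """Worst-case total if every process fills its pool and overflow."""
    http_total = http_pods * max_connections_per_pod(http_workers, pool)
    worker_total = worker_pods * max_connections_per_pod(1, pool)
    return http_total + worker_total


if __name__ == "__main__":
    # Hypothetical cluster: 3 HTTP pods and 10 worker pods.
    print(estimate_peak_connections(http_pods=3, worker_pods=10))  # 3*96 + 10*12 = 408
```

The result is an upper bound on what the application can request, which is useful to compare against the database's max_connections or the pooler's client limit.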
Default pool settings
These are the application-level defaults for on-premise deployments:

| Setting | Default | Description |
|---|---|---|
| DB_POOL_SIZE | 4 | Persistent connections kept open per process |
| DB_MAX_OVERFLOW | 8 | Additional connections created under load (closed when idle) |
| DB_POOL_TIMEOUT | 30 (seconds) | Max time to wait for a connection from the pool |
| DB_POOL_PRE_PING | true | Validate connections before use (handles stale connections) |
| DB_POOL_RECYCLE | -1 (disabled) | Max connection lifetime in seconds before recycling |
| DB_LOCK_TIMEOUT_MS | 15000 | PostgreSQL lock_timeout per transaction |
| DB_STATEMENT_TIMEOUT_MS | 20000 | PostgreSQL statement_timeout per transaction |
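
The pool settings above correspond by name to keyword arguments of SQLAlchemy's `create_engine` (`pool_size`, `max_overflow`, `pool_timeout`, `pool_pre_ping`, `pool_recycle`). The sketch below shows one way such environment variables could be turned into those arguments. How Reducto actually parses them is not documented here, so `pool_kwargs_from_env` and its boolean parsing are assumptions for illustration only.

```python
import os
from typing import Any, Dict, Mapping


def _as_bool(value: str) -> bool:
    """Interpret common truthy strings; anything else is False (assumed convention)."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def pool_kwargs_from_env(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Build SQLAlchemy pool keyword arguments, falling back to the documented defaults."""
    return {
        "pool_size": int(env.get("DB_POOL_SIZE", "4")),
        "max_overflow": int(env.get("DB_MAX_OVERFLOW", "8")),
        "pool_timeout": float(env.get("DB_POOL_TIMEOUT", "30")),    # seconds
        "pool_pre_ping": _as_bool(env.get("DB_POOL_PRE_PING", "true")),
        "pool_recycle": int(env.get("DB_POOL_RECYCLE", "-1")),      # -1 disables recycling
    }


if __name__ == "__main__":
    # With SQLAlchemy installed, these could be passed along as:
    #   engine = sqlalchemy.create_engine(database_url, **pool_kwargs_from_env())
    print(pool_kwargs_from_env({"DB_POOL_RECYCLE": "300"}))
```

Passing an explicit mapping instead of reading `os.environ` directly keeps the function easy to check against different configurations.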
Connection pooling with PgBouncer or RDS Proxy
We strongly recommend running an external connection pooler between Reducto and PostgreSQL. Without one, connection storms during pod scaling (especially KEDA-driven autoscaling) can overwhelm the database.
Azure (built-in PgBouncer)
Azure Database for PostgreSQL Flexible Server includes a built-in PgBouncer, which our Azure on-prem Terraform module enables by default. Connections then go through port 6432 (PgBouncer) instead of 5432 (direct PostgreSQL). No application-level changes are needed.
To verify PgBouncer is active, check the Azure Portal under your PostgreSQL Flexible Server > Server parameters > pgbouncer.enabled.
AWS (RDS Proxy)
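Our AWS on-prem Terraform module provisions an RDS Proxy by default, and the Helm chart automatically uses the pooled database URL.
Estimating total database connections
To estimate your peak connection count, multiply the number of pods of each type by that type's per-pod maximum and add the results: (HTTP pods × HTTP_WORKERS × (DB_POOL_SIZE + DB_MAX_OVERFLOW)) + (worker pods × (DB_POOL_SIZE + DB_MAX_OVERFLOW)). With the defaults, that is 96 connections per HTTP pod and 12 per worker pod.
Right-sizing for your workload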
If you see connection timeout errors or pool exhaustion:
- Increase DB_POOL_SIZE if connections are frequently at capacity during steady state.
- Increase DB_MAX_OVERFLOW if you see spikes during burst traffic.
- Set DB_POOL_RECYCLE to a finite value below the pooler's idle timeout (e.g., 300) if you're behind a pooler that has its own idle timeout. This prevents the application from trying to use connections the pooler has already closed.

When running behind a pooler, keep DB_POOL_PRE_PING set to true. This ensures the application validates connections before use, which is important when the pooler may close idle backend connections.
Timeout tuning
The lock_timeout and statement_timeout values are set per transaction using SET LOCAL, so they work with every connection mode (direct, PgBouncer, RDS Proxy): the setting is scoped to the current transaction and does not leak to other clients that later reuse the same pooled connection.
If you process very large documents (100+ pages) and see timeout errors, you may want to increase DB_LOCK_TIMEOUT_MS and DB_STATEMENT_TIMEOUT_MS.
Keep statement_timeout higher than lock_timeout so that lock contention surfaces as a lock timeout rather than a generic statement timeout.
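
As a rough illustration of the per-transaction approach, the following sketch builds the two `SET LOCAL` statements and enforces the ordering rule described in this section. The helper `timeout_statements` is hypothetical; the application's actual implementation is not shown in this document, and rejecting non-positive values is a choice made for this sketch.

```python
from typing import List


def timeout_statements(
    lock_timeout_ms: int = 15000,        # DB_LOCK_TIMEOUT_MS default
    statement_timeout_ms: int = 20000,   # DB_STATEMENT_TIMEOUT_MS default
) -> List[str]:
    """Return SET LOCAL statements to run at the start of a transaction."""
    # In PostgreSQL a value of 0 disables a timeout; this sketch deliberately
    # requires positive values so both limits stay in force.
    if lock_timeout_ms <= 0 or statement_timeout_ms <= 0:
        raise ValueError("timeouts must be positive")
    # statement_timeout must exceed lock_timeout so that lock contention
    # is reported as a lock timeout rather than a statement timeout.
    if statement_timeout_ms <= lock_timeout_ms:
        raise ValueError("statement_timeout must be higher than lock_timeout")
    return [
        f"SET LOCAL lock_timeout = '{int(lock_timeout_ms)}ms'",
        f"SET LOCAL statement_timeout = '{int(statement_timeout_ms)}ms'",
    ]


if __name__ == "__main__":
    # Hypothetical raised limits for very large documents.
    for sql in timeout_statements(lock_timeout_ms=30000, statement_timeout_ms=60000):
        print(sql)
```

Each statement would be executed inside an open transaction, since `SET LOCAL` only lasts until that transaction commits or rolls back.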