PoCAT Documentation

High Availability Setup

HA design and verification based on replicas, anti-affinity, and RTO/RPO targets.

Last updated: 2026-05-27
Section: Architecture

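High availability (HA) is more than adding replicas: fault isolation, automatic failover, and the recovery time and recovery point objectives (RTO/RPO) have to be designed together. Spread gateway instances across availability zones (AZs) so that no single instance, node, or zone is a single point of failure.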

Production requirement: Production gateways default to 3 replicas (minimum 2), with anti-affinity and load-balancer health checks.

Production minimum baseline

  • 3 gateway replicas with pod/node anti-affinity
  • Automatic failover driven by load-balancer health checks
  • Deployment strategy: rolling update with maxUnavailable: 0
  • Monthly failure simulation that follows the runbook
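
As a rough illustration of the baseline above, the following Python sketch builds a Kubernetes Deployment manifest as a plain dictionary. It is not PoCAT's shipped configuration: the name "gateway", the image reference, and the maxSurge value are placeholders chosen for the example, and only the replica count, anti-affinity, and maxUnavailable: 0 come from the baseline itself.

```python
import json


def build_gateway_deployment(name="gateway",
                             image="registry.example.com/gateway:1.0.0",
                             replicas=3):
    """Return an apps/v1 Deployment manifest as a plain dict.

    The name and image defaults are hypothetical placeholders.
    """
    if replicas < 2:
        # The production requirement states a minimum of 2 replicas.
        raise ValueError("production gateways need at least 2 replicas")

    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "strategy": {
                "type": "RollingUpdate",
                # maxUnavailable: 0 keeps full capacity during a rollout.
                # maxSurge must then be > 0; the value 1 is an assumption.
                "rollingUpdate": {"maxUnavailable": 0, "maxSurge": 1},
            },
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "affinity": {
                        "podAntiAffinity": {
                            # Never place two gateway pods on the same node.
                            "requiredDuringSchedulingIgnoredDuringExecution": [
                                {
                                    "labelSelector": {"matchLabels": labels},
                                    "topologyKey": "kubernetes.io/hostname",
                                }
                            ]
                        }
                    },
                    "containers": [{"name": name, "image": image}],
                },
            },
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(build_gateway_deployment(), indent=2))
```

One design consequence worth checking in your own cluster: with required per-node anti-affinity and maxUnavailable: 0, a rollout schedules the surge pod before removing an old one, so it needs at least one more schedulable node than the replica count.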

Enable high availability

  1. Deploy the infrastructure across at least two AZs.
  2. Set replicas ≥ 3 and apply anti-affinity on Kubernetes/Swarm.
  3. Define separate readiness and liveness probes so that instances that are not ready receive no traffic.
  4. Centralize configuration and secrets in ConfigMap/Secret or an external vault.
  5. Run instance and AZ failure tests monthly and record the measured RTO.
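
Steps 1 and 3 can be sketched for Kubernetes as a pod-spec fragment, again as a Python dictionary. This is a minimal sketch under stated assumptions: the endpoint paths /readyz and /livez, port 8080, and all probe timings are hypothetical and must be replaced with whatever your gateway actually exposes.

```python
def gateway_pod_spec_fragment(app_label="gateway", port=8080):
    """Return pod-spec fields for zone spreading and separate probes.

    Merge the result into the pod template of a Deployment. The probe
    paths, port, and timings are illustrative assumptions.
    """
    return {
        # Step 1: spread replicas evenly across availability zones.
        "topologySpreadConstraints": [
            {
                "maxSkew": 1,
                "topologyKey": "topology.kubernetes.io/zone",
                "whenUnsatisfiable": "DoNotSchedule",
                "labelSelector": {"matchLabels": {"app": app_label}},
            }
        ],
        "containers": [
            {
                "name": app_label,
                # Step 3a: readiness decides whether the pod gets traffic.
                "readinessProbe": {
                    "httpGet": {"path": "/readyz", "port": port},
                    "periodSeconds": 5,
                    "failureThreshold": 2,
                },
                # Step 3b: liveness decides whether the container restarts.
                "livenessProbe": {
                    "httpGet": {"path": "/livez", "port": port},
                    "initialDelaySeconds": 10,
                    "periodSeconds": 10,
                    "failureThreshold": 3,
                },
            }
        ],
    }
```

Keeping the two probes on different endpoints matters because they trigger different actions: a failing readiness probe only removes the pod from Service endpoints, while a failing liveness probe restarts the container. A gateway that is temporarily unable to serve should fail readiness without being killed.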

Operations checklist

Item                        Frequency    Pass criteria
Instance failure injection  Weekly       Traffic stays up
AZ failure simulation       Monthly      RTO within 10 min
Rollback rehearsal          Pre-deploy   Healthy within 5 min
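
To record RTO consistently across drills, the measurement can be automated. The sketch below assumes you already collect timestamped health-check results during a drill; the function names, the drill keys, and the sample format are hypothetical, and only the 10-minute and 5-minute thresholds come from the checklist above.

```python
from datetime import datetime, timedelta

# Thresholds taken from the operations checklist; the keys are illustrative.
PASS_CRITERIA = {
    "az_failure": timedelta(minutes=10),
    "rollback": timedelta(minutes=5),
}


def measure_rto(samples):
    """Return the duration of the first outage in the samples.

    samples is a time-ordered list of (timestamp, healthy) tuples.
    Returns timedelta(0) if no sample was unhealthy, and None if the
    service went unhealthy and never recovered within the samples.
    """
    outage_start = None
    for timestamp, healthy in samples:
        if not healthy and outage_start is None:
            outage_start = timestamp          # first failed check
        elif healthy and outage_start is not None:
            return timestamp - outage_start   # first check after recovery
    if outage_start is None:
        return timedelta(0)
    return None


def drill_passes(drill, rto):
    """Check a measured RTO against the checklist threshold for a drill."""
    return rto is not None and rto <= PASS_CRITERIA[drill]


if __name__ == "__main__":
    start = datetime(2026, 1, 1, 12, 0, 0)
    samples = [
        (start, True),
        (start + timedelta(seconds=30), False),
        (start + timedelta(minutes=4), False),
        (start + timedelta(minutes=7, seconds=30), True),
    ]
    rto = measure_rto(samples)
    print(rto, drill_passes("az_failure", rto), drill_passes("rollback", rto))
```

The measured value is only as precise as the sampling interval, since the outage is counted from the first failed check to the first successful one, so a short probe interval during drills gives a more trustworthy RTO record.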