PostgreSQL High Availability

PostgreSQL on Thalassa Cloud DBaaS can run with multiple instances for high availability. Replicas are spread across availability zones, and the platform handles failover when the primary instance fails as well as planned switchover. This guide explains how HA works and how to size it.

How high availability works

  • Single instance — With one instance, your data is still durable: block storage is replicated across all availability zones. A single instance does not provide automatic failover; if the instance fails, recovery depends on an automatic restart of the instance or, failing that, on intervention by support.
  • Multiple instances — When you set instance count > 1, the DBaaS configures replica instances. One instance is the primary; the others are replicas that stay in sync via PostgreSQL streaming replication.
  • Availability zone spread — Instances are automatically spread across all availability zones in the region. That limits the impact of an AZ failure and keeps replicas on different failure domains.
  • Failover and switchover — If the primary instance fails, the platform detects the failure and promotes an instance in another availability zone to primary (failover). Planned promotion of a replica to primary is a switchover. Applications should reconnect to the cluster endpoint; the platform routes traffic to the current primary.
  • Availability zone and host failure — If an availability zone fails, any instance running in that AZ will be moved to a different zone by the platform. The same applies to physical hosts: if the host an instance runs on fails, the instance will be moved to a different host. This relocation is handled automatically.
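During a failover or switchover, existing connections to the old primary drop, and new connections to the cluster endpoint may fail briefly until the new primary is ready. The following is a minimal sketch of client-side reconnect logic with exponential backoff. The function name, the default retry values, and the use of ConnectionError are assumptions made for illustration; a real PostgreSQL driver raises its own exception types, which you would pass in through retry_on.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def connect_with_retry(
    connect: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call connect() until it succeeds, backing off exponentially between tries.

    `connect` is a hypothetical callable that opens a connection to the
    cluster endpoint (for example, a thin wrapper around your driver).
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error to the caller
            sleep(delay)
            delay = min(delay * 2, max_delay)  # double the wait, up to a cap
    raise AssertionError("unreachable")
```

Because the platform routes the cluster endpoint to the current primary, the client only needs to retry against the same endpoint; it does not need to know which instance was promoted. If you want to confirm that a new connection reached a primary, PostgreSQL's built-in pg_is_in_recovery() returns false on a primary and true on a replica.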

Sizing HA: how many instances?

  • Two instances — Running 2 instances is usually enough for high availability. It covers failure of one instance and reduces downtime during maintenance (e.g. upgrades, node replacement). One replica is sufficient for automatic failover.
  • More than two — You can run 3 instances (or up to the number of availability zones in your region). We advise no more than 3 instances unless your region has more AZs and you have a specific need. Additional replicas add replication overhead; for most workloads, 2 instances strike a good balance between HA and cost.
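The sizing guidance can be expressed as a small check. This sketch encodes the rule from the summary of this guide (at most 3 instances, or the number of AZs in the region, whichever is lower). The function name and the returned messages are hypothetical and are not part of any Thalassa Cloud API.

```python
def describe_instance_count(instances: int, az_count: int) -> str:
    """Classify a requested instance count against the sizing guidance."""
    if instances < 1 or az_count < 1:
        raise ValueError("instances and az_count must both be at least 1")

    # Assumed limit, taken from the summary: 3 or the AZ count, whichever is lower.
    limit = min(3, az_count)
    if instances > limit:
        raise ValueError(f"{instances} instances exceeds the advised maximum of {limit}")

    if instances == 1:
        return "single instance: durable storage, no automatic failover"
    if instances == 2:
        return "primary + 1 replica: usually enough for high availability"
    return "primary + 2 replicas: extra redundancy, more replication overhead"
```

For example, describe_instance_count(2, 3) reports the recommended two-instance setup, while describe_instance_count(4, 3) raises a ValueError because it exceeds the advised maximum.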

Instance failure and recovery

  • Automatic restart — In most cases, a failed instance restarts automatically. The platform will try to bring the instance back and rejoin it to the cluster.
  • Support — If an instance does not recover automatically, the Thalassa Cloud support team will investigate and recover it when needed. They will contact you using your emergency contact details. Ensure these are up to date in your account so support can reach you for critical issues.

Summary

Goal | Recommendation
HA for failures and maintenance | Run 2 instances (primary + 1 replica).
Maximum instances | 3, or the number of AZs in your region, whichever is lower.
Single instance | Data is still replicated at the storage layer across AZs. There is no automatic failover, but a failed instance restarts automatically in most cases.

For backup and recovery configuration, see Storage and backups.