2024-11-09 Setting Up HA PostgreSQL with repmgr

As mentioned previously, I'd like to setup a database with high availability for a webapp. I will try PostgreSQL first, based on the Nebula overlay network I just set up.

Search "PostgreSQL", "HA" and "automatic failover" and browse through the abundant results. For example this gives some comprehensive introduction. There are three major choices: PAF, Patroni, and repmgr. As the goal is to have some quick hands-on experience on high availability setup for PostgreSQL. Considering only open-source solutions with low resource demand, I will try repmgr as my first pick, also because PAF has the weakness of a single point of failure regarding its monitoring node.

Installation

Look for on-line resources first, and see if there are results that can help us gain insights, for example this and this. As I have used PostgreSQL in many project setups before, so I follow the Official Image site and create the docker-compose.yaml as:
networks:
  mynet:
    name: mynet
    external: true

services:
  postgres:
    restart: unless-stopped
    image: postgres:16
    container_name: postgres
    environment:
      POSTGRES_PASSWORD: mypass
      # POSTGRES_USER: postgres  # default
      # POSTGRES_DB: postgres    # default $POSTGRES_USER
      PGUSER: postgres           # or psql will use 'root'
    networks:
      - mynet
    ports:
      - '5432:5432'
    volumes:
      - "./data:/var/lib/postgresql/data"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready", "-d", "postgres", "-U", "postgres"]
      interval: '10s'
      timeout: '10s'
      retries: 5
      start_period: '20s'

  adminer:
    image: adminer
    restart: always
    ports:
      - 8080:8080

I choose the postgres:16 image instead of postgres:17-alpine because repmgr currently supports up to PostgreSQL 16 and it is easier to install using Debian-based images.

After starting with Docker Compose, we can shell into the postgres container to access the database using psql:

$ docker-compose exec -it postgres bash
abd6bf2309ad:/# psql -U postgres
psql (16.4 (Debian 16.4-1.pgdg120+2))
Type "help" for help.

postgres=#

Or just use psql to connect to the database on the host if you have installed it:

psql -h localhost -U postgres

Either way, you can change the password for postgres, which is the super user for the setup above:

postgres=# \password postgres
Enter new password: <new-password>
postgres=# \q

We can also access the adminer site at http://192.168.x.x:8080 , here assuming the host is on the local LAN:

In actual setup, I may use another port instead of 5432 and skip adminer .

Via the VPN Overlay

Since I have created the mesh VPN 192.168.111.0/24 using Nebula, I'd like to make sure I can connect to the database from other nebula nodes, one of which is behind NAT. However,

psql -h 192.168.111.x -U postgres

has no response. After checking the firewall settings on the machines, it turns out the firewall settings of Nebula I set up before are too strict, so I change the firewall rules in the configuration config.yml:

firewall:
  ...
  inbound:
    # Allow icmp between any nebula hosts
    - port: any
      proto: icmp
      host: any

    - port: any
      proto: tcp
      host: any

After restarting the network, the command above works.

Later I will re-design the Nebula nodes to have proper groups for better firewall setup. I will also bind the database port to the IP 192.168.111.x to make sure all database traffic get through the overlay network.

repmgr

Now come to high availability. I use the same docker-compose.yaml on the nebula nodes – one of which is behind NAT. So assuming we have three nodes which we are going to install PostgreSQL HA using docker-compse.yaml:

node1 192.168.111.1 - primary
node2 192.168.111.2 - standby
node3 192.168.111.3 - standby

Log in to each node and see if you can ping each other via the 192.168.111.0/24 network.

On node1 which we will install the primary database:

mkdir postgres-ha
cd postgres-ha

To get started, we can not use the docker-compose.yaml described above directly, since we need to install repmgr package in the container. Docker Compose will remove the containers and associated anonymous volumes once docker-compose down is executed. Although we can use docker-compose stop to avoid this, but it is not safe. I type docker-compose down too often.

For this, we need to create a image. See this for basic procedure. So first create a Dockerfile in the folder:

FROM postgres:16

RUN apt update -y && \
    apt install  -y \
      iputils-ping net-tools vim \
      postgresql-16-repmgr

and create the docker-compose.yaml file:

networks:
  mynet:
    name: mynet
    external: true

services:
  postgres-ha:
    restart: unless-stopped
    build: .
    container_name: postgres-ha
    environment:
      POSTGRES_PASSWORD: mypass
      PGUSER: postgres
    networks:
      - mynet
    ports:
      - '5432:5432'
    volumes:
      - "./data:/var/lib/postgresql/data"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready", "-d", "postgres", "-U", "postgres"]
      interval: '10s'
      timeout: '10s'
      retries: 5
      start_period: '20s'

Launch the container docker-compose up -d, and shell into it, log in as postgres and create both the role repmgr and database repmgr :

$ docker-compose exec -it postgres-ha bash
root:/# su postgres
postgres:/$ createuser -s repmgr
postgres:/$ createdb repmgr -O repmgr

Edit the postgresql.conf file at /var/lib/postgres/data/ with the following changes:

shared_preload_libraries = 'repmgr'
wal_level = replica
archive_mode = on
archive_command = '/bin/true'
max_wal_senders = 10
max_replication_slots = 10
hot_standby = on
listen_addresses = '*'

Edit pg_hba.conf for the part of entries like this:

# "local" is for Unix domain socket connections only
local   all           all                                       peer
# IPv4 local connections:
host    all           all               127.0.0.1/32            scram-sha-256
# IPv6 local connections:
host    all           all               ::1/128                 scram-sha-256
# Allow replication connections from localhost, by a user with the
# replication privilege.
local   replication   repmgr                                    trust
host    replication   repmgr            127.0.0.1/32            trust
host    replication   repmgr            192.168.111.0/24           trust
local   repmgr        repmgr                                    trust
host    repmgr        repmgr            127.0.0.1/32            trust
host    repmgr        repmgr            192.168.111.0/24           trust
The creation of the role repmgr, database repmgr, and the two configuration files only needs to be done on the primary database. Later they will be "cloned" from other nodes as standbys. The pg_hba.conf file contains specification of client authorization. Later we will see it allows local or remote nodes to connect to the database without password.

Now we can stop the container:

docker-compose down

We need to create another configuration file – repmgr.conf in the /etc folder. Other places are also OK. We just need to make it persist. Create a local etc folder

mkdir etc

and create the repmgr.conf file in it:

node_id=1
node_name=node1
conninfo='host=192.168.111.1 port=5432 user=repmgr dbname=repmgr connect_timeout=2'
data_directory='/var/lib/postgresql/data'
failover=automatic
promote_command='repmgr -f /etc/repmgr.conf standby promote'
follow_command='repmgr -f /etc/repmgr.conf standby follow'
repmgrd_service_start_command='repmgrd -f /etc/repmgr.conf -d'
repmgrd_service_stop_command='kill $(cat /tmp/repmgrd.pid)'

We need to bind this file to /etc/repmgr.conf in the container, so update the docker-compose.yaml :

...
  volumes:
    - "./data:/var/lib/postgresql/data"
    - type: bind
      source: ./etc/repmgr.conf
      target: /etc/repmgr.conf
...

We can start the container now, and shell into it again:

docker-compose up -d
docker-compose exec -it postgres-ha bash

cat /etc/repmgr.conf to see if it is bound correctly, and fix the ownership of /etc/repmgr.conf (may not be necessary):

chown root:root /etc/repmgr.conf
chmod a+r /etc/repmgr.conf

At this stage make sure we can connect to the database with both user and database asrepmgr without password:

root:/# psql -h 192.168.111.1 -U repmgr -d repmgr
psql (16.4 (Debian 16.4-1.pgdg120+2))
Type "help" for help.

repmgr=# \q

Finally we can start the primary server. First log in as postgres:

su postgres

Then run the command:

postgres:/$ repmgr -f /etc/repmgr.conf primary register
INFO: connecting to primary database...
NOTICE: attempting to install extension "repmgr"
NOTICE: "repmgr" extension successfully installed
NOTICE: primary node record (ID: 1) registered

To see if things go well, run the command below to see whether node1 is running as the primary:

postgres:/$ repmgr -f /etc/repmgr.conf cluster show
ID | Name   | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string
...
 1 | node1  | primary | * running |          | default  | 100      | 1        | host=192.168.111.1 port=5432 user=repmgr dbname=repmgr connect_timeout=2

Set up the Standby Database

Now that the primary database is running. Log on to the next node node2 and create the folder:

mkdir postgres-ha
cd postgres-ha

Also copy the Dockerfile, docker-compose.yaml and etc/repmgr.conf files over. However, we need change the docker-compose.yaml

services:
  postgres-ha:
    restart: unless-stopped
    build: .
    entrypoint: 'sleep infinity'
    container_name: postgres-ha
    ...

That is, add an entry entrypoint: 'sleep infinity' to prevent PostgreSQL from running. We should modify etc/repmgr.conf too:

node_id=2
node_name=node2
conninfo='host=192.168.111.2 port=5432 user=repmgr dbname=repmgr connect_timeout=2'
data_directory='/var/lib/postgresql/data'
failover=automatic
promote_command='repmgr -f /etc/repmgr.conf standby promote'
follow_command='repmgr -f /etc/repmgr.conf standby follow'
repmgrd_service_start_command='repmgrd -f /etc/repmgr.conf -d'
repmgrd_service_stop_command='kill $(cat /tmp/repmgrd.pid)'

Now start the container and shell into it:

docker-compose up -d
docker-compose exec -it postgres-ha bash

Make sure PostgreSQL is not running:

root:/# psql
psql: error: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: No such file or directory
	Is the server running locally and accepting connections on that socket?

and that we can connect to the primary database without password:

root:/# psql -h 192.168.111.1 -p 5432 -U repmgr -d repmgr
psql (16.4 (Debian 16.4-1.pgdg120+2))
Type "help" for help.

repmgr=# \q
If you cannot connect, it is probably the settings in pg_hba.conf on the primary database. Check if 192.168.111.0/24 is trusted.

Let's fix the ownership of the data folder /var/lib/postgresql/data first, make sure it is empty, and similarly for /etc/repmgr.conf:

chown postgres:postgres /var/lib/postgresql/data
ls /var/lib/postgresql/data

chown root:root /etc/repmgr.conf
chmod a+r /etc/repmgr.conf

At this point we want to log in as postgres and clone the database from the primary (without local PostgresSQL running):

su postgres

then test if it is ready to clone:

repmgr -h 192.168.111.1 -p 5432 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone --dry-run

If OK, the message looks like:

...
INFO: would execute:
  pg_basebackup -l "repmgr base backup"  -D /var/lib/postgresql/data -h 192.168.111.1 -p 5432 -U repmgr -X stream
INFO: all prerequisites for "standby clone" are met

and we can run the same command without the last --dry-run flag:

postgres:~$ repmgr -h 192.168.111.1 -p 5432 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone
NOTICE: destination directory "/var/lib/postgresql/data" provided
INFO: connecting to source node
DETAIL: connection string is: host=192.168.111.1 port=5432 user=repmgr dbname=repmgr
...
HINT: after starting the server, you need to register this standby with "repmgr standby register"
postgres:~$

At this stage, the database is cloned over. We can stop the container now:

docker-comppose down

Comment out the line entrypoint: 'sleep infinity' in docker-compose.yaml, and restart the container (be sure to add the --build flag to force a rebuild):

docker-compose up -d --build

Shell into the container and you should be able to connect to local database without password:

$ docker-compose exec -it postgres-ha bash
root:/# psql
psql (16.4 (Debian 16.4-1.pgdg120+2))
Type "help" for help.

postgres=# \q
root:/# psql -h 192.168.111.2 -p 5432 -U repmgr -d repmgr
psql (16.4 (Debian 16.4-1.pgdg120+2))
Type "help" for help.

repmgr=# \q

Now we can register to the cluster as a standby:

su postgres

Then

postgres:/$ repmgr -f /etc/repmgr.conf standby register
INFO: connecting to local node "node2" (ID: 2)
INFO: connecting to primary database
WARNING: --upstream-node-id not supplied, assuming upstream node is primary (node ID: 1)
INFO: standby registration complete
NOTICE: standby node "node2" (ID: 2) successfully registered
postgres:/$

Check the cluster status with:

postgres:/$ repmgr -f /etc/repmgr.conf cluster show
 ID | Name  | Role    | Status     | Upstream | Location | Priority | Timeline | Connection string
  2 | node2 | standby |   running  | node1    | default  | 100      | 1        | host=192.168.111.2 port=5432 user=repmgr dbname=repmgr connect_timeout=2
  1 | node1 | primary | * running  |          | default  | 100      | 1        | host=192.168.111.1 port=5432 user=repmgr dbname=repmgr connect_timeout=2

To Continue

The third node is set up similarly to node2, so that the cluster show command above should show three entries – one primary and two standbys. We need to make sure automatic failover is functioning:

  • docker-compose down one standby and see the status, and how the overall cluster works, then turn it back on.
  • docker-compose down the primary, and see if a standby is promoted as a new primary, then try to turn the server back on.
In the second case what would happen? In my case there would be two primaries running, which does not seem OK. So I turn it off again and search the web for clues. It turns out all nodes are equal – once a node become a primary, the off-lined nodes should join as standbys (without starting PostgreSQL directly). Therefore if the repmgr node rejoin command does not work for some reason, one probably needs to repeat the above process from scratch (meaning backup the data folder elsewhere and empty it) and join as a standby.

The setup so far is only part of the overall HA architecture. One big question is – how do clients access the cluster in a reliable and seamless way. There are more pieces to the puzzle – client-side multi-host connections and disconnect handling, adding tools such as PgPool or PgBouncer, etc. All these need to be considered and can only be left as future exercises.