How to Setup a Replica for PostgreSQL 9.x

The following steps I use to setup a new Postgresql 9.5 replica.

Why do I need them? The PostgreSQL docs are excellent in providing both an understanding of the topic of replication, as well as giving details on what to do. Still, I find myself asking: Great, I get it. What now?

I've used these steps to setup PostgreSQL 9.3 and 9.5 replicas on Ubuntu 16.04 (Xenial Xerus). They should work for the whole 9.x series, at least >9.3, though. Make sure to adapt the filepaths, if you use a version of Postgres that is different from 9.3.

Please don't just use the steps without understanding them. :) Also, use sudo or be root where needed.

Steps to set up the replica

  1. Prepare:

    1. Setup the master for replication.
    2. Note down the IP of the master server, the user name and the password for the replication user. In my case the user is called replication.
  2. Stop Postgres (as root):

    service postgresql stop
  3. Switch to postgres user:

    su - postgres
  4. Create an empty data directory:

    # This uses the Ubuntu default path
    cd /var/lib/postgresql/9.3/
    # Backup the old data-dir
    # (Make sure there's nothing there.)
    mv main main-backup
    # Create a new, empty data-dir.
    mkdir main
    chmod 700 main
  5. Create a recovery.conf (make sure to replace the placeholders for server IP, replication user and password and the port, if necessary:

    cd main
    echo "standby_mode='on'
    primary_conninfo='host=MASTER-IP port=5432 user=replication password=REPLICATION-PASSWORD sslmode=require'
    trigger_file='/tmp/pg_trigger.repl'" >recovery.conf
  6. Draw a base backup (make sure to replace the placeholders for server IP and replication user, and get ready to enter the replication user's password):

    pg_basebackup -h MASTER-IP -D /var/lib/postgresql/9.3/main -U replication -v -P
  7. Switch back to root and start PostgreSQL:

    service postgresql start

Steps to check the replication

  1. On the replica server, check that the WAL receiver is running:

    ps auxf | grep -v grep | grep "postgres: wal receiver process"

    This should show a postgres: wal receiver process process, for example:

    postgres  7839  0.0  0.0 323052 11116 ?        Ss   06:31   0:00  \_ postgres: wal receiver process   streaming 79/DF47B6C0

    If it does not show anything, check the log file:

    more /var/log/postgresql/postgresql-9.3-main.log
  2. Check the replication delay on the replica:

    su - postgres
    psql -c 'select now() - pg_last_xact_replay_timestamp() AS replication_delay;' | tail -n 3 | head -n 1

    It should show the number of seconds that the replica is behind the master in replicating:


    A few seconds are normal. This is the data you'd loose, if the master server gets shutdown abruptly. I've seen this number vary between 0 and 15. The only thing that matters is that it stays low. If the delay is ever-growing, your replica is not keeping up with the master and you need to investigate this, possible getting a faster machine or enlarging the network bandwidth.

    Also, you may see an error like this:

    psql: FATAL:  the database system is starting up

    In that case you may not have enabled hot_standby mode on the replica. This setting allows you to switch the replica live at a later point, i.e. it becomes a master server. For our use-case, it allows us to run queries, for example to determine the replication delay, which is otherwise not possible.

    To enable hot_standby, open /etc/postgresql/9.3/main/postgresql.conf, search for hot_standby and replace its value off to on:

    hot_standby = on                       # "on" allows queries during recovery
                                           # (change requires restart)
  3. Check the replication status on the master server, by running the following SQL statement:

    select client_addr, state, sent_location, write_location, flush_location, replay_location from pg_stat_replication;

    It should give you an output like so:

    client_addr   |   state   | sent_location | write_location | flush_location | replay_location
    --------------+-----------+---------------+----------------+----------------+----------------- | streaming | 79/DF6CECF8   | 79/DF6CECF8    | 79/DF6CECF8    | 79/DF6CECF8
    (1 row)

    If the IP under client_addr matches the IP of the replica's machine, all is well.