Linux support, onsite, online in the cloud

Systems Administration Notes.

Although there are many guides to streaming postgres database replication which are, in their way, quite helpful and which may point you to an ultimately successful installation, most that I have seen are also a tad misleading and to some extent quite wrong.

The most common error is to suppose that there are two or more postgresql.conf files involved, one for the master and one for the slave(s). When first setting up replication for a client I followed this path a couple of times and became concerned that I might have made a more fundamental error in the configuration of the system as each time I started up the pg_basebackup I found that the slave server’s configuration was blown away. I even tried using a different name for the replica server configuration only to discover that the entire data directory including the configuration files had to be removed for the backup to work.

The fact of the matter is that configuration and data are transferred from master to slave when the slave is initialised and the configuration for both master and slave are contained in the same postgresql.conf file. This means that settings for the slave as shown in the guides are often quite spurious. For example this is a segment copied from a guide found on the web:

Now on to the slave. In the slave’s postgresql.conf, add the following: wal_level = hot_standby max_wal_senders = 3 checkpoint_segments = 8 wal_keep_segments = 8 hot_standby = on

This is, I’m afraid, nonsense. Anything written to the data directory on the slave server, including postgresql.conf, will have to be deleted in order for pg_basebackup (which the author refers to later) to run, but let us look at those parameters in a little more detail.

The first 4 items relate to the "Write Ahead Log" or wal which is, as you might suppose written by the master server prior to changes to the data being committed.

Increasing the max_wal_senders from 0 enables replication from the master server.

The parameter checkpoint_segments is the number of log segments to be written before a checkpoint is written. Again this only relevant on the master and probably does not need adjusting when setting up replication.

The number of wal_keep_segments is the minimum number of log segments that should be retained even after the master has committed transactions. These segments can be then used by the replica server to catch up should there be a hiccup in streaming replication. It is an important parameter on a replicating master server.

The only parameter directly relevant to the slave is hot_standby, but even hot_standby is set in postgresql.conf on the master server, but is ignored by the master server. The same configuration file is however transferred to and used by, the slave. If you think about it this makes perfect sense because if or original master goes down the slave is going to become the new master and will need the same configuration as it’s predecessor.

If hot_standby is set to "on" then read only connections from the clients will be allowed by the replica even during "recovery".

If hot_standby is set to "off", the server will be still be active while replication is taking place but any attempt to connect by a client, will return an error message saying that the database is "starting up".

So how does the replica know that it is a replica or slave server if it is using the same postgresql.conf? The answer is the presence of another file, not found on the master, this file recovery.conf, tells the postgres server that it is a slavecarrying out a replication role.

After completing the pg_basebackup we need to create recovery.conf on the replica in the data directory. The file requires two parameters to be set, standby_mode and primary_conninfo. A third parameter trigger_file is also often added. The presence of the nominated trigger file ends recovery on the standby server. The trigger file is not however essential as the server may be promoted to the role of master server with the command pg_ctl promote.

The primary_conninfo string includes the host address or IP of the master server, the username, which should be a dedicated name that is only used for replication and the password which which will have been set on the master (see below). (The password can alternatively be stored in .pgpass file but this is unlikely to be any more secure and creating an additional, hidden file does make the setup a little less portable when you want the servers to change roles.

So before we initialise the replica server there are a couple more tiny steps to perform on the master.

First create the id for the replicator. On the master server change to the postgres user id and run psql to create a replication user with replication privileges and set a password

sudo -u postgres bash
psql
CREATE USER replicator REPLICATION LOGIN ENCRYPTED
PASSWORD ’<password>’;

Do note that the upper case characters for commands in postgres is by convention only. The commands can all be typed in lower case and the whole thing can rolled up into one mini script, a la:

sudo -u postgres psql -c "\
create user replicator replication \
login encrypted password ’thepassword’;"

Next we need to ensure that pg_basebackup and the slave server replication process can access the master server server. Access is configured in the Host Based Authentication configuration file or pg_hba.conf. The options are enumerated in the file itself but I suggest what you want is something like this appended to the end of the file:

host replication replicator 10.0.1.0/32 md5

The fields in order are: The type of connection ("host" = tcp/ip), the privileges (only replication is needed), the username, the slaver server IP, and the method of authentication.

You will notice that rather than a single IP address, in the example above I have given a network address. This is because as with postgresql.conf the file pg_hba.conf will be copied to the slave server when initialisation takes place. Using the network address means that if our current master goes down and we promote the slave to take over database services it can immediately become a new master and continue to serve slaves on that network without modifying the configuration. You pays ya money and ya takes your choice.

Ok. we are now ready to go. We are going to initialise the slave server’s data and configuration files, create the recovery.conf file and start up the slave.

To enact these steps we need to know where the data directory is going to be. To keep things simple I would recommend the use of the same path name as that on the master but the only requirement is that it must be the same as the PGDATA variable in postgres startup file. In the example below I have assumed a postgres 9.1 server for no better reason than I was recently working on such a system for a builders merchant in Ilkeston.

So here is the whole 9 yards to create a redundant replicating slave server.

1. Edit postgresql.conf on the master server ensure the following are present:

listen_address = ’*’    # make sure we’re listening as appropriate.
                        # In my set up I needed to listen on a ppp
                        # link to a slave, eth0’s ip 4’s interface
                        # for the database client machines and on
                        # localhost for web based database access.

wal_level = hot_standby # Set the write ahead log to hot standby mode.

max_wal_senders = 3     # 0 turns replication off. Seems to need a
                        # minimum of 3.

wal_keep_segments = 8   # The minimum number of log segments to be
                        # retained so that the slaves can catch up
                        # with changes if need be.


hot_standby = on        # this parameter will only be used by the
                        # slave(s)

2. Make sure pg_HBA.conf has the correct access settings for the replicator.

echo "host  replication  replicator  10.0.1.9/24 md5">> pg_hba.conf

3. Restart the master or reread the configuration files.

pg_ctl reread

4. Access the slave server. The necessary actions can be run as a script:

# Select the startup file in /etc/init.d
INITFILE=/etc/init.d/postgresql
# postgresql-9.1 or whatever.

# If postgres is already running stop it now
if ps -ef|grep -s postmaster ;then
     echo $INITFILE stop
fi

# change to postgres id
sudo -u postgres bash

# set the variable PGDATA to be the same as in
# the init file
export ‘grep PGDATA= $INITFILE‘

# clear any data / configuration files remaining
# from earlier experiments
[ -d $PGDATA ] && rm -r $PGDATA

# Initialis the replication/slave server with a
# streaming backup this also confirms your hba
# files and replicator user id are good.
# you will be prompted for the password
# -v sets verbose mode and -P a progress indicator.

pg_basebackup -h <master_ip> -D $PGDATA -u replicator -v -P

# Create recovery.conf
cd $PGDATA
cat > recovery.conf << - .
 standby_mode = ’on’
 primary_conninfo = ’host=10.0.1.3, user=replicator, password=<password>’
 trigger_file = ’/tmp/postgresql.trigger’

# return to our normal id
exit
sudo $INITFILE START

And there you are, Bob is your uncle.

Clifford W Fulford
19th March 2018.

The layout and associated style sheets for this page are taken from the World Wide Web Consortium and used here under the W3C software licence.

CITY LINUX ®

Linux support on-site, on-line and in the cloud.

Search

Linux / UNIX support

Linux Support Notes

Linux Applications & Tools

Support notes & ramblings

City Linux Location

Related sites.

Systems Administration Notes.

Postgres Database Replication.

Clifford W Fulford

Twitter feed.

Clients past and present.