官术网_书友最值得收藏!

Mixing streaming-based and file-based recovery

Life is not always just black or white. Sometimes, there are also some shades of gray. For some cases, streaming replication might be just perfect. In some other cases, file-based replication and PITR are all you need. But there are also many cases in which you need a bit of both. One example would be like this: when you interrupt replication for a long period of time, you might want to resync the slave using the archive again instead of performing a full base backup. It might also be useful to keep an archive around for some later investigation or replay operation.

The good news is that PostgreSQL allows you to actually mix file-based and streaming-based replication. You don't have to decide whether streaming-based or file-based is better. You can have the best of both worlds at the very same time.

How can you do that? In fact, you have already seen all the ingredients; we just have to put them together in the right way.

To make this easier for you, we have compiled a complete example.

The master configuration

On the master, we can use the following configuration in postgresql.conf:

wal_level = hot_standby
archive_mode = on
 # allows archiving to be done
 # (change requires restart)
archive_command = 'cp %p /archive/%f'
 # command to use to archive a logfile segment
 # placeholders: %p = path of file to archive
 # %f = file name only
max_wal_senders = 5
 # we used five here to have some spare capacity

In addition to this, we have to add some configuration lines to pg_hba.conf to allow streaming. Here is an example:

# Allow replication connections from localhost, by a user with the
# replication privilege.
local replication hs trust
host replication hs 127.0.0.1/32 trust
host replication hs ::1/128 trust

host replication all 192.168.0.0/16 md5

In our case, we have simply opened an entire network to allow replication (to keep the example simple).

Once we have made these changes, we can restart the master and take a base backup as shown earlier in this chapter.

The slave configuration

Once we have configured our master and taken a base backup, we can start configuring our slave system. Let's assume for the sake of simplicity that we are using only a single slave; we will not cascade replication to other systems.

We only have to change a single line in postgresql.conf on the slave:

hot_standby = on # to make the slave readable

In the next step, we can write a simple recovery.conf file and put it into the main data directory:

restore_command = 'cp /archive/%f %p'
standby_mode = on
primary_conninfo = ' host=sample.postgresql-support.de port=5432 '
trigger_file = '/some_path/start_me_up.txt'

When we fire up the slave, the following will happen:

  1. PostgreSQL will call restore_command to fetch the transaction log from the archive.
  2. It will do so until no more files can be found in the archive.
  3. PostgreSQL will try to establish a streaming connection.
  4. It will stream if data exists.

You can keep streaming as long as necessary. If you want to turn the slave into a master, you can again use pg_ctl promote or the trigger_file file defined in recovery.conf.

Error scenarios

The most important advantage of a dual strategy is that you can create a cluster that offers a higher level of security than just plain streaming-based or plain file-based replay. If streaming does not work for some reason, you can always fall back on the files.

In this section, we will discuss some typical error scenarios in a dual strategy cluster.

Network connection between the master and slave is dead

If the network is dead, the master might not be able to perform the archive_command operation successfully anymore. The history of the XLOG files must remain continuous, so the master has to queue up those XLOG files for later archiving. This can be a dangerous (yet necessary) scenario, because you might run out of space for XLOG on the master if the stream of files is interrupted permanently.

If the streaming connection fails, PostgreSQL will try to keep syncing itself through the file-based channel. Should the file-based channel also fail, the slave will sit there and wait for the network connection to come back. It will then try to fetch XLOG and simply continue once this is possible again.

Tip

Keep in mind that the slave needs an uninterrupted stream of XLOG. It can continue to replay XLOG only if no single XLOG file is missing, or if the streaming connection can still provide the slave with the XLOG that it needs to operate.

Rebooting the slave

Rebooting the slave will not do any harm as long as the archive has XLOG required to bring the slave back up. The slave will simply start up again and try to get the XLOG from any source available. There won't be corruption or any other problem of this sort.

Rebooting the master

If the master reboots, the situation is pretty non-critical as well. The slave will notice through the streaming connection that the master is gone. It will try to fetch the XLOG through both channels, but it won't be successful until the master is back. Again, nothing bad, such as corruption, can happen. Operations can simply resume after the box reboots.

Corrupted XLOG in the archive

If the XLOG in the archive is corrupted, we have to distinguish between two scenarios:

  1. In the first scenario, the slave is streaming. If the stream is okay and intact, the slave will not notice that some XLOG file somehow became corrupted in the archive. The slaves never need to read from the XLOG files as long as the streaming connection is operational.
  2. If we are not streaming but replaying from a file, PostgreSQL will inspect every XLOG record and see whether its checksum is correct. If anything goes wrong, the slave will not continue to replay the corrupted XLOG. This will ensure that no problems can propagate and no broken XLOG can be replayed. Your database might not be complete, but it will be the same and consistent up to the point of the erroneous XLOG file.

Surely, there is a lot more that can go wrong, but given those likely cases, you can see clearly that the design has been made as reliable as possible.

主站蜘蛛池模板: 习水县| 常熟市| 虞城县| 呼和浩特市| 镇赉县| 吴江市| 辽中县| 汝阳县| 苍山县| 互助| 郑州市| 普兰店市| 库车县| 合水县| 肥城市| 那坡县| 扶沟县| 景德镇市| 青河县| 鹰潭市| 清水河县| 北流市| 新化县| 斗六市| 奇台县| 昌图县| 南城县| 茌平县| 章丘市| 北京市| 谷城县| 德格县| 博白县| 科技| 茌平县| 江阴市| 龙江县| 榆中县| 泊头市| 长海县| 连平县|