- PostgreSQL Replication (Second Edition)
- Hans-Jürgen Schönig
Making streaming-only replication more robust
The first thing a slave has to do when connecting to a master is to play catch-up. But can this always work? We have already seen that we can use a mixed setup consisting of a streaming-based and a file-based component. This gives us some extra security if streaming does not work.
In many real-world scenarios, maintaining two ways of transporting the XLOG might be too complicated; often, streaming alone is enough. The point is that in a normal setup, as described already, the master can throw the XLOG away as soon as it is no longer needed to repair the master itself. Depending on your checkpoint configuration, the XLOG might stay around for quite a while or only for a short time. The trouble is that when your slave connects to the master, the desired XLOG might not be around anymore, and the slave cannot resync itself in this scenario. You might find this a little annoying, because it implicitly ties the maximum tolerable downtime of your slave to your master's checkpoint behavior.
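In PostgreSQL 9.4, how much XLOG the master keeps around between checkpoints is governed mainly by the following postgresql.conf parameters (the values shown are illustrative assumptions, not recommendations):

checkpoint_segments = 32     # normally at most (2 + checkpoint_completion_target) * 32 + 1 segments are kept
checkpoint_timeout = 5min    # a checkpoint is forced at least this often

Once a checkpoint completes, segments that are no longer needed for crash recovery are recycled or removed, regardless of whether a slave still needs them.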
Two choices are available to solve the problem:
- wal_keep_segments: Keep some extra XLOG files on the master
- Physical replication slots: Teach the master to recycle the XLOG only when it has been consumed
Using wal_keep_segments
To make your setup much more robust, we recommend making heavy use of wal_keep_segments. The idea of this postgresql.conf setting (on the master) is to teach the master to keep more XLOG files around than theoretically necessary. If you set this variable to 1000, the master will keep roughly 16 GB (1,000 segments of 16 MB each) of XLOG beyond what it needs for itself. In other words, your slave can be gone for 16 GB worth of changes to the master longer than usual. This greatly increases the odds that a slave can rejoin the cluster without having to completely resync itself from scratch. For a 500 MB database, this is not worth mentioning, but if your setup has to hold hundreds of gigabytes or terabytes, it becomes an enormous advantage. Producing a base backup of a 20 TB instance is a lengthy process. You might not want to do this too often, and you definitely won't want to do it over and over again.
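As a minimal sketch, the master-side setting looks as follows (the value of 1000 is just the example from above, not a universal recommendation):

# postgresql.conf on the master
wal_keep_segments = 1000    # 1000 segments * 16 MB = ~16 GB of extra XLOG

The parameter can be activated with a plain configuration reload; no restart of the master is required.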
Tip
If you want to update a large base backup, it might be beneficial to incrementally update it using rsync and the traditional method of taking base backups.
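A hedged sketch of such an incremental refresh, run on the slave with its postmaster stopped (host names and data directories are assumptions; adapt them to your environment):

# enter backup mode on the master, copy only changed files, then leave backup mode
psql -h master.postgresql-support.de -U hans -d postgres -c "SELECT pg_start_backup('refresh', true)"
rsync -av --delete \
    --exclude=pg_xlog --exclude=postmaster.pid --exclude=recovery.conf \
    master.postgresql-support.de:/var/lib/pgsql/data/ /var/lib/pgsql/data/
psql -h master.postgresql-support.de -U hans -d postgres -c "SELECT pg_stop_backup()"

The --exclude=recovery.conf flag keeps --delete from wiping the slave's own recovery configuration, since that file exists only on the slave.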
What are reasonable values for wal_keep_segments? As always, this largely depends on your workloads. From experience, we can tell that a multi-GB implicit archive on the master is definitely an investment worth considering. Very low values for wal_keep_segments might be risky and not worth the effort. Nowadays, disk space is usually cheap. Small systems might not need this setting, and large ones should have sufficient spare capacity to absorb the extra requirements. Personally, I am always in favor of keeping at least some extra XLOG segments.
Utilizing replication slots
PostgreSQL 9.4 introduced a more sophisticated solution to the problem of deleted XLOG: physical replication slots. As already outlined earlier in this book, replication slots make sure that the master deletes XLOG only when it has been safely consumed by the replica. In the case of the cleanup problem outlined in this section, this is exactly what is needed.
The question now is: how can a replication slot be used? Basically, it is very simple. All that has to be done is to create the replication slot on the master and tell the slave which slot to use through recovery.conf.
Here is how it works on the master:
postgres=# SELECT * FROM pg_create_physical_replication_slot('repl_slot');
 slot_name | xlog_position
-----------+---------------
 repl_slot |
(1 row)

postgres=# SELECT * FROM pg_replication_slots;
 slot_name | slot_type | datoid | database | active | xmin | restart_lsn
-----------+-----------+--------+----------+--------+------+-------------
 repl_slot | physical  |        |          | f      |      |
(1 row)
Once the base backup has happened, the slave can be configured easily:
standby_mode = 'on'
primary_conninfo = 'host=master.postgresql-support.de port=5432 user=hans password=hanspass'
primary_slot_name = 'repl_slot'
The configuration is just as if there were no replication slots; the only change is that the primary_slot_name variable has been added. The slave will pass the name of the replication slot to the master, and the master then knows when it is safe to recycle the transaction log. As mentioned already, if a slave is not in use anymore, make sure that its replication slot is properly deleted; otherwise, the master will retain XLOG forever and eventually run out of disk space. This problem is incredibly insidious: slaves, being optional, are not always monitored as they should be. It is therefore a good idea to regularly compare pg_stat_replication with pg_replication_slots and to investigate any mismatch.
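A minimal sketch of such a check, run on the master (the slot name is the example used earlier):

-- list slots that no walsender is currently streaming to;
-- every such slot forces the master to retain XLOG indefinitely
SELECT slot_name, slot_type, restart_lsn
FROM pg_replication_slots
WHERE NOT active;

-- once a slave is permanently gone, drop its slot
SELECT pg_drop_replication_slot('repl_slot');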