- PostgreSQL Replication(Second Edition)
- Hans Jürgen Sch?nig
- 691字
- 2021-07-16 13:33:50
Understanding the purpose of PITR
PostgreSQL offers a tool called pg_dump
to back up a database. Basically, pg_dump
will connect to the database, read all of the data in "repeatable read" transaction isolation level and return the data as text. As we are using "repeatable read," the dump is always consistent. So, if your pg_dump
routine starts at midnight and finishes at 6 A.M., you will have created a backup that contains all of the data as of midnight, but no further data. This kind of snapshot creation is highly convenient and perfectly feasible for small to medium amounts of data.
Tip
A dump is always consistent. This means that all the foreign keys are intact. New data added after starting the dump will be missing. It is the most common way to perform standard backups.
But what if your data is so valuable and, maybe, so large in size that you want to back it up incrementally? Taking a snapshot from time to time might be enough for some applications, but for highly critical data, it is clearly not. In addition to that, replaying 20 TB of data in textual form is not efficient either. Point-in-time Recovery has been designed to address this problem. How does it work? Based on a snapshot of the database, the XLOG will be replayed later on. This can happen indefinitely or up to a point chosen by you. In this way, you can reach any point in time.
This method opens the door to many different approaches and features:
- Restoring a database instance up to a given point in time
- Creating a standby database that holds a copy of the original data
- Creating a history of all changes
In this chapter, we will specifically focus on the incremental backup functionality and describe how you can make your data more secure by incrementally archiving changes to a medium of choice.
Moving to the bigger picture
The following diagram provides an overview of the general architecture in use for Point-in-time Recovery:

We have seen in the previous chapter that PostgreSQL produces 16 MB segments of transaction log. Every time one of those segments is filled up and ready, PostgreSQL will call the so-called archive_command
. The aim of archive_command
is to transport the XLOG file from the database instance to an archive. In our diagram, the archive is represented as the pot in the bottom-right corner.
The beauty of the design is that you can basically use an arbitrary shell script to archive the transaction log. Here are some ideas:
- Use some simple copy to transport data to an NFS share
- Run
rsync
to move a file - Use a custom-made script to checksum the XLOG file and move it to an FTP server
- Copy the XLOG file to a tape
- Upload data to a cloud-based storage provider
The possible options for managing XLOG are only limited by the imagination.
The restore_command
is the exact counterpart of archive_command
. Its purpose is to fetch data from the archive and provide it to the instance, which is supposed to replay it (in the preceding diagram, this is labeled as Restore Backup). As you have seen, replay might be used for replication or simply to restore a database to a given point in time, as outlined in this chapter. Again, restore_command
is simply a shell script doing whatever you wish file by file.
Tip
It is important to mention that you, the almighty administrator, are in charge of the archive. You have to decide how much XLOG to keep and when to delete it. The importance of this task cannot be underestimated.
Keep in mind that when archive_command
fails for some reason, PostgreSQL will keep the XLOG file and retry after a couple of seconds. If archiving fails constantly from a certain point onwards, it might happen that the master fills up. The sequence of XLOG files must not be interrupted; if a single file is missing, you cannot continue to replay XLOG. All XLOG files must be present because PostgreSQL needs an uninterrupted sequence of XLOG files. Even if a single file is missing, the recovery process will stop there.
- Learn to Create WordPress Themes by Building 5 Projects
- PHP 從入門到項(xiàng)目實(shí)踐(超值版)
- Java編程指南:基礎(chǔ)知識(shí)、類庫應(yīng)用及案例設(shè)計(jì)
- 數(shù)據(jù)庫系統(tǒng)原理及MySQL應(yīng)用教程
- Java虛擬機(jī)字節(jié)碼:從入門到實(shí)戰(zhàn)
- 微信小程序入門指南
- JavaScript動(dòng)態(tài)網(wǎng)頁編程
- Android移動(dòng)應(yīng)用項(xiàng)目化教程
- 分布式數(shù)據(jù)庫HBase案例教程
- 零基礎(chǔ)學(xué)編程系列(全5冊(cè))
- 一覽眾山小:ASP.NET Web開發(fā)修行實(shí)錄
- 前端架構(gòu)設(shè)計(jì)
- Getting Started with RethinkDB
- 基于Docker的Redis入門與實(shí)戰(zhàn)
- 深入淺出Go語言核心編程