
HA and DR Terminologies

The following terms are central to HA and DR. Understanding them will help you grasp HA and DR concepts and choose the most suitable HA and DR solution.

Availability

Availability, or uptime, is defined as the percentage of time in a given year that a system or an application should be available. Availability is expressed as a Number of Nines.

For example, an availability of 90% (one nine) means that a system can tolerate a downtime of 36.5 days in a year, and an availability of 99.999% (five nines) means that a system can tolerate a downtime of 5.26 minutes per year.

The following table, taken from https://en.wikipedia.org/wiki/High_availability, describes the availability percentages and the downtime for each percentage:

Note

The linked page also explains how these values are calculated. You can refer to it, but a detailed discussion of the calculation is outside the scope of this book.

Figure 1.4: Availability table

In the preceding table, you can see that as the Number of Nines increases, the downtime decreases. The business decides the availability, the Number of Nines, required for the system. This plays a vital role in selecting the type of HA and DR solution required for any given system. The higher the Number of Nines, the more rigorous or robust the required solution.
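The downtime figures behind each Number of Nines follow from simple arithmetic: the allowed downtime is the unavailable fraction of the year. The following is a minimal sketch of that calculation, assuming a 365-day year; the function name `allowed_downtime` is illustrative and not part of any library.

```python
from datetime import timedelta

# Assumption: a non-leap year of 365 days.
HOURS_PER_YEAR = 365 * 24


def allowed_downtime(availability_percent: float) -> timedelta:
    """Return the yearly downtime permitted by an availability percentage."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    unavailable_fraction = (100 - availability_percent) / 100
    return timedelta(hours=HOURS_PER_YEAR * unavailable_fraction)


# Print the allowed downtime for one nine through five nines.
for percent in (90, 99, 99.9, 99.99, 99.999):
    print(f"{percent}% -> {allowed_downtime(percent)}")
```

With these assumptions, 90% works out to roughly 36.5 days and 99.999% to roughly 5.26 minutes per year.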

Recovery Time Objective

Recovery time objective, or RTO, is essentially the downtime a business can tolerate without any substantial loss. For example, an RTO of one hour means that an application shouldn't be down for more than one hour. A downtime of more than an hour would result in critical financial, reputational, or data loss.

The choice of HA and DR solution depends on the RTO. If an application has a four-hour RTO, you may be able to recover the database from backups alone (assuming backups are taken every two hours or so and a restore completes well within four hours), without a dedicated HA and DR solution. However, if the RTO is 15 minutes, restoring from backups is unlikely to be fast enough, and an HA and DR solution will be needed.

Recovery Point Objective

Recovery point objective, or RPO, defines how much data loss a business can tolerate during an outage. For example, an RPO of two hours means that losing up to two hours of data is acceptable to the business; losing more than that would have a significant financial or reputational impact.

Essentially, this is the maximum acceptable time difference between the last transaction that can be recovered and the moment the outage occurred.

The choice of HA and DR solution also depends on the RPO. If an application has an RPO of 24 hours, daily full backups are good enough; however, for a business with an RPO of four hours, daily full backups are not enough.
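The reasoning above can be expressed as a small check. The sketch below assumes that backups are the only recovery mechanism, so the worst-case data loss equals the full interval between backups; `meets_rpo` is a hypothetical helper name used only for illustration.

```python
from datetime import timedelta


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Return True if the backup schedule can satisfy the RPO.

    Worst case, the outage hits just before the next backup is due,
    so the data lost equals the whole backup interval.
    """
    return backup_interval <= rpo


daily = timedelta(hours=24)
print(meets_rpo(daily, rpo=timedelta(hours=24)))  # True
print(meets_rpo(daily, rpo=timedelta(hours=4)))   # False
```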

To differentiate between RTO and RPO, let's consider a scenario. A company has an RTO of one hour and an RPO of four hours. There's no HA and DR solution, and backups are being done every 12 hours.

In the case of an outage, the company restores the database from the last full backup in one hour, which is within the RTO of one hour. However, because backups are taken only every 12 hours, up to 12 hours of data can be lost, which exceeds the RPO of four hours.
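The scenario can also be written out as a check of both objectives at once. This is an illustrative sketch only: the `OutageReport` class and `evaluate` function are hypothetical names, and the data loss is taken as the worst case of a full 12-hour backup interval.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class OutageReport:
    downtime: timedelta   # time taken to bring the database back online
    data_loss: timedelta  # age of the newest data that could not be recovered


def evaluate(report: OutageReport, rto: timedelta, rpo: timedelta) -> dict:
    """Compare an outage against the RTO (downtime) and the RPO (data loss)."""
    return {
        "rto_met": report.downtime <= rto,
        "rpo_met": report.data_loss <= rpo,
    }


# Scenario from the text: restore takes 1 hour, backups run every 12 hours.
report = OutageReport(downtime=timedelta(hours=1), data_loss=timedelta(hours=12))
result = evaluate(report, rto=timedelta(hours=1), rpo=timedelta(hours=4))
print(result)  # {'rto_met': True, 'rpo_met': False}
```

The RTO is measured against how long recovery takes, while the RPO is measured against how much data is lost, which is why one can be met while the other is missed.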
