官术网_书友最值得收藏!

The RAID reliability model is no longer promising

RAID can be configured with a variety of different types; the most common types are RAID5 and RAID6, which can survive the failure of one and two disks, respectively. RAID cannot ensure data reliability after a two-disk failure. This is one of the biggest drawbacks of RAID systems.

Moreover, at the time of a RAID rebuild operation, client requests are most likely to starve for I/O until the rebuild completes. Another limiting factor with RAID is that it only protects against disk failure; it cannot protect against a failure of the network, server hardware, OS, power, or other data center disasters.

After discussing RAID's drawbacks, we can come to the conclusion that we now need a system that can overcome all these drawbacks in a performance and cost-effective way. The Ceph storage system is one of the best solutions available today to address these problems. Let's see how.

For reliability, Ceph makes use of the data replication method, which means it does not use RAID, thus overcoming all the problems that can be found in a RAID-based enterprise system. Ceph is a software-defined storage, so we do not require any specialized hardware for data replication; moreover, the replication level is highly customized by means of commands, which means that the Ceph storage administrator can manage the replication factor of a minimum of one and a maximum of a higher number, totally depending on the underlying infrastructure.

In an event of one or more disk failures, Ceph's replication is a better process than RAID. When a disk drive fails, all the data that was residing on that disk at that point of time starts recovering from its peer disks. Since Ceph is a distributed system, all the data copies are scattered on the entire cluster of disks in the form of objects, such that no two object's copies should reside on the same disk and must reside in a different failure zone defined by the CRUSH map. The good part is that all the cluster disks participate in data recovery. This makes the recovery operation amazingly fast with the least performance problems. Furthermore, the recovery operation does not require any spare disks; the data is simply replicated to other Ceph disks in the cluster. Ceph uses a weighting mechanism for its disks, so different disk sizes is not a problem.

In addition to the replication method, Ceph also supports another advanced way of data reliability: using the erasure-coding technique. Erasure-coded pools require less storage space compared to replicated pools. In erasure-coding, data is recovered or regenerated algorithmically by erasure code calculation. You can use both the techniques of data availability, that is, replication as well as erasure-coding, in the same Ceph cluster but over different storage pools. We will learn more about the erasure-coding technique in the upcoming chapters.

主站蜘蛛池模板: 榕江县| 武穴市| 阳春市| 台州市| 肥西县| 洱源县| 环江| 沁阳市| 广宗县| 舞钢市| 噶尔县| 呼伦贝尔市| 伊通| 远安县| 重庆市| 昭觉县| 即墨市| 阜新市| 五峰| 抚顺县| 宿松县| 九龙坡区| 南澳县| 威远县| 滨海县| 山东省| 秀山| 长垣县| 巫山县| 化德县| 新乡市| 台中县| 监利县| 申扎县| 乐安县| 卢氏县| 历史| 钦州市| 巴青县| 武清区| 泉州市|