- Mastering Ceph
- Nick Fisk
- 2021-07-09 19:55:10
Failure domains
If your cluster will have fewer than 10 nodes, this is probably the most important point.
With legacy scale-up storage, the hardware is expected to be 100% reliable. All components are redundant, and the failure of a complete component, such as a system board or disk JBOD, would likely cause an outage. As a result, there is no real knowledge of how such a failure might affect the operation of the system, just the hope that it doesn't happen! With Ceph, the underlying assumption is that the complete failure of a section of your infrastructure, be that a disk, a node, or even a rack, should be considered normal and should not make your cluster unavailable.
Let's take two Ceph clusters, each comprising 240 disks. Cluster A comprises 20 nodes with 12 disks each; Cluster B comprises 4 nodes with 60 disks each. Now, let's take a scenario where, for whatever reason, a Ceph OSD node goes offline. It could be due to planned maintenance or unexpected failure, but that node is now down and any data on it is unavailable. Ceph is designed to mask this situation and will even recover from it whilst maintaining full data access.
In the case of Cluster A, we have now lost 5% of our disks (12 of 240) and, in the event of a permanent loss, would have to reconstruct 72 TB of data. Cluster B has lost 25% of its disks (60 of 240) and would have to reconstruct 360 TB. Both figures imply 6 TB disks holding data across their full capacity. The loss in Cluster B would severely impact the performance of the cluster, and while the data is being reconstructed, this period of degraded performance could last for many days.
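The arithmetic behind these figures can be reproduced with a short sketch. The 6 TB disk size is an assumption inferred from the text (72 TB across 12 disks), and the `Cluster` class and its method names are hypothetical, introduced only for illustration. It also assumes, as the text's figures do, that every disk is full.

```python
from dataclasses import dataclass

# Assumption: 6 TB per disk, inferred from 72 TB / 12 disks in the text.
DISK_TB = 6


@dataclass
class Cluster:
    """Hypothetical helper describing a cluster of identical nodes."""
    name: str
    nodes: int
    disks_per_node: int

    def node_loss_fraction(self) -> float:
        # Share of all disks that disappears when one node goes offline.
        return self.disks_per_node / (self.nodes * self.disks_per_node)

    def node_loss_tb(self, disk_tb: float = DISK_TB) -> float:
        # Data to reconstruct if the node is lost permanently (disks assumed full).
        return self.disks_per_node * disk_tb


clusters = [Cluster("A", nodes=20, disks_per_node=12),
            Cluster("B", nodes=4, disks_per_node=60)]

for c in clusters:
    print(f"Cluster {c.name}: {c.node_loss_fraction():.0%} of disks lost, "
          f"{c.node_loss_tb():.0f} TB to reconstruct")
```

Both clusters hold the same 240 disks, yet the amount of data at risk per node failure differs by a factor of five, which is the point of the comparison.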
It's clear that on smaller clusters, these very large, dense nodes are not a good idea. A 10-node Ceph cluster is probably the minimum size if you want to reduce the impact of node failure, and so in the case of 60-drive JBODs, you would need a cluster that is, at minimum, measured in petabytes.
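As a rough check on the "measured in petabytes" claim, the raw capacity of a minimum-sized cluster can be worked out as below. This is a sketch under stated assumptions: the 6 TB disk size is inferred from the earlier figures rather than given by the author, `min_raw_capacity_pb` is a hypothetical helper, and the result is raw capacity before any replication or erasure-coding overhead.

```python
def min_raw_capacity_pb(min_nodes: int, disks_per_node: int, disk_tb: float) -> float:
    """Raw capacity in PB (decimal, 1 PB = 1000 TB) of a cluster with min_nodes nodes."""
    return min_nodes * disks_per_node * disk_tb / 1000


# Assumption: 6 TB disks, matching the 72 TB / 12 disks figure used earlier.
dense = min_raw_capacity_pb(min_nodes=10, disks_per_node=60, disk_tb=6)
small = min_raw_capacity_pb(min_nodes=10, disks_per_node=12, disk_tb=6)

print(f"10 nodes x 60 drives: {dense:.2f} PB raw")
print(f"10 nodes x 12 drives: {small:.2f} PB raw")
```

Under these assumptions, ten 60-drive nodes already amount to several petabytes of raw storage, whereas ten 12-drive nodes stay well under one petabyte.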