- Learning Ceph (Second Edition)
- Anthony D'Atri, Vaibhav Bhembre, Karan Singh
The next-generation architecture
Traditional storage systems lack an efficient way to manage metadata. Metadata is information (data) about the actual user payload data, including where the data will be written to and read from. Traditional storage systems maintain a central lookup table to keep track of their metadata: every time a client sends a read or write request, the storage system first performs a lookup against this huge metadata table, and only after receiving the result does it perform the client operation. For a smaller storage system you might not notice the performance impact of this centralized bottleneck, but as storage domains grow large, the performance and scalability limits of this approach become increasingly problematic.
Ceph does not follow the traditional storage architecture; it has been totally reinvented for the next generation. Rather than centrally storing, manipulating, and accessing metadata, Ceph introduces a new approach, the Controlled Replication Under Scalable Hashing (CRUSH) algorithm.
Ceph's foundational publications are collected at http://ceph.com/resources/publications.
Instead of performing a lookup in the metadata table for every client request, the CRUSH algorithm enables the client to independently compute where data should be written to or read from. By deriving this metadata dynamically, there is no need to maintain a centralized table. Modern computers can perform a CRUSH lookup very quickly; moreover, this modest computing load can be distributed across cluster nodes, leveraging the power of distributed storage.
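To make the idea concrete, here is a minimal sketch of computed placement in Python. It uses rendezvous (highest-random-weight) hashing, a simple relative of the technique behind CRUSH's straw2 buckets, not Ceph's actual algorithm; the device names and replica count are illustrative, not Ceph defaults.

```python
import hashlib

def score(obj: str, device: str) -> int:
    """Deterministic pseudo-random score for an (object, device) pair."""
    digest = hashlib.sha256(f"{obj}/{device}".encode()).digest()
    return int.from_bytes(digest[:8], "big")

def place(obj: str, devices: list[str], replicas: int = 3) -> list[str]:
    """Rank devices by score and keep the top `replicas`. Every client
    computes the same answer from the same inputs, so no lookup table
    is ever consulted."""
    return sorted(devices, key=lambda d: score(obj, d), reverse=True)[:replicas]

devices = [f"osd.{i}" for i in range(8)]   # hypothetical device names
print(place("object-A", devices))          # identical on every client
```

Because placement is a pure function of the object name and the device list, any client holding the same map reaches the same answer independently.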
CRUSH accomplishes this via infrastructure awareness. It understands the hierarchy and capacities of the various components of your logical and physical infrastructure: drives, nodes, chassis, datacenter racks, pools, network switch domains, datacenter rows, even datacenter rooms and buildings as local requirements dictate. These are the failure domains for any infrastructure. CRUSH stores data safely replicated so that it will be protected (durability) and accessible (availability) even if multiple components fail within or across failure domains. Ceph administrators define these failure domains for their infrastructure within the topology of Ceph's CRUSH map. The Ceph back end and clients share a copy of the CRUSH map, and clients are thus able to derive the location of desired data (drive, server, datacenter, and so on) and access it directly, without a centralized lookup bottleneck.
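The hierarchy can be sketched the same way. The toy two-level map below (hypothetical hosts and OSDs, not a real CRUSH map) chooses replica hosts first and then one device within each, so that no two copies of an object share a failure domain, much as a CRUSH rule that chooses leaves by host would.

```python
import hashlib

# Hypothetical two-level hierarchy: hosts are the failure domain here.
HIERARCHY = {
    "host1": ["osd.0", "osd.1"],
    "host2": ["osd.2", "osd.3"],
    "host3": ["osd.4", "osd.5"],
    "host4": ["osd.6", "osd.7"],
}

def score(obj: str, item: str) -> int:
    digest = hashlib.sha256(f"{obj}/{item}".encode()).digest()
    return int.from_bytes(digest[:8], "big")

def place(obj: str, hierarchy: dict, replicas: int = 3) -> list[str]:
    # Choose distinct hosts first, then descend to one OSD per chosen
    # host, so no two replicas land in the same failure domain.
    hosts = sorted(hierarchy, key=lambda h: score(obj, h), reverse=True)[:replicas]
    return [max(hierarchy[h], key=lambda o: score(obj, o)) for h in hosts]

print(place("object-A", HIERARCHY))  # one OSD from each of three hosts
```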
CRUSH enables Ceph's self-management and self-healing. In the event of component failure, the CRUSH map is updated to reflect the down component. The back end transparently determines the effect of the failure on the cluster according to defined placement and replication rules. Without administrative intervention, the Ceph back end performs behind-the-scenes recovery to ensure data durability and availability. The back end creates replicas of data from surviving copies on other, unaffected components to restore the desired degree of safety. A properly designed CRUSH map and CRUSH rule set ensure that the cluster will maintain more than one copy of data distributed across the cluster on diverse components, avoiding data loss from single or multiple component failures.
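The computed-placement sketch above also illustrates why recovery needs no central coordination: when a device disappears from the map, every participant simply recomputes placement over the survivors, and with rendezvous-style hashing only the objects that had a copy on the failed device are remapped. Again, the names and counts here are illustrative.

```python
import hashlib

def score(obj: str, device: str) -> int:
    digest = hashlib.sha256(f"{obj}/{device}".encode()).digest()
    return int.from_bytes(digest[:8], "big")

def place(obj: str, devices: list[str], replicas: int = 3) -> set[str]:
    return set(sorted(devices, key=lambda d: score(obj, d), reverse=True)[:replicas])

devices = [f"osd.{i}" for i in range(8)]
objects = [f"obj-{i}" for i in range(1000)]

before = {o: place(o, devices) for o in objects}
survivors = [d for d in devices if d != "osd.3"]   # osd.3 fails
after = {o: place(o, survivors) for o in objects}

remapped = [o for o in objects if before[o] != after[o]]
exposed = [o for o in objects if "osd.3" in before[o]]
# Only objects that actually lost a copy pick a new device; all others
# keep their existing replica sets untouched.
print(len(remapped), len(exposed), remapped == exposed)  # counts are equal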