
Breaking the mould

Hazelcast is a radical new approach to data, designed from the ground up around distribution. It embraces a new scalable way of thinking: data should be spread around the cluster for both resilience and performance, while allowing us to configure the trade-offs around consistency as the data requirements dictate.

The first major feature to understand about Hazelcast is its masterless nature; each node is configured to be functionally the same. The oldest node in the cluster is the de facto leader and manages the membership, automatically deciding which node is responsible for which data. In this way, as new nodes join or drop out, the process is repeated and the cluster rebalances accordingly. This makes Hazelcast incredibly simple to get up and running, as the system is self-discovering, self-clustering, and works straight out of the box.
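To get a feel for this out-of-the-box behaviour, a minimal sketch of starting two nodes might look like the following (assuming the Hazelcast JAR is on the classpath; the class name here is purely illustrative):

import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

public class ClusterStarter {
    public static void main(String[] args) {
        // Each call creates a functionally identical node; the oldest
        // member simply becomes the de facto leader.
        HazelcastInstance node1 = Hazelcast.newHazelcastInstance();
        HazelcastInstance node2 = Hazelcast.newHazelcastInstance();

        // With the default configuration the nodes discover each other
        // automatically and report the same two-member cluster view.
        System.out.println("Members: " + node1.getCluster().getMembers());
    }
}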

However, the second feature to remember is that we are persisting data entirely in-memory; this makes it incredibly fast, but the speed comes at a price. When a node is shut down, all the data that was held on it is lost. We combat this risk to resilience through replication, holding enough copies of each piece of data across multiple nodes that, in the event of a failure, the overall cluster suffers no data loss. By default, the standard backup count is 1, so we immediately enjoy basic resilience. But don't pull the plug on more than one node at a time until the cluster has reacted to the change in membership and re-established the appropriate number of backup copies of the data.
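Should a single backup not be enough for our needs, we can raise the count when configuring a node; the following is only a sketch, with the map name "capitals" being our own example:

import com.hazelcast.config.Config;
import com.hazelcast.config.MapConfig;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

public class BackupConfiguration {
    public static void main(String[] args) {
        Config config = new Config();

        // Hold two synchronous backups of the "capitals" map instead of
        // the default one, tolerating the loss of two nodes at once.
        MapConfig capitalsConfig = new MapConfig("capitals");
        capitalsConfig.setBackupCount(2);
        config.addMapConfig(capitalsConfig);

        HazelcastInstance node = Hazelcast.newHazelcastInstance(config);
    }
}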

So when we introduce our new masterless distributed cluster, we get something like the following figure:

Note

A distributed cache is by far the most powerful as it can scale up in response to changes in the application's needs.

We previously identified that multi-node caches tend to suffer from either saturation or consistency issues. In the case of Hazelcast, each node owns a number of partitions of the overall data, so the load is spread fairly across the cluster. Hence, any saturation would occur at the cluster level rather than at any individual node, and we can address it simply by adding more nodes. In terms of consistency, the backup copies of the data are by default internal to Hazelcast and not directly used, so we enjoy strict consistency. This does mean that we have to interact with a specific node to retrieve or update a particular piece of data; however, exactly which node that is remains an internal operational detail and can vary over time, so we as developers never actually need to know.
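In practice, this transparency means that we simply work against a map proxy and let Hazelcast route each operation to the owning node for us; a minimal sketch (the import paths assume a Hazelcast 3.x classpath, and the map name is again our own) could be:

import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.IMap;

public class OwnershipTransparency {
    public static void main(String[] args) {
        HazelcastInstance node = Hazelcast.newHazelcastInstance();

        // The proxy forwards each operation to whichever node currently
        // owns the key's partition; our code never needs to know which.
        IMap<String, String> capitals = node.getMap("capitals");
        capitals.put("GB", "London");
        System.out.println(capitals.get("GB"));
    }
}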

If we imagine that our data is split into a number of partitions, and that each partition slice is owned by one node and backed up on another, we could then visualize the interactions as in the following figure:

This means that for data belonging to Partition 1, our application will have to communicate with Node 1, with Node 2 for data belonging to Partition 2, and so on. The slicing of the data into partitions is dynamic; in practice, where there are more partitions than nodes, each node will own a number of different partitions and hold backups of others. As we have mentioned before, all of this is an internal operational detail that our application does not need to know, but it is important that we understand what is going on behind the scenes.
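Should we ever want to peek behind the scenes, the PartitionService lets us ask which partition a key falls into and which member currently owns it; the sketch below is diagnostic only, as ownership can move whenever the cluster rebalances:

import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.Partition;

public class PartitionPeek {
    public static void main(String[] args) {
        HazelcastInstance node = Hazelcast.newHazelcastInstance();

        // Map a key to its partition and report the owning member; this is
        // informational and should not be relied upon by application code.
        Partition partition = node.getPartitionService().getPartition("GB");
        System.out.println("Partition id: " + partition.getPartitionId());
        System.out.println("Owner: " + partition.getOwner());
    }
}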
