
Breaking the mould

Hazelcast is a radical new approach to data, designed from the ground up around distribution. It embraces a new scalable way of thinking: data should be shared around for both resilience and performance, while allowing us to configure the trade-offs surrounding consistency as the data requirements dictate.

The first major feature to understand about Hazelcast is its masterless nature; each node is configured to be functionally the same. The oldest node in the cluster is the de facto leader and manages the membership, automatically deciding which node is responsible for which data. As new nodes join or drop out, this process is repeated and the cluster rebalances accordingly. This makes Hazelcast incredibly simple to get up and running, as the system is self-discovering, self-clustering, and works straight out of the box.
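As a rough sketch of how little is needed to take part in this self-clustering, the following code starts a single node; running the same class in several JVMs on the same network would, under the default configuration, see them discover each other and form a cluster. The class name ClusterExample is our own invention, and the exact package layout of the imports can vary between Hazelcast versions (the older com.hazelcast.core layout is shown here):

import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

public class ClusterExample {
    public static void main(String[] args) {
        // Every node runs exactly the same code; there is no special master
        // process to start first.
        HazelcastInstance node = Hazelcast.newHazelcastInstance();

        // Print the membership the cluster has settled on; members are
        // listed in age order, with the oldest (the de facto leader) first.
        System.out.println("Cluster members: " + node.getCluster().getMembers());
    }
}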

However, the second feature to remember is that we are persisting data entirely in-memory; this makes it incredibly fast, but this speed comes at a price. When a node is shut down, all the data that was held on it is lost. We combat this risk to resilience through replication, holding enough copies of each piece of data across multiple nodes so that, in the event of a failure, the overall cluster suffers no data loss. By default, the standard backup count is 1, so we immediately enjoy basic resilience. But don't pull the plug on more than one node at a time until the cluster has reacted to the change in membership and re-established the appropriate number of backup copies of the data.
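If the default of a single backup is not enough for a particular dataset, the backup count can be raised. The following is a minimal sketch using the programmatic configuration API; the map name "capitals" and the class name BackupConfigExample are hypothetical, and the same setting can equally be made in the XML configuration:

import com.hazelcast.config.Config;
import com.hazelcast.config.MapConfig;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

public class BackupConfigExample {
    public static void main(String[] args) {
        Config config = new Config();

        // The default backup count is already 1; here we raise it to 2 so
        // that two simultaneous node failures could be survived without
        // losing any entries of this map.
        MapConfig mapConfig = new MapConfig("capitals");
        mapConfig.setBackupCount(2);
        config.addMapConfig(mapConfig);

        HazelcastInstance node = Hazelcast.newHazelcastInstance(config);
    }
}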

So when we introduce our new masterless distributed cluster, we get something like the following figure:

Note

A distributed cache is by far the most powerful option, as it can scale up in response to changes in the application's needs.

We previously identified that multi-node caches tend to suffer from either saturation or consistency issues. In the case of Hazelcast, each node owns a number of partitions of the overall data, so the load is spread fairly across the cluster. Hence, any saturation occurs at the cluster level rather than at any individual node, and we can address it simply by adding more nodes. In terms of consistency, the backup copies of the data are, by default, internal to Hazelcast and never read from directly; as such, we enjoy strict consistency. This does mean that we have to interact with a specific node to retrieve or update a particular piece of data; however, exactly which node that is remains an internal operational detail that can vary over time, and we as developers never actually need to know it.
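To illustrate how transparent this routing is, here is a minimal sketch of working with a distributed map; the map name "capitals" and the keys used are hypothetical, and the imports follow the older com.hazelcast.core layout, which differs in more recent versions. Notice that nothing in the code refers to which node owns which entry:

import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.IMap;

public class MapExample {
    public static void main(String[] args) {
        HazelcastInstance node = Hazelcast.newHazelcastInstance();

        // Each entry is routed to whichever node owns its key's partition;
        // that routing is handled entirely inside Hazelcast.
        IMap<String, String> capitals = node.getMap("capitals");
        capitals.put("GB", "London");
        capitals.put("FR", "Paris");

        // The read is served by the partition's owning node, so we always
        // see the latest value rather than a possibly stale backup.
        System.out.println(capitals.get("GB"));
    }
}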

If we imagine that our data is split into a number of partitions, and that each partition slice is owned by one node and backed up on another, we can visualize the interactions as shown in the following figure:

This means that for data belonging to Partition 1, our application will have to communicate with Node 1; for data belonging to Partition 2, with Node 2; and so on. The slicing of the data into partitions is dynamic, so in practice, where there are more partitions than nodes, each node will own a number of different partitions and hold backups of others. As we have mentioned before, all of this is an internal operational detail that our application does not need to know, but it is important that we understand what is going on behind the scenes.
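For those curious about what is going on behind the scenes, Hazelcast does expose the partition mapping for inspection. The following sketch looks up which partition a hypothetical key falls into and which member currently owns it; again the class name is our own, and the package layout shown is the older com.hazelcast.core one:

import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.Partition;
import com.hazelcast.core.PartitionService;

public class PartitionExample {
    public static void main(String[] args) {
        HazelcastInstance node = Hazelcast.newHazelcastInstance();

        // Purely for observation: normal map operations are routed to the
        // owning node automatically, without us ever consulting this.
        PartitionService partitionService = node.getPartitionService();
        Partition partition = partitionService.getPartition("GB");
        System.out.println("Key 'GB' lives in partition " + partition.getPartitionId()
                + ", currently owned by " + partition.getOwner());
    }
}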
