官术网_书友最值得收藏!

Scaling consistency - the master-slave model

As distributed systems have become more commonplace, the need for higher capacity distributed databases has grown. Many distributed databases still attempt to maintain ACID guarantees (or in some cases only the consistency aspect, which is the most difficult in a distributed environment), leading to the master-slave architecture.

In this approach, there might be many servers handling requests, but only one server can actually perform writes so as to maintain data in a consistent state. This avoids the scenario where the same data can be modified via concurrent mutation requests to different nodes. The following diagram shows the most basic scenario:

However, we still have not solved the availability problem, as a failure of the write master would lead to application downtime. It also means that writes do not scale well, since they are all directed to a single machine.

Using sharding to scale writes

A variation on the master-slave approach that enables higher write volumes is a technique called sharding, in which the data is partitioned into groups of keys, such that one or more masters can own a known set of keys. For example, a database of user profiles can be partitioned by the last name, such that A-M belongs to one cluster and N-Z belongs to another, as follows:

An astute observer will notice that both master-slave and sharding introduce failure points on the master nodes, and in fact the sharding approach introduces multiple points of failure–one for each master! Additionally, the knowledge of where requests for certain keys go rests with the application layer, and adding shards requires manual shuffling of data to accommodate the modified key ranges.

Some systems employ shard managers as a layer of abstraction between the application and the physical shards. This has the effect of removing the requirement that the application must have knowledge of the partition map. However, it does not obviate the need for shuffling data as the cluster grows.

Handling the death of the leader

A common means of increasing availability in the event of a failure on a master node is to employ a master failover protocol. The particular semantics of the protocol vary among implementations, but the general principle is that a new master is appointed when the previous one fails. Not all failover algorithms are equal; however, in general, this feature increases availability in a master-slave system.

Even a master-slave database that employs leader election suffers from a number of undesirable traits:

  • Applications must understand the database topology
  • Data partitions must be carefully planned
  • Writes are difficult to scale
  • A failover dramatically increases the complexity of the system in general, and especially so for multisite databases
  • Adding capacity requires reshuffling data with a potential for downtime

Considering that our objective is a highly available system, and presuming that scalability is a concern, are there other options we need to consider?

主站蜘蛛池模板: 澎湖县| 旅游| 大邑县| 云浮市| 旺苍县| 黄龙县| 天台县| 阿荣旗| 黄大仙区| 淳安县| 汪清县| 大安市| 宝应县| 教育| 和静县| 云龙县| 遂平县| 嘉荫县| 饶阳县| 蒲城县| 郴州市| 台北县| 射阳县| 余庆县| 东港市| 济阳县| 华阴市| 玛纳斯县| 三明市| 拉萨市| 海宁市| 溆浦县| 吉木乃县| 福建省| 佛坪县| 大方县| 古浪县| 侯马市| 珲春市| 庐江县| 丹东市|