
Kafka's architecture

This section introduces you to Kafka architecture. By the end of this section, you will have a clear understanding of both the logical and physical architecture of Kafka. Let's see how Kafka components are organized logically.

Every message in a Kafka topic is a collection of bytes, represented as a byte array. Producers are the applications that write messages to Kafka topics, and a topic can hold any type of message. Every topic is further divided into partitions. Each partition stores messages in the order in which they arrive. Producers and consumers each perform one major operation in Kafka: producers append messages to the end of a partition's write-ahead log files, and consumers fetch messages from the log files belonging to a given topic partition. Physically, each topic is spread over different Kafka brokers, each of which typically hosts one or more partitions of the topic.
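
The append and fetch operations described above can be sketched with a toy in-memory model. This is only an illustration: the `Partition` class and its `append` and `fetch` methods are hypothetical names, not part of any Kafka client, and real brokers persist the log to disk.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Partition:
    """Toy in-memory stand-in for one topic partition (not Kafka's real storage)."""
    log: List[bytes] = field(default_factory=list)

    def append(self, message: bytes) -> int:
        """Producer-side operation: add a message at the end and return its offset."""
        self.log.append(message)
        return len(self.log) - 1

    def fetch(self, offset: int, max_messages: int = 10) -> List[bytes]:
        """Consumer-side operation: read messages starting at the given offset."""
        return self.log[offset:offset + max_messages]


partition = Partition()
first = partition.append(b"order-created")   # messages are plain byte arrays
second = partition.append(b"order-paid")
messages = partition.fetch(offset=0)         # returned in arrival order
```

Because messages are only ever added at the end, a message's position in the list serves as its offset, and a consumer can resume reading from any offset it remembers.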

Ideally, Kafka pipelines should have a uniform number of partitions per broker and all topics on each machine. Consumers are applications or processes that subscribe to one or more topics and receive messages from them.

The following diagram shows you the conceptual layout of a Kafka cluster:

Kafka's logical architecture

The preceding paragraphs explain the logical architecture of Kafka and how its logical components work together. While it is important to understand how Kafka architecture is divided logically, you also need to understand what Kafka's physical architecture looks like. This will help you in later chapters as well. A Kafka cluster is composed of one or more servers (nodes). The following diagram depicts what a multi-node Kafka cluster looks like:

Kafka's physical architecture

A typical Kafka cluster consists of multiple brokers, which helps balance message reads and writes across the cluster. Each of these brokers is stateless; they rely on ZooKeeper to maintain their state. Each topic partition has one broker as its leader and zero or more brokers as followers. The leader handles all read and write requests for its partition. Followers replicate the leader in the background without interfering with the leader's work. You can think of followers as backups for the leader: if the leader fails, one of the followers is chosen as the new leader.

Each server in a Kafka cluster acts as a leader for some topic partitions and as a follower for others. In this way, the load is spread roughly evenly across the servers. Leader election among Kafka brokers is done with the help of ZooKeeper.
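
The leader and follower roles, and the failover between them, can be sketched as follows. This is a simplified model under stated assumptions: `PartitionReplicas` and `handle_broker_failure` are hypothetical names, and the rule "promote the first live follower" is chosen only for illustration; it is not a description of how Kafka actually selects a new leader.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class PartitionReplicas:
    """Hypothetical bookkeeping for the replicas of one topic partition."""
    leader: int            # broker id currently serving reads and writes
    followers: List[int]   # broker ids replicating the leader

    def handle_broker_failure(self, failed: int, live: Set[int]) -> None:
        """Drop a failed follower, or promote a live follower if the leader failed."""
        if failed in self.followers:
            self.followers.remove(failed)
            return
        if failed != self.leader:
            return  # the failed broker holds no replica of this partition
        candidates = [b for b in self.followers if b in live]
        if not candidates:
            raise RuntimeError("no live follower available to take over")
        # Illustrative rule only: promote the first live follower.
        self.leader = candidates[0]
        self.followers.remove(self.leader)


replicas = PartitionReplicas(leader=1, followers=[2, 3])
replicas.handle_broker_failure(failed=1, live={2, 3})
# Broker 2 now leads the partition and broker 3 remains a follower.
```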

ZooKeeper is an important component of a Kafka cluster. It manages and coordinates Kafka brokers and consumers. ZooKeeper keeps track of new brokers joining the cluster and of existing brokers failing. It then notifies producers and consumers about the change in cluster state, which helps them coordinate their work with the brokers that are active. ZooKeeper also records which broker is the leader for each topic partition and passes this information on to producers and consumers so that they can write and read messages.

At this point, you should be familiar with how producer and consumer applications relate to the Kafka cluster. However, it is worth touching on them briefly so that you can verify your understanding. Producers push data to brokers. When publishing data, a producer looks up the elected leader (broker) of the relevant topic partition and sends the message to that leader broker. Similarly, consumers read messages from brokers.
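
How a producer finds the broker to send to can be sketched like this. The `leaders` table, `choose_partition`, and `route` are hypothetical names introduced for the example, and the CRC32-based key hashing is an assumption made purely for illustration; it is not claimed to match the partitioning rule of any Kafka client.

```python
import zlib
from typing import Dict, Tuple

# Hypothetical cluster metadata: (topic, partition) -> id of the leader broker.
leaders: Dict[Tuple[str, int], int] = {
    ("orders", 0): 1,
    ("orders", 1): 2,
    ("orders", 2): 3,
}


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition (illustrative rule, not Kafka's own)."""
    return zlib.crc32(key) % num_partitions


def route(topic: str, key: bytes, num_partitions: int) -> Tuple[int, int]:
    """Return the (partition, leader broker id) a message would be sent to."""
    partition = choose_partition(key, num_partitions)
    return partition, leaders[(topic, partition)]


partition, broker = route("orders", b"customer-42", num_partitions=3)
```

The point of the sketch is the two-step lookup: the producer first settles on a partition, then sends the message to whichever broker currently leads that partition.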

Because Kafka brokers are stateless, the consumer records its state with the help of ZooKeeper. This design helps Kafka scale well. The consumer offset value is maintained by ZooKeeper. The consumer tracks how many messages it has consumed using the partition offset and acknowledges that offset to ZooKeeper. An acknowledged offset means that the consumer has consumed all prior messages. Note that this describes the older ZooKeeper-based consumer; consumer clients from Kafka 0.9 onward commit their offsets to an internal Kafka topic, `__consumer_offsets`, instead.
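
Offset tracking can be sketched with a small model in which a dictionary stands in for the external offset store. The `committed` dictionary and the `consume` function are hypothetical, and storing the next offset to read (rather than the last one consumed) is an assumption made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical offset store: (group, topic, partition) -> next offset to read.
committed: Dict[Tuple[str, str, int], int] = {}


def consume(group: str, topic: str, partition: int,
            log: List[bytes], batch_size: int = 2) -> List[bytes]:
    """Read the next batch after the committed offset, then commit the new offset."""
    key = (group, topic, partition)
    start = committed.get(key, 0)            # resume where this group left off
    batch = log[start:start + batch_size]
    committed[key] = start + len(batch)      # everything before this is consumed
    return batch


log = [b"m0", b"m1", b"m2"]
first = consume("billing", "orders", 0, log)    # reads m0 and m1
second = consume("billing", "orders", 0, log)   # reads m2
third = consume("billing", "orders", 0, log)    # nothing new to read
```

Since the position lives outside the broker, a restarted consumer only needs the stored offset to continue from where it stopped.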

This brings us to the end of this section on Kafka architecture. By now, you should be familiar with Kafka's architecture and its logical and physical components. The next sections cover each of these components in detail, so it is important that you understand the overall architecture before delving into them.