- Building Data Streaming Applications with Apache Kafka
- Manish Kumar, Chanchal Singh
Kafka's architecture
This section introduces you to Kafka architecture. By the end of this section, you will have a clear understanding of both the logical and physical architecture of Kafka. Let's see how Kafka components are organized logically.
Every message in a Kafka topic is a collection of bytes, represented as an array. Producers are the applications that store information in Kafka; they send messages to Kafka topics, which can store all types of messages. Every topic is further divided into partitions, and each partition stores messages in the sequence in which they arrive. There are two major operations that producers and consumers perform in Kafka: producers append messages to the end of a partition's write-ahead log file, and consumers fetch messages from the log files of a given topic partition. Physically, each topic is spread over different Kafka brokers, each hosting one or more partitions of the topic.
Ideally, partitions should be distributed uniformly across brokers, with every topic represented on each machine. Consumers are the applications or processes that subscribe to topics and read messages from them.
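To make the producer side concrete, here is a minimal sketch of a Java producer appending messages to a topic. The broker address (localhost:9092) and the topic name (demo-topic) are assumptions for the example; Kafka routes each record to one of the topic's partitions and appends it at the end of that partition's log.

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

import java.util.Properties;

public class SimpleProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Assumed broker address; replace with your cluster's bootstrap servers.
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Each record is appended to the end of one partition of the
            // (hypothetical) demo-topic.
            producer.send(new ProducerRecord<>("demo-topic", "order-42", "created"));
        }
    }
}
```

Records sent with the same key are routed to the same partition, so their relative order is preserved within that partition.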
The following diagram shows you the conceptual layout of a Kafka cluster:

The preceding paragraphs explain the logical architecture of Kafka and how different logical components coherently work together. While it is important to understand how Kafka architecture is divided logically, you also need to understand what Kafka's physical architecture looks like. This will help you in later chapters as well. A Kafka cluster is basically composed of one or more servers (nodes). The following diagram depicts how a multi-node Kafka cluster looks:

A typical Kafka cluster consists of multiple brokers, which helps load-balance message reads and writes across the cluster. Each of these brokers is stateless; they use Zookeeper to maintain their state. Each topic partition has one broker as its leader and zero or more brokers as followers. The leader handles all read and write requests for its partition, while followers replicate the leader in the background without actively interfering with its work. Think of followers as backups for the leader: one of them is chosen as the new leader if the current leader fails.
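As an illustration of how partitions and replicas map onto brokers, the following sketch uses the AdminClient available in newer Kafka clients (0.11 and later) to create a topic with three partitions, each replicated on two brokers; for each partition, one replica is elected leader and the other acts as a follower. The broker address and topic name are assumptions for the example.

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

import java.util.Collections;
import java.util.Properties;

public class CreateReplicatedTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Assumed broker address for the example.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // Three partitions spread across the brokers, each replicated on
            // two brokers: one replica is elected leader, the other follows.
            NewTopic topic = new NewTopic("demo-topic", 3, (short) 2);
            admin.createTopics(Collections.singletonList(topic)).all().get();
        }
    }
}
```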
Zookeeper is an important component of a Kafka cluster. It manages and coordinates Kafka brokers and consumers. Zookeeper keeps track of any new broker additions and any existing broker failures in the Kafka cluster, and it notifies producers and consumers of these changes in cluster state so that they can coordinate their work with the active brokers. Zookeeper also records which broker is the leader of which topic partition and passes this information on to producers and consumers so that they can read and write messages against the right broker.
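To see part of what Zookeeper tracks, the sketch below uses the plain ZooKeeper Java client to list the broker IDs that Kafka registers, assuming Zookeeper runs at localhost:2181. Each live broker creates an ephemeral znode under /brokers/ids, so the children of that path are the currently active brokers.

```java
import org.apache.zookeeper.ZooKeeper;

import java.util.List;

public class ListBrokers {
    public static void main(String[] args) throws Exception {
        // Assumed Zookeeper address; 10-second session timeout, no default watcher.
        ZooKeeper zk = new ZooKeeper("localhost:2181", 10_000, null);
        try {
            // Each live broker registers an ephemeral znode under /brokers/ids,
            // so the children of that path are the IDs of the active brokers.
            List<String> brokerIds = zk.getChildren("/brokers/ids", false);
            System.out.println("Active broker IDs: " + brokerIds);
        } finally {
            zk.close();
        }
    }
}
```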
By now, you should be familiar with how producer and consumer applications relate to the Kafka cluster, but it is worth touching on them briefly to verify your understanding. Producers push data to brokers: when publishing a message, a producer looks up the elected leader (broker) of the target topic partition and automatically sends the message to that leader. Similarly, consumers read messages from brokers.
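On the consuming side, a minimal sketch might look like the following, again assuming a localhost broker, the hypothetical demo-topic, and a recent client version that supports poll(Duration). The client library locates the leader of each assigned partition for you, so the application only subscribes and polls.

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class SimpleConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Assumed broker address and consumer group name for the example.
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "demo-group");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("demo-topic"));
            while (true) {
                // The consumer fetches batches of records from the leaders of its partitions.
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            record.partition(), record.offset(), record.value());
                }
            }
        }
    }
}
```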
Because Kafka brokers are stateless, the consumer records its own state with the help of Zookeeper; this design helps Kafka scale well. The consumer's offset value is maintained by Zookeeper: the consumer keeps track of how many messages it has consumed using the partition offset, and it ultimately acknowledges that offset to Zookeeper, meaning it has consumed all prior messages.
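This offset bookkeeping is visible in the consumer API. As a sketch extending the consumer example above, disabling auto-commit and calling commitSync() after processing makes the acknowledgement step explicit. In older Kafka clients offsets were committed to Zookeeper, as described above; newer clients commit them to an internal Kafka topic (__consumer_offsets), but the commit plays the same role either way.

```java
// Continuing the consumer sketch above: turn off automatic commits so the
// application decides when an offset counts as "consumed".
props.put("enable.auto.commit", "false");

while (true) {
    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
    for (ConsumerRecord<String, String> record : records) {
        // ... process the record ...
    }
    // Commit the offsets of everything returned so far; on restart the group
    // resumes from the last committed offset, meaning all prior messages are
    // considered consumed.
    consumer.commitSync();
}
```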