Kafka origins

Most of you have probably used the LinkedIn portal at some point in your professional career. Kafka was first built by LinkedIn's technical team. LinkedIn had constructed a software metrics collection system using custom in-house components, with some support from existing open source tools. The system collected user activity data on the portal, and LinkedIn used this data to show relevant information to each user. It was originally built as a traditional XML-based logging service, whose output was later processed using different Extract, Transform, Load (ETL) tools. However, this arrangement did not hold up well over time, and the team started running into various problems. To solve these problems, they built a system called Kafka.

LinkedIn built Kafka as a distributed, fault-tolerant, publish/subscribe system. It records messages organized into topics. Applications can produce messages to topics or consume messages from them. All messages are stored as logs on persistent filesystems. Kafka works like a write-ahead logging (WAL) system: it writes all published messages to log files before making them available to consumer applications. Subscribers/consumers can then read these written messages as required, at a time that suits them. Kafka was built with the following goals in mind:

  • Loose coupling between message Producers and message Consumers
  • Persistence of message data to support a variety of data consumption scenarios and failure handling
  • Maximum end-to-end throughput with low-latency components
  • Managing diverse data formats and types using binary data formats
  • Scaling servers linearly without affecting the existing cluster setup
While we will introduce Kafka in more detail in upcoming sections, you should understand that one of the common uses of Kafka is in stream processing architectures. With its reliable message delivery semantics, it helps in consuming high rates of events. Moreover, it provides message replay capabilities along with support for different types of consumers.

These capabilities further help make a streaming architecture fault-tolerant, and they support a variety of alerting and notification services.
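
To make the log-and-replay idea concrete, here is a minimal in-memory sketch of a topic-based, append-only log. It is not Kafka's client API and it leaves out distribution, replication, and disk persistence. The names `TopicLog`, `Consumer`, `produce`, `poll`, and `seek`, along with the `user-activity` topic, are assumptions chosen only for illustration. It shows that messages are appended to a per-topic log, that each consumer keeps its own read position (offset), and that a consumer can replay messages by moving that position back.

```python
from collections import defaultdict


class TopicLog:
    """Toy append-only log with one message list per topic (hypothetical, not Kafka's API)."""

    def __init__(self):
        self._topics = defaultdict(list)

    def produce(self, topic, message):
        """Append a message to the topic and return its offset (position in the log)."""
        log = self._topics[topic]
        log.append(message)
        return len(log) - 1

    def read(self, topic, offset, max_messages=10):
        """Return up to max_messages starting at offset; reading never deletes anything."""
        return self._topics[topic][offset:offset + max_messages]


class Consumer:
    """Each consumer tracks its own offset, so consumers are independent of each other."""

    def __init__(self, log, topic):
        self.log = log
        self.topic = topic
        self.offset = 0

    def poll(self, max_messages=10):
        """Fetch the next batch and advance this consumer's offset past it."""
        batch = self.log.read(self.topic, self.offset, max_messages)
        self.offset += len(batch)
        return batch

    def seek(self, offset):
        """Move the read position, for example back to 0 to replay old messages."""
        self.offset = offset


log = TopicLog()
for event in ["login", "view_profile", "logout"]:
    log.produce("user-activity", event)

# Two independent consumers of the same topic (say, alerting and analytics).
alerts = Consumer(log, "user-activity")
analytics = Consumer(log, "user-activity")

first_pass = alerts.poll()                        # reads all three events
alerts.seek(0)                                    # rewind to the start of the log
replayed = alerts.poll()                          # same three events again
analytics_batch = analytics.poll(max_messages=2)  # unaffected by the other consumer
```

Because reading never removes messages from the log, the producer does not need to know who consumes the data or when, which is the loose coupling listed among the design goals above.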
