
Logical flow of processing

In the current era of data explosion and the always-connected paradigm, organizations collect colossal volumes of data continuously, in real time or near real time. The value of this data surge depends on the ability to extract actionable, contextual insights in a timely fashion. Streaming applications have a strong mandate to derive real-time actionable insights from massive data ingestion pipelines; they have to react to data as it arrives. For instance, as a data stream arrives, it should trigger a multitude of dependent actions and capture the reactions. The most critical part of building streaming solutions is understanding the interplay between input, output, and query processing at scale. Also note that streaming applications never exist in isolation; they are part of a larger ecosystem of applications.

The following illustration provides a high-level conceptual view of how the different components interact. Starting with a stream of data, reference data is brought in to enrich the arriving streaming data, queries are executed and responses are pushed out, followed by notifications to end users and storage of the final results in the data store for future reference:

Logical view of streaming flow processing 
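
To make the flow concrete, here is a minimal Python sketch of this logical pipeline, assuming a simple sensor-reading scenario: arriving events are enriched with reference data, a threshold query is executed, notifications are pushed, and results are stored. The reference data, event shape, and in-memory sinks are illustrative stand-ins for real ingestion, notification, and storage services.

# Hypothetical reference data used to enrich arriving events (illustrative only).
REFERENCE_DATA = {"sensor-1": {"location": "Plant A"},
                  "sensor-2": {"location": "Plant B"}}

# Stand-ins for a real notification service and a real data store.
notifications = []
results_store = []

def enrich(event):
    # Join the arriving event with static reference data.
    enriched = dict(event)
    enriched.update(REFERENCE_DATA.get(event["sensor"], {}))
    return enriched

def process_stream(events, threshold=75.0):
    # Execute a simple query over the stream: flag readings above a threshold.
    for event in events:
        enriched = enrich(event)
        if enriched["value"] > threshold:
            # Push a notification out to end users.
            notifications.append("ALERT %s at %s: %.1f" % (
                enriched["sensor"], enriched.get("location", "unknown"), enriched["value"]))
        # Persist the enriched result in the data store for future reference.
        results_store.append(enriched)

# Example input; in practice, events arrive continuously from the ingestion pipeline.
process_stream([{"sensor": "sensor-1", "value": 71.2},
                {"sensor": "sensor-2", "value": 80.4}])
print(notifications)    # one alert, for sensor-2

In a production pipeline, each of these stages would be backed by a dedicated service (an event hub, a stream processor, a notification service, and a data store), but the logical order of the stages stays the same.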

If you take a traditional transactional data processing workload, all the data is collected before processing starts. In stream processing, queries are run against the data in flight, as illustrated in the following figure:

 

Queries executed on streaming data
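
The difference can be illustrated with a small, hypothetical Python example: a batch query that can only answer once all the data has been collected, versus a streaming query that emits an updated answer for every event in flight.

def batch_average(readings):
    # Traditional approach: all data is collected before the query runs.
    return sum(readings) / len(readings)

def streaming_average(readings):
    # Streaming approach: the query runs against each event in flight and
    # emits an up-to-date result as soon as the event arrives.
    count, total = 0, 0.0
    for value in readings:          # each iteration represents an arriving event
        count += 1
        total += value
        yield total / count

data = [10.0, 20.0, 30.0]
print(batch_average(data))              # 20.0, available only after all data has arrived
print(list(streaming_average(data)))    # [10.0, 15.0, 20.0], one result per event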

When data is continually in motion, keeping track of its state is challenging: the state is held in memory (that is, working memory), which is limited. Additionally, networking issues creep in, resulting in late-arriving or missing data.
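
As a rough sketch of these constraints, the following Python example keeps windowed counts in limited in-memory state and uses a simple watermark with an allowed lateness (both arbitrary assumptions, not taken from the text) to decide whether a late-arriving event is still counted or has to be dropped.

from collections import defaultdict

WINDOW_SIZE = 10       # seconds covered by each tumbling window (arbitrary)
ALLOWED_LATENESS = 5   # seconds an event may lag behind the watermark and still count

windows = defaultdict(int)   # limited in-memory (working-memory) state: window start -> count
watermark = 0                # highest event time seen so far

def on_event(event_time):
    # Assign the event to a tumbling window, dropping events that arrive
    # too far behind the watermark to be counted.
    global watermark
    watermark = max(watermark, event_time)
    if event_time < watermark - ALLOWED_LATENESS:
        return "dropped (arrived too late)"
    window_start = (event_time // WINDOW_SIZE) * WINDOW_SIZE
    windows[window_start] += 1
    return "counted in window [%d, %d)" % (window_start, window_start + WINDOW_SIZE)

for t in (3, 12, 14, 4, 25, 6):   # 4 and 6 arrive out of order
    print(t, "->", on_event(t))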

To scale out reads and writes separately, a pattern such as Command Query Responsibility Segregation (CQRS) can be applied. CQRS is an architectural pattern for separating concerns: reads and writes are handled by separate models, which makes it possible to read and write faster in separate streams. Every event written to the event store is immutable and carries a timestamp.
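
A minimal sketch of the pattern, assuming a simple account-balance domain, might look as follows: the command (write) path appends immutable, timestamped events to an append-only event store, while the query (read) path is served entirely from a separate projection.

import time
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    # Immutable event stored with a timestamp, as required by the event store.
    timestamp: float
    account: str
    amount: float

event_store = []   # write side: append-only, immutable event store
balances = {}      # read side: projection optimized for queries

def handle_deposit(account, amount):
    # Command (write) path: append an immutable, timestamped event.
    event = Event(timestamp=time.time(), account=account, amount=amount)
    event_store.append(event)
    apply_event(event)

def apply_event(event):
    # Update the read model from the event; queries never touch the write path.
    balances[event.account] = balances.get(event.account, 0.0) + event.amount

def get_balance(account):
    # Query (read) path: served entirely from the projection.
    return balances.get(account, 0.0)

handle_deposit("acct-1", 100.0)
handle_deposit("acct-1", 25.0)
print(get_balance("acct-1"))   # 125.0

Because the read model is only a projection of the stored events, it can be rebuilt or scaled out independently of the write path.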

In this architecture, immutable events with a timestamp are sent through the event pipe and split between immediate event actions and long-term data retention. Because every event is stored with a timestamp, the state of the system at any previous point in time can be determined by querying the events. Splitting the data stream into multiple channels also yields higher throughput.
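
The point-in-time capability can be sketched as follows: replaying a hypothetical event store only up to a requested timestamp reconstructs the state of the system as it was at that moment.

from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    timestamp: float   # event time, stored with every immutable event
    account: str
    amount: float

# Hypothetical contents of an append-only, timestamped event store.
event_store = [Event(1.0, "acct-1", 100.0),
               Event(2.0, "acct-1", -30.0),
               Event(5.0, "acct-1", 50.0)]

def state_at(as_of):
    # Rebuild the system state at an earlier point in time by replaying
    # only the events whose timestamp is not later than the requested time.
    balances = {}
    for event in event_store:
        if event.timestamp <= as_of:
            balances[event.account] = balances.get(event.account, 0.0) + event.amount
    return balances

print(state_at(2.0))   # {'acct-1': 70.0}  -> state as of time 2.0
print(state_at(5.0))   # {'acct-1': 120.0} -> current state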
