
Logical flow of processing

In the current era of data explosion and the always-connected paradigm, organizations collect colossal volumes of data continuously, in real time or near real time. The value of this data surge depends on the ability to extract actionable, contextual insights in a timely fashion. Streaming applications have a strong mandate to derive real-time actionable insights from massive data ingestion pipelines; they have to react to data as it arrives. For instance, an arriving data stream should trigger a multitude of dependent actions and capture the reactions to them. The most critical part of building streaming solutions is to understand the interplay between input, output, and query processing at scale. Also note that streaming applications never exist in isolation; they are part of a larger ecosystem of applications.

The following illustration provides a high-level conceptual view of how the different components interact. Starting with a stream of data, reference data is brought in to enrich the arriving streaming data, queries are executed and responses are pushed out, followed by notifications to end users and storage of the final results in a data store for future reference:

Logical view of streaming flow processing 
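To make this flow concrete, the following sketch (plain Python, with a hypothetical event shape, reference data, and sink) enriches each arriving event with reference data, evaluates a simple query, and fans the result out to a notification hook and a stand-in for a durable store. It is a minimal illustration of the logical flow, not a production pipeline:

```python
# Minimal sketch of the logical flow: ingest -> enrich -> query -> notify/store.
# The event shape, reference data, and sinks are hypothetical.

reference_data = {"sensor-1": {"site": "Plant A"}, "sensor-2": {"site": "Plant B"}}

def enrich(event):
    """Join the arriving event with static reference data."""
    return {**event, **reference_data.get(event["sensor_id"], {})}

def query(event):
    """A simple stateless query: flag readings above a threshold."""
    return event if event["temperature"] > 75 else None

def notify(result):
    print(f"ALERT: {result['sensor_id']} at {result.get('site')} -> {result['temperature']}")

def process(stream, sink):
    for event in stream:
        result = query(enrich(event))
        if result is not None:
            notify(result)       # push a response/notification to end users
            sink.append(result)  # retain the final result for future reference

# Usage with a small in-memory "stream":
events = [{"sensor_id": "sensor-1", "temperature": 80},
          {"sensor_id": "sensor-2", "temperature": 60}]
sink = []
process(events, sink)
```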

In a traditional transactional data processing workload, all the data is collected before processing starts. In stream processing, queries are run against the data while it is in flight, as illustrated here:

 

Queries executed on streaming data
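The difference can be shown with a small sketch: a batch query waits for the complete data set before it answers, whereas a streaming query updates its answer as each element arrives. The readings below are hypothetical values used only for illustration:

```python
# Batch: collect everything, then answer once.
def batch_average(values):
    values = list(values)            # wait until all data has landed
    return sum(values) / len(values)

# Streaming: query the data in flight, one element at a time.
def streaming_average(stream):
    count, total = 0, 0.0
    for value in stream:
        count += 1
        total += value
        yield total / count          # an up-to-date answer after every event

readings = [70, 80, 90]
print(batch_average(readings))            # 80.0, available only at the end
print(list(streaming_average(readings)))  # [70.0, 75.0, 80.0], available continuously
```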

When data is continually in motion, keeping track of its state is challenging: the state is held in memory (the working memory), which is limited. Additionally, networking issues creep in, resulting in late arrival of data or missing data sets. Patterns such as Command Query Responsibility Segregation are used to scale out reads and writes separately.
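The sketch below illustrates one common way to cope with limited in-memory state and late-arriving data: a tumbling-window count that tolerates a bounded amount of lateness and sets aside anything older. The window size, lateness bound, and event shape are assumptions made for this example, not prescribed by any particular streaming engine:

```python
from collections import defaultdict

WINDOW = 60            # one-minute tumbling windows (seconds); illustrative choice
ALLOWED_LATENESS = 30  # how far behind the newest event we still accept data

def window_counts(events):
    counts = defaultdict(int)   # working (in-memory) state, kept deliberately small
    watermark = 0
    late = []
    for event_time, value in events:
        watermark = max(watermark, event_time - ALLOWED_LATENESS)
        if event_time < watermark:
            late.append((event_time, value))   # arrived too late to update the window
            continue
        counts[event_time // WINDOW] += 1
    return dict(counts), late

events = [(5, "a"), (65, "b"), (10, "c"), (130, "d"), (20, "e")]
print(window_counts(events))   # ({0: 1, 1: 1, 2: 1}, [(10, 'c'), (20, 'e')])
```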

Command Query Responsibility Segregation (CQRS) is an architecture pattern for separating concerns: reads and writes are handled in separate streams, which allows each side to be scaled and tuned independently for faster reads and writes. Each event stored in the event store is immutable and carries a timestamp.
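A minimal sketch of the pattern, assuming an in-memory event store and a bank-account style read model (the class and method names are illustrative, not from any specific framework): commands append immutable, timestamped events to the event store, while queries are served from a separately maintained read model.

```python
import time

class EventStore:
    """Write side: an append-only log of immutable, timestamped events."""
    def __init__(self):
        self._events = []

    def append(self, event_type, payload):
        self._events.append({"type": event_type,
                             "payload": payload,
                             "timestamp": time.time()})

    def events(self):
        return list(self._events)

class ReadModel:
    """Query side: a view built from the event stream, optimized for reads."""
    def __init__(self):
        self.balances = {}

    def apply(self, event):
        if event["type"] == "deposit":
            acct = event["payload"]["account"]
            self.balances[acct] = self.balances.get(acct, 0) + event["payload"]["amount"]

# Writes and reads flow through separate paths and can scale independently.
store = EventStore()
view = ReadModel()
store.append("deposit", {"account": "A-1", "amount": 100})
for e in store.events():
    view.apply(e)
print(view.balances)   # {'A-1': 100}
```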

In the preceding architecture, immutable events with a timestamp are sent through the event pipe and split between immediate event action and long-term data retention. Because events are stored with a timestamp, the state of the system at any previous point in time can be determined by querying the events. By splitting the data streams into multiple channels, higher throughput is achieved.
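As a sketch of that point-in-time capability, assuming a simple list of timestamped account events (the data and fold logic are purely illustrative): because the events are immutable, the state at any earlier moment can be rebuilt by replaying only the events up to that timestamp.

```python
events = [
    {"timestamp": 100, "account": "A-1", "amount": +50},
    {"timestamp": 200, "account": "A-1", "amount": -20},
    {"timestamp": 300, "account": "A-1", "amount": +70},
]

def state_at(events, point_in_time):
    """Replay events up to point_in_time to reconstruct the balance at that moment."""
    balance = 0
    for event in sorted(events, key=lambda e: e["timestamp"]):
        if event["timestamp"] > point_in_time:
            break
        balance += event["amount"]
    return balance

print(state_at(events, 250))   # 30  -> balance as it stood at t=250
print(state_at(events, 400))   # 100 -> current balance
```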
