- Stream Analytics with Microsoft Azure
- Anindita Basak, Krishna Venkataraman, Ryan Murphy, Manpreet Singh
Logical flow of processing
In the current era of data explosion and always-connected applications, organizations collect colossal volumes of data continuously, in real or near-real time. The value of this data surge depends on the ability to extract actionable, contextual insights in a timely fashion. Streaming applications have a strong mandate to derive real-time insights from massive data ingestion pipelines and to react to data as it arrives; for instance, an incoming data stream should trigger a set of dependent actions and capture the reactions to them. The most critical part of building streaming solutions is understanding the interplay between input, output, and query processing at scale. Note also that streaming applications never exist in isolation; they are part of a larger ecosystem of applications.
The following illustration provides a high-level conceptual view of how these components interact. Starting with a stream of data, reference data is brought in to enrich the arriving events, queries are executed and their responses pushed out, notifications are sent to end users, and the final results are written to a data store for future reference:

Logical view of streaming flow processing
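The following is a minimal sketch, in plain Python, of the flow in the preceding figure: events arrive on a stream, are enriched with static reference data, evaluated by a query, and then pushed to a notifier and a data store. All names, thresholds, and data shapes here are illustrative assumptions, not an Azure Stream Analytics API.

```python
from dataclasses import dataclass

# Static reference data used to enrich arriving events (e.g. device metadata).
REFERENCE_DATA = {"dev-01": {"site": "Seattle"}, "dev-02": {"site": "Dublin"}}

@dataclass
class Event:
    device_id: str
    temperature: float

def enrich(event: Event) -> dict:
    """Join the streaming event with reference data."""
    meta = REFERENCE_DATA.get(event.device_id, {})
    return {"device_id": event.device_id,
            "temperature": event.temperature,
            "site": meta.get("site", "unknown")}

def query(record: dict) -> bool:
    """The 'query' step: flag readings above an illustrative threshold."""
    return record["temperature"] > 75.0

def notify(record: dict) -> None:
    print(f"ALERT: {record['device_id']} at {record['site']} -> {record['temperature']}")

def process(stream, sink: list) -> None:
    for event in stream:
        record = enrich(event)        # enrich the arriving event
        if query(record):             # execute the query on the enriched event
            notify(record)            # push a notification to end users
        sink.append(record)           # stands in for a durable data store

if __name__ == "__main__":
    results = []
    process([Event("dev-01", 80.2), Event("dev-02", 70.1)], results)
```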
In a traditional transactional data processing workload, all the data is collected before processing starts. In stream processing, by contrast, queries are run against the data while it is in flight, as illustrated here:
Queries executed on streaming data
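A small sketch, in plain Python, of the contrast just described: instead of collecting all the data first, a windowed query is evaluated while events are still in flight. The tumbling window, window size, and event shape are illustrative assumptions, not Stream Analytics query syntax.

```python
from collections import defaultdict

def tumbling_window_count(stream, window_seconds=10):
    """Emit per-device counts for each fixed (tumbling) window as events arrive."""
    counts = defaultdict(int)
    current_window = None
    for timestamp, device_id in stream:              # data in flight
        window = int(timestamp // window_seconds)    # which window this event falls into
        if current_window is not None and window != current_window:
            yield current_window, dict(counts)       # close the previous window immediately
            counts.clear()
        current_window = window
        counts[device_id] += 1
    if counts:
        yield current_window, dict(counts)           # flush the final, partial window

# Usage: results are produced while the stream is still being read.
events = [(1.0, "dev-01"), (4.2, "dev-01"), (9.9, "dev-02"), (12.5, "dev-01")]
for window, counts in tumbling_window_count(events):
    print(f"window {window}: {counts}")
```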
When data is continually in motion, keeping track of its state is challenging: the state is held in memory (working memory), which is limited. In addition, networking issues creep in, resulting in late-arriving data or missing data sets. Patterns such as Command Query Responsibility Segregation are used to scale out reads and writes separately.
Command Query Responsibility Segregation (CQRS) is an architectural pattern for separating concerns: reads and writes are handled in separate streams, which allows each to be performed, and scaled, independently and faster. Every event stored in the event store is immutable and carries a timestamp.
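Below is a minimal CQRS sketch under simplified assumptions: commands append immutable, timestamped events to an event store (the write path), while queries are served from a separately maintained read model (the read path), so each side can be scaled on its own. The in-memory store, read model, and handlers are illustrative, not a production implementation.

```python
import time
from typing import NamedTuple

class Event(NamedTuple):           # immutable once created
    timestamp: float
    entity_id: str
    payload: dict

EVENT_STORE: list[Event] = []      # append-only write side
READ_MODEL: dict[str, dict] = {}   # denormalized read side

def handle_command(entity_id: str, payload: dict) -> None:
    """Write path: record the fact as an immutable event, then update the read model."""
    event = Event(time.time(), entity_id, payload)
    EVENT_STORE.append(event)                # events are never updated or deleted
    READ_MODEL.setdefault(entity_id, {}).update(payload)

def handle_query(entity_id: str) -> dict:
    """Read path: serve the current view without touching the event store."""
    return READ_MODEL.get(entity_id, {})

handle_command("order-42", {"status": "created"})
handle_command("order-42", {"status": "shipped"})
print(handle_query("order-42"))    # {'status': 'shipped'}
```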

In the architecture described above, immutable, timestamped events are sent through the event pipe and split between immediate event action and long-term data retention. Because every event is stored with a timestamp, the state of the system at any previous point in time can be determined by querying the events. Splitting the data stream into multiple channels yields higher throughput.
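The point-in-time property can be sketched in a few lines, continuing the assumptions of the previous snippet: because events are immutable and timestamped, a past state is reconstructed by replaying only the events recorded up to that moment. The event shape and helper function are illustrative.

```python
from typing import NamedTuple

class Event(NamedTuple):
    timestamp: float
    entity_id: str
    payload: dict

def state_as_of(events: list[Event], entity_id: str, as_of: float) -> dict:
    """Replay events with timestamp <= as_of to reconstruct a past state."""
    state: dict = {}
    for event in sorted(events, key=lambda e: e.timestamp):
        if event.timestamp > as_of:
            break
        if event.entity_id == entity_id:
            state.update(event.payload)
    return state

history = [
    Event(100.0, "order-42", {"status": "created"}),
    Event(200.0, "order-42", {"status": "shipped"}),
]
print(state_as_of(history, "order-42", 150.0))   # {'status': 'created'}
print(state_as_of(history, "order-42", 250.0))   # {'status': 'shipped'}
```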