- Learning Apache Apex
- Thomas Weise, Munagala V. Ramanath, David Yan, Kenneth Knowles
- 2021-07-02 22:38:34
Unbounded data and continuous processing
Datasets can be classified as unbounded or bounded. Bounded data is finite; it has a beginning and an end. Unbounded data is an ever-growing, essentially infinite data set. The distinction is independent of how the data is processed. Often, unbounded data is equated with stream processing and bounded data with batch processing, but this is starting to change. We will see that state-of-the-art stream processors, such as Apache Apex, are very capable of processing both unbounded and bounded data, so there is no need for a separate batch processing system just because the data set happens to be finite.
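To illustrate the point that the bounded/unbounded distinction is independent of the processing logic, here is a minimal sketch in plain Python. It is not Apex code; the function `running_sum` and the sample inputs are hypothetical names chosen for this illustration. The same incremental computation consumes a finite list and an endless generator without any change.

```python
import itertools
from typing import Iterable, Iterator


def running_sum(events: Iterable[int]) -> Iterator[int]:
    """Emit an updated total after every event, without waiting for the input to end."""
    total = 0
    for value in events:
        total += value
        yield total


# Bounded data: a finite collection with a beginning and an end.
bounded = [1, 2, 3, 4]
bounded_results = list(running_sum(bounded))

# Unbounded data: itertools.count() never terminates,
# so we can only ever inspect a prefix of the results.
unbounded = itertools.count(start=1)
unbounded_prefix = list(itertools.islice(running_sum(unbounded), 4))

print(bounded_results)
print(unbounded_prefix)
```

Because the logic produces a result per event rather than one result at the end of the input, it does not need to know whether the input is finite.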
For more details on these data processing concepts, you can visit the following link: https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-101.
Most big (high-volume) datasets that are eventually processed by big data systems are unbounded. The volume of such infinite data is increasing rapidly, from sources such as IoT sensors (for example, industrial gauge sensors, automobile data ports, connected home, and quantified self), stock markets and financial transactions, telecommunications towers and satellites, and so on. At the same time, legacy processing and storage systems are either nearing their performance and capacity limits, or their total cost of ownership (TCO) is becoming prohibitive.
Businesses need to convert the available data into meaningful insights and make data-driven, real-time decisions to remain competitive.
Organizations are increasingly relying on very fast processing (high velocity), because the value of data diminishes as it ages.

How were these unbounded datasets processed without a streaming architecture?
To be consumable by a batch processor, they had to be divided into bounded data, often at intervals of hours. Before processing could begin, the earliest events had to wait a long time for their batch to be ready. By the time it was processed, the data was already old and less valuable.
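The waiting time described above can be made concrete with a little arithmetic. The following is a rough sketch, assuming a one-hour batch interval and hypothetical event arrival times; the names `BATCH_SECONDS` and `batch_wait_seconds` are made up for this example, and the time needed to process the batch itself is ignored.

```python
BATCH_SECONDS = 3600  # assumed batch interval of one hour


def batch_wait_seconds(event_time: int, batch_seconds: int = BATCH_SECONDS) -> int:
    """Seconds an event waits until its batch closes and processing can begin."""
    batch_end = (event_time // batch_seconds + 1) * batch_seconds
    return batch_end - event_time


# Hypothetical event arrival times, in seconds since the batch window opened.
event_times = [0, 900, 1800, 3599]
waits = [batch_wait_seconds(t) for t in event_times]
print(waits)  # [3600, 2700, 1800, 1]
```

The event that arrives first waits the full hour before any processing starts, while the last event waits only a second. A stream processor avoids this skew by handling each event shortly after it arrives.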