Understanding emerging cloud-based application architectures

In this section, we describe common architecture patterns and deployments for the main processing models used for batch processing, streaming applications, and machine learning pipelines. The underlying architecture for these processing models must support ingesting very large volumes of varied data arriving at high velocity at one end, while making the output data available to analytical tools and to reporting and modeling software at the other.

The software platforms supporting such applications provide the mechanisms needed to access data across a diverse set of data sources and formats and to prepare it for downstream applications, either as low-latency streaming data or as high-throughput historical data stores. For example, Apache Spark is an emerging platform that leverages distributed storage and processing frameworks to support querying, reporting, analytics, and intelligent applications at scale.
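To make the two delivery modes mentioned above more concrete, the following is a minimal sketch in plain Python (standard library only, not the Spark API). It assumes a toy list of events and hypothetical helper names (`batch_count`, `micro_batches`, `streaming_count`) that exist purely for illustration. The batch path aggregates a complete historical data set in one pass, while the streaming path processes small micro-batches and emits running totals as data arrives.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List

Event = Dict[str, str]  # hypothetical event shape, e.g. {"type": "click"}


def batch_count(events: Iterable[Event]) -> Counter:
    """High-throughput path: aggregate the full data set in one pass."""
    return Counter(e["type"] for e in events)


def micro_batches(events: Iterable[Event], size: int) -> Iterator[List[Event]]:
    """Group an incoming event stream into micro-batches of at most `size`."""
    batch: List[Event] = []
    for event in events:
        batch.append(event)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # flush the final, possibly smaller, micro-batch
        yield batch


def streaming_count(events: Iterable[Event], size: int = 2) -> Iterator[Counter]:
    """Low-latency path: yield a snapshot of running totals per micro-batch."""
    totals: Counter = Counter()
    for batch in micro_batches(events, size):
        totals.update(e["type"] for e in batch)
        yield Counter(totals)  # copy, so earlier snapshots are not mutated


# Toy input data (an assumption for this example, not taken from the text).
events = [
    {"type": "click"},
    {"type": "view"},
    {"type": "click"},
    {"type": "view"},
    {"type": "click"},
]

batch_result = batch_count(events)
stream_snapshots = list(streaming_count(events, size=2))
```

In this sketch, the last streaming snapshot equals the batch result. Downstream consumers can therefore read either incremental updates or the final aggregate, depending on their latency needs.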

For more details on Apache Spark-based architectures, refer to Learning Spark SQL by Aurobindo Sarkar, Packt Publishing.

The following figure shows a high-level architecture that incorporates these requirements in typical Spark-based batch and streaming applications:
