
Understanding emerging cloud-based application architectures

In this section, we describe common architecture patterns and deployments for some of the main processing models used for batch processing, streaming applications, and machine learning pipelines. The underlying architecture for these processing models must support ingesting very large volumes of varied data arriving at high velocity at one end, while making the output data available to analytical tools and to reporting and modeling software at the other.

The software platforms supporting such applications provide the mechanisms required to access data across a diverse set of data sources and formats and to prepare it for downstream applications, either as low-latency streaming data or as high-throughput historical data stores. For example, Apache Spark is an emerging platform that leverages distributed storage and processing frameworks to support querying, reporting, analytics, and intelligent applications at scale.
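To make the distinction between the two delivery modes concrete, the following is a minimal sketch in plain Python, not Spark code. It assumes a hypothetical `user,event` record format and hypothetical function names (`batch_job`, `streaming_job`); it only illustrates how the same transformation can serve a high-throughput batch path and a low-latency micro-batch path.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List


def parse_event(line: str) -> str:
    """Extract the event type from a 'user,event' record (assumed format)."""
    return line.split(",")[1].strip()


def count_events(lines: Iterable[str]) -> Counter:
    """Shared transformation: count records per event type."""
    return Counter(parse_event(line) for line in lines)


def batch_job(lines: List[str]) -> Dict[str, int]:
    """High-throughput path: process the whole historical dataset at once."""
    return dict(count_events(lines))


def streaming_job(lines: Iterable[str], batch_size: int) -> Iterator[Dict[str, int]]:
    """Low-latency path: emit running totals after each micro-batch."""
    totals: Counter = Counter()
    buffer: List[str] = []
    for line in lines:
        buffer.append(line)
        if len(buffer) == batch_size:
            totals.update(count_events(buffer))
            buffer = []
            yield dict(totals)
    if buffer:
        # Flush the final, partially filled micro-batch.
        totals.update(count_events(buffer))
        yield dict(totals)


# Illustrative data only; not taken from the source text.
records = ["u1,click", "u2,view", "u1,view", "u3,click", "u2,click"]
batch_result = batch_job(records)
stream_results = list(streaming_job(iter(records), batch_size=2))
```

In this sketch the streaming path yields intermediate totals as data arrives, and its final totals match the batch result, which is the property a combined batch and streaming architecture typically aims for.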

For more details on Apache Spark-based architectures, refer to Learning Spark SQL by Aurobindo Sarkar, Packt Publishing.

The following figure shows a high-level architecture that incorporates these requirements in typical Spark-based batch and streaming applications:
