- Mastering Machine Learning with Spark 2.x
- Alex Tellez, Max Pumperla, Michal Malohlava
- 2021-07-02 18:46:05
Design of Sparkling Water
Sparkling Water is designed to run as a regular Spark application. Consequently, it is launched inside a Spark executor that is created after the application is submitted. At that point, H2O starts its services, including a distributed key-value (K/V) store and a memory manager, and orchestrates them into a cloud. The topology of the resulting cloud follows the topology of the underlying Spark cluster.
As stated previously, Sparkling Water enables conversion in both directions between the different types of RDDs/DataFrames and H2O's frame. When converting from a hex frame to an RDD, a wrapper is created around the hex frame to provide an RDD-like API. In this case, the data is not duplicated but is served directly from the underlying hex frame. Converting from an RDD/DataFrame to an H2O frame does require duplication, because the data is transformed from Spark's representation into H2O-specific storage. However, data stored in an H2O frame is heavily compressed, and the original RDD no longer needs to be kept once the conversion is done.
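The asymmetry between the two conversion directions can be illustrated with a small toy model. This is only a sketch in plain Python and does not use the Sparkling Water API: the names `ToyH2OFrame`, `ToyFrameRDD`, `to_frame`, and `to_rdd` are hypothetical, and `zlib` merely stands in for H2O's own column compression, which works differently.

```python
import zlib
from typing import Callable, Iterable, Iterator, List


class ToyH2OFrame:
    """Hypothetical stand-in for an H2O frame: one integer column, stored compressed."""

    def __init__(self, values: List[int]) -> None:
        raw = ",".join(str(v) for v in values).encode("ascii")
        self.raw_size = len(raw)          # size before compression, in bytes
        self._blob = zlib.compress(raw)   # assumption: zlib as a placeholder codec

    @property
    def stored_size(self) -> int:
        return len(self._blob)

    def rows(self) -> Iterator[int]:
        text = zlib.decompress(self._blob).decode("ascii")
        if not text:
            return iter(())
        return (int(token) for token in text.split(","))


class ToyFrameRDD:
    """Hypothetical RDD-like wrapper: keeps a reference to the frame, never a copy."""

    def __init__(self, frame: ToyH2OFrame) -> None:
        self.frame = frame

    def map(self, fn: Callable[[int], int]) -> List[int]:
        return [fn(v) for v in self.frame.rows()]

    def collect(self) -> List[int]:
        return list(self.frame.rows())


def to_frame(spark_rows: Iterable[int]) -> ToyH2OFrame:
    # Spark -> H2O direction: the rows are copied into frame-specific storage.
    return ToyH2OFrame(list(spark_rows))


def to_rdd(frame: ToyH2OFrame) -> ToyFrameRDD:
    # H2O -> Spark direction: only a thin wrapper is created around the frame.
    return ToyFrameRDD(frame)


spark_rows = [i % 10 for i in range(10_000)]
frame = to_frame(spark_rows)
rdd_view = to_rdd(frame)
```

In this sketch, `rdd_view.collect()` reads straight from the compressed frame, and `frame.stored_size` is smaller than `frame.raw_size` for this repetitive column, which mirrors why the source rows need not be kept after conversion.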
