Design of Sparkling Water

Sparkling Water is designed to be executed as a regular Spark application. Consequently, it is launched inside a Spark executor created after submitting the application. At this point, H2O starts services, including a distributed key-value (K/V) store and memory manager, and orchestrates them into a cloud. The topology of the created cloud follows the topology of the underlying Spark cluster.

As stated previously, Sparkling Water enables conversion in both directions between the various types of RDDs/DataFrames and H2O frames (hex frames). When converting from an H2O frame to an RDD, a wrapper is created around the frame to provide an RDD-like API. In this case, data is not duplicated but served directly from the underlying H2O frame. Converting from an RDD/DataFrame to an H2O frame requires data duplication because the data is transformed from Spark's representation into H2O-specific storage. However, data stored in an H2O frame is heavily compressed, and the original RDD no longer needs to be preserved:

Data sharing between Sparkling Water and Spark
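
The asymmetry between the two conversion directions can be illustrated with a small toy model. This is only a sketch and does not use the Sparkling Water or H2O API: the names `ToyFrame`, `to_toy_frame`, and `FrameBackedRows` are hypothetical, and `zlib` merely stands in for H2O's own compression scheme, which is assumed here to be column-oriented.

```python
import zlib
from array import array
from typing import Iterator, List, Tuple


class ToyFrame:
    """Hypothetical stand-in for an H2O frame: columns kept as compressed blobs."""

    def __init__(self, columns: List[bytes], num_rows: int) -> None:
        self.columns = columns
        self.num_rows = num_rows

    def column(self, index: int) -> array:
        # Decompress a single column on demand.
        values = array("d")
        values.frombytes(zlib.decompress(self.columns[index]))
        return values


def to_toy_frame(rows: List[Tuple[float, ...]]) -> ToyFrame:
    """Rows -> frame: the data is copied into column-wise, compressed storage."""
    num_cols = len(rows[0]) if rows else 0
    columns = []
    for c in range(num_cols):
        raw = array("d", (row[c] for row in rows)).tobytes()
        columns.append(zlib.compress(raw))  # zlib is an assumption, not H2O's codec
    return ToyFrame(columns, len(rows))


class FrameBackedRows:
    """Frame -> rows: a thin wrapper that keeps a reference instead of a copy."""

    def __init__(self, frame: ToyFrame) -> None:
        self.frame = frame  # shared reference to the underlying storage

    def __iter__(self) -> Iterator[Tuple[float, ...]]:
        # Rows are produced lazily from the frame's columns each time.
        cols = [self.frame.column(i) for i in range(len(self.frame.columns))]
        return iter(zip(*cols))
```

In this model, `to_toy_frame(rows)` duplicates the data once, after which the original row list could be discarded, while `FrameBackedRows(frame)` costs nothing beyond the wrapper object because every row it yields is read from the frame itself.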