
Design of Sparkling Water

Sparkling Water is designed to be executed as a regular Spark application. Consequently, it is launched inside a Spark executor that is created after the application is submitted. At this point, H2O starts its services, including a distributed key-value (K/V) store and a memory manager, and orchestrates them into a cloud. The topology of the created cloud follows the topology of the underlying Spark cluster.

As stated previously, Sparkling Water enables transformation between different types of RDDs/DataFrames and H2O frames, in both directions. When converting from a hex frame to an RDD, a wrapper is created around the hex frame to provide an RDD-like API. In this case, data is not duplicated but served directly from the underlying hex frame. Converting from an RDD/DataFrame to an H2O frame requires data duplication because the data is transferred from Spark into H2O-specific storage. However, data stored in an H2O frame is heavily compressed, and the original RDD no longer needs to be preserved:

Data sharing between Sparkling Water and Spark
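
The asymmetry between the two conversion directions can be illustrated with a small, self-contained toy model. This is only a conceptual sketch and does not use the Sparkling Water or H2O API: the names ToyH2OFrame, FrameBackedRows, and from_rows are hypothetical and exist purely for illustration. It assumes that "wrapper" means a read-only view that holds a reference to the frame, and that "duplication" means copying every value into frame-owned storage; compression is not modeled.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Sequence, Tuple


class ToyH2OFrame:
    """Hypothetical stand-in for an H2O frame: owns column-oriented storage."""

    def __init__(self, columns: Dict[str, List]):
        # Copy each column so the frame owns its data.
        self.columns = {name: list(values) for name, values in columns.items()}

    @classmethod
    def from_rows(cls, names: Sequence[str], rows: Iterable[Tuple]) -> "ToyH2OFrame":
        # Spark -> H2O direction: every value is copied into frame storage.
        materialized = list(rows)
        return cls({n: [r[i] for r in materialized] for i, n in enumerate(names)})

    def num_rows(self) -> int:
        return len(next(iter(self.columns.values()), []))


class FrameBackedRows:
    """Hypothetical RDD-like view: keeps a reference to the frame, copies nothing."""

    def __init__(self, frame: ToyH2OFrame):
        self.frame = frame

    def __iter__(self) -> Iterator[Tuple]:
        # H2O -> Spark direction: rows are assembled on demand from the frame.
        names = list(self.frame.columns)
        for i in range(self.frame.num_rows()):
            yield tuple(self.frame.columns[n][i] for n in names)

    def map(self, fn: Callable[[Tuple], object]) -> Iterator[object]:
        return (fn(row) for row in self)


spark_rows = [(1, "a"), (2, "b")]
frame = ToyH2OFrame.from_rows(["id", "label"], spark_rows)
spark_rows.append((3, "c"))   # the frame holds its own copy, so it is unaffected

view = FrameBackedRows(frame)
frame.columns["id"][0] = 10   # the view reads through to the frame's storage

print(frame.num_rows())
print(list(view))
```

In this sketch, appending to spark_rows after the conversion does not change the frame, because the data was copied; changing the frame is immediately visible through the view, because the view never held its own copy.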