- Mastering Machine Learning with Spark 2.x
- Alex Tellez Max Pumperla Michal Malohlava
- 2021-07-02 18:46:05
Design of Sparkling Water
Sparkling Water is designed to be executed as a regular Spark application. Consequently, it is launched inside a Spark executor created after submitting the application. At this point, H2O starts services, including a distributed key-value (K/V) store and memory manager, and orchestrates them into a cloud. The topology of the created cloud follows the topology of the underlying Spark cluster.
As stated previously, Sparkling Water enables transformation in both directions between different types of RDDs/DataFrames and H2O frames. When converting from a hex frame to an RDD, a wrapper is created around the hex frame to provide an RDD-like API. In this case, data is not duplicated but served directly from the underlying hex frame. Converting from an RDD/DataFrame to an H2O frame requires data duplication because it transforms data from Spark into H2O-specific storage. However, data stored in an H2O frame is heavily compressed and no longer needs to be preserved as an RDD.
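The asymmetry between the two conversion directions can be illustrated with a small toy model in plain Python. This is only a sketch of the idea, not the Sparkling Water API: the names ToyH2OFrame and FrameBackedRows are hypothetical, typed arrays merely stand in for H2O-specific storage, and compression is not modeled at all.

```python
from array import array


class ToyH2OFrame:
    """Hypothetical stand-in for an H2O frame: one typed array per column."""

    def __init__(self, columns):
        self.columns = columns  # dict: column name -> array('d')

    @classmethod
    def from_rows(cls, names, rows):
        # Spark -> H2O direction: every value is copied into frame-owned storage.
        columns = {name: array("d") for name in names}
        for row in rows:
            for name, value in zip(names, row):
                columns[name].append(value)
        return cls(columns)

    def num_rows(self):
        if not self.columns:
            return 0
        return len(next(iter(self.columns.values())))


class FrameBackedRows:
    """Hypothetical RDD-like wrapper: keeps a reference to the frame, not a copy."""

    def __init__(self, frame):
        self.frame = frame

    def __iter__(self):
        # Rows are assembled on demand from the underlying column storage.
        names = list(self.frame.columns)
        for i in range(self.frame.num_rows()):
            yield tuple(self.frame.columns[name][i] for name in names)

    def map(self, fn):
        # Lazy, row-wise transformation in the spirit of an RDD-like API.
        return (fn(row) for row in self)


# Rows as they might exist on the Spark side (assumed sample data).
spark_rows = [(1.0, 2.0), (3.0, 4.0)]

# Spark -> H2O: data is duplicated into the frame.
frame = ToyH2OFrame.from_rows(["x", "y"], spark_rows)

# H2O -> RDD-like view: nothing is copied, the wrapper reads from the frame.
view = FrameBackedRows(frame)
```

Because the wrapper only holds a reference, a change made to the frame's storage is visible through the view, while the original Spark-side rows are unaffected by it. That mirrors the two behaviors described above: a copy when going into H2O storage, and no copy when exposing a frame through an RDD-like API.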
