Performance improvements in Spark ML over Spark MLlib

Spark 2.0 uses the second-generation Tungsten engine, which builds on ideas from modern compilers and MPP databases. At runtime it emits optimized bytecode that collapses an entire query into a single function. This eliminates virtual function calls and lets intermediate data stay in CPU registers. The technique is called whole-stage code generation.
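
To make the idea concrete, the following is a minimal sketch in plain Python, not Spark's actual generated code (Spark emits Java code that is compiled to bytecode). It contrasts an iterator-style plan, where every row passes through a chain of generic `next()` calls, with a single fused loop of the kind whole-stage code generation aims to produce. The class and function names (`Scan`, `Filter`, `Count`, `count_volcano`, `count_fused`) are hypothetical and exist only for illustration.

```python
from typing import Iterator, List, Optional


class Scan:
    """Leaf operator: hands out rows one at a time."""

    def __init__(self, rows: List[int]) -> None:
        self._it: Iterator[int] = iter(rows)

    def next(self) -> Optional[int]:
        # None signals end of input (rows themselves are assumed non-None).
        return next(self._it, None)


class Filter:
    """Passes through only the rows equal to `key`."""

    def __init__(self, child: Scan, key: int) -> None:
        self._child = child
        self._key = key

    def next(self) -> Optional[int]:
        # Keep pulling from the child until a row matches or input ends.
        while True:
            row = self._child.next()
            if row is None or row == self._key:
                return row


class Count:
    """Root operator: counts the rows produced by its child."""

    def __init__(self, child: Filter) -> None:
        self._child = child

    def result(self) -> int:
        count = 0
        while self._child.next() is not None:
            count += 1
        return count


def count_volcano(rows: List[int], key: int) -> int:
    # Iterator model: each row costs several dynamically dispatched calls.
    return Count(Filter(Scan(rows), key)).result()


def count_fused(rows: List[int], key: int) -> int:
    # Fused model: the whole plan is one loop, with the running count
    # held in a local variable instead of being passed between operators.
    count = 0
    for row in rows:
        if row == key:
            count += 1
    return count
```

Both functions return the same count for the same input; the difference is only in how much per-row call overhead sits between the scan and the aggregate. In Spark itself, calling `explain()` on a DataFrame prints the physical plan, and operators that were compiled together by whole-stage code generation are marked with an asterisk.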

Source: https://databricks.com/blog/2016/05/11/apache-spark-2-0-technical-preview-easier-faster-and-smarter.html

The following chart and table compare the performance of single-line functions in Spark 1.6 and Spark 2.0:

Chart comparing performance improvements in single-line functions between Spark 1.6 and Spark 2.0
Table comparing performance improvements in single-line functions between Spark 1.6 and Spark 2.0