
Performance improvements in Spark ML over Spark MLlib

Spark 2.0 uses the Tungsten engine, which is built on ideas from modern compilers and MPP databases. At runtime it emits optimized bytecode that collapses an entire query into a single function. This eliminates virtual function calls and lets intermediate data stay in CPU registers. The technique is called whole-stage code generation.

Source: https://databricks.com/blog/2016/05/11/apache-spark-2-0-technical-preview-easier-faster-and-smarter.html
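
To make the idea of collapsing a query into a single function more concrete, here is a minimal sketch in plain Python. It is not Spark code and not what Tungsten actually generates; the class and function names (Scan, Filter, Count, count_volcano, count_fused) are hypothetical and chosen only for illustration. The first version evaluates a filter-then-count query as a chain of operator objects, each pulling rows from its child through a next() call. The second version is the kind of single tight loop that whole-stage code generation aims to produce for the same query.

```python
from typing import List, Optional


class Scan:
    """Leaf operator: hands out one row at a time (hypothetical, for illustration)."""

    def __init__(self, rows: List[int]) -> None:
        self._it = iter(rows)

    def next(self) -> Optional[int]:
        # None signals that the input is exhausted.
        return next(self._it, None)


class Filter:
    """Passes through only the rows equal to `key`."""

    def __init__(self, child: Scan, key: int) -> None:
        self._child = child
        self._key = key

    def next(self) -> Optional[int]:
        while True:
            row = self._child.next()  # one call into the child per row
            if row is None or row == self._key:
                return row


class Count:
    """Root operator: counts the rows produced by its child."""

    def __init__(self, child: Filter) -> None:
        self._child = child

    def result(self) -> int:
        count = 0
        while self._child.next() is not None:  # another call per row
            count += 1
        return count


def count_volcano(rows: List[int], key: int) -> int:
    # Operator-at-a-time evaluation: every row crosses several method calls.
    return Count(Filter(Scan(rows), key)).result()


def count_fused(rows: List[int], key: int) -> int:
    # The same query collapsed into one function: a single loop, no operator
    # objects, and the running count held in a local variable.
    count = 0
    for value in rows:
        if value == key:
            count += 1
    return count
```

Both functions return the same answer for the same input, for example count_volcano([1000, 5, 1000, 7], 1000) and count_fused([1000, 5, 1000, 7], 1000). The difference is in how the work is done: the fused version has no per-row calls between operators, which is the overhead the paragraph above says the generated code avoids.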

The following chart and table show the performance improvements for single-line functions between Spark 1.6 and Spark 2.0:

Chart comparing performance improvements in single-line functions between Spark 1.6 and Spark 2.0
Table comparing performance improvements in single-line functions between Spark 1.6 and Spark 2.0