Chapter 1. Spark for Machine Learning

This chapter introduces Apache Spark from a machine learning (ML) and data analytics perspective, and discusses how machine learning relates to Spark computing. We first present an overview of Apache Spark and its advantages for data analytics compared with MapReduce and other computing platforms. We then discuss five main issues:

  • Machine learning algorithms and libraries
  • Spark RDDs and DataFrames
  • Machine learning frameworks
  • Spark pipelines
  • Spark notebooks

These are topics that any data scientist or machine learning professional is expected to master in order to take full advantage of Apache Spark computing. Specifically, this chapter covers the following six topics:

  • Spark overview and Spark advantages
  • ML algorithms and ML libraries for Spark
  • Spark RDDs and DataFrames
  • ML frameworks, RM4Es, and Spark computing
  • ML workflows and Spark pipelines
  • Spark notebooks introduction