- Apache Spark Machine Learning Blueprints
- Alex Liu
- 2021-07-16 10:39:48
Chapter 1. Spark for Machine Learning
This chapter introduces Apache Spark from a machine learning (ML) and data analytics perspective, and discusses how machine learning relates to Spark computing. We first present an overview of Apache Spark and its advantages for data analytics compared with MapReduce and other computing platforms. We then discuss five main topics:
- Machine learning algorithms and libraries
- Spark RDD and dataframes
- Machine learning frameworks
- Spark pipelines
- Spark notebooks
These are topics that any data scientist or machine learning professional is expected to master in order to take full advantage of Apache Spark computing. Together with the Spark overview, they make up the six sections of this chapter:
- Spark overview and Spark advantages
- ML algorithms and ML libraries for Spark
- Spark RDD and dataframes
- ML frameworks, RM4Es and Spark computing
- ML workflows and Spark pipelines
- Spark notebooks introduction