- Machine Learning with Spark(Second Edition)
- Rajdeep Dua Manpreet Singh Ghotra Nick Pentreath
- 2021-07-09 21:07:56
Spark MLlib
Apache Spark is an open-source platform for processing large datasets. It is well suited to iterative machine learning tasks because it keeps data in memory using structures such as RDDs. MLlib is Spark's machine learning library. It provides a range of learning algorithms, both supervised and unsupervised, and includes various statistical and linear algebra optimizations. Because MLlib ships with Apache Spark, it avoids the separate installation that some other libraries require. MLlib supports several high-level languages: Scala, Java, Python, and R. It also provides a high-level API for building machine learning pipelines.
MLlib's integration with Spark brings several benefits. Spark is designed for iterative computation, which makes it an efficient platform for implementing large-scale machine learning algorithms, since these algorithms are themselves iterative.
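To make the "iterative" point concrete, here is a minimal plain-Python sketch (it does not use Spark or MLlib). It fits a line with batch gradient descent and re-scans the same in-memory dataset on every iteration, which is the access pattern that keeping data in memory is meant to speed up. The function names `load_points` and `fit_line`, the toy data, and the hyperparameters are all assumptions chosen for illustration, not part of any Spark API.

```python
# Plain-Python sketch (no Spark required): batch gradient descent for y ~ w*x + b.
# The dataset is loaded once and re-scanned on every iteration; in Spark, the
# analogous step would be keeping the dataset in memory between passes.

def load_points():
    """Hypothetical data source; stands in for an expensive read from storage."""
    return [(float(x), 2.0 * x + 1.0) for x in range(10)]

def fit_line(points, iterations=2000, learning_rate=0.05):
    """Fit w and b by minimizing mean squared error with gradient descent."""
    w, b = 0.0, 0.0
    n = len(points)
    for _ in range(iterations):
        # One full pass over the same in-memory data per iteration.
        grad_w = sum((w * x + b - y) * x for x, y in points) / n
        grad_b = sum((w * x + b - y) for x, y in points) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

points = load_points()  # read once, reused by every iteration
w, b = fit_line(points)
print(f"w={w:.4f}, b={b:.4f}")
```

If the data had to be reloaded from storage on each of the 2000 passes, the read cost would dominate; holding it in memory once is what makes this kind of loop cheap.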
Any improvement in Spark's data structures translates directly into gains for MLlib. Contributions from Spark's large community have helped bring new algorithms to MLlib faster.
Spark also offers other APIs, such as the Pipeline API and GraphX, which can be used in conjunction with MLlib; this makes it easier to build interesting use cases on top of MLlib.