- Mastering Machine Learning with Spark 2.x
- Alex Tellez Max Pumperla Michal Malohlava
- 2021-07-02 18:46:08
Labeled point vector
Before running any supervised machine learning algorithm in Spark MLlib, we must convert our dataset into labeled point vectors, each of which pairs a feature vector with its label (response). Labels are stored as doubles, so the same structure serves both classification and regression tasks. For binary classification problems, labels should be stored as either 0 or 1; the preceding summary statistics confirmed that this holds for our example.
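import org.apache.spark.mllib.regression.LabeledPoint

// Pair each response with its feature vector and wrap the pair in a LabeledPoint
val higgs = response.zip(features).map {
  case (response, features) => LabeledPoint(response, features)
}
// Name the RDD (the name shows up in the Spark UI) and cache it for reuse
higgs.setName("higgs").cache()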
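The zip-and-map step above can be tried on toy data. The following is a minimal sketch, not the book's code: the local session, the names `rows`, `toyResponse`, `toyFeatures`, and `toyPoints`, and the data values are all hypothetical. It also illustrates one thing worth knowing about `zip`: both RDDs must have the same number of partitions and the same number of elements in each partition, which holds here because both are derived from the same parent with `map`.

```scala
import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

// Local session for illustration only (assumption); in spark-shell, `sc` already exists.
val spark = SparkSession.builder()
  .master("local[2]")
  .appName("labeled-point-sketch")
  .getOrCreate()
val sc = spark.sparkContext

// Hypothetical toy rows standing in for the real dataset: (label, raw features).
val rows = sc.parallelize(Seq(
  (1.0, Array(0.1, 0.2)),
  (0.0, Array(0.3, 0.4)),
  (1.0, Array(0.5, 0.6))
), numSlices = 2)

// Both RDDs come from the same parent via map, so their partitions line up for zip.
val toyResponse: RDD[Double] = rows.map(_._1)
val toyFeatures: RDD[Vector] = rows.map { case (_, xs) => Vectors.dense(xs) }

// Same pattern as in the text: pair label with features, then wrap in LabeledPoint.
val toyPoints: RDD[LabeledPoint] =
  toyResponse.zip(toyFeatures).map { case (label, vec) => LabeledPoint(label, vec) }

// Sanity check for binary classification: only 0.0 and 1.0 should appear as labels.
val labels: Set[Double] = toyPoints.map(_.label).distinct().collect().toSet
require(labels.subsetOf(Set(0.0, 1.0)), s"Unexpected labels: $labels")
```

If the two RDDs were built independently (for example, read from different files), their partitioning might not match and `zip` would fail at runtime; in that case, joining on an explicit key is a safer alternative.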
An example of a labeled point vector follows:
(1.0, [0.123, 0.456, 0.567, 0.678, ..., 0.789])
In the preceding example, the doubles inside the brackets are the features, and the single number outside the brackets is the label. Note that we have not yet told Spark that we are performing a classification task rather than a regression task; that happens later.
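To make the two parts of a labeled point concrete, here is a small sketch that builds one by hand and reads its fields back. The feature values and the names `exampleFeatures` and `point` are made up for illustration; no Spark context is needed, only the MLlib classes on the classpath (as in spark-shell).

```scala
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint

// Hypothetical feature values, chosen only for illustration.
val exampleFeatures = Vectors.dense(0.123, 0.456, 0.567)

// The label is a Double even for classification; 1.0 here marks the positive class.
val point = LabeledPoint(1.0, exampleFeatures)

println(point.label)     // the label (response)
println(point.features)  // the feature vector
println(point)           // the combined (label, features) representation
```

Nothing in the `LabeledPoint` itself says whether the label is a class or a continuous target, which is why the choice between classification and regression is made only when the algorithm is configured.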