
Evaluating the model

Since this is a binary classification problem, we use the BinaryClassificationEvaluator() evaluator to measure the model's performance on the test set:

import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
val evaluator = new BinaryClassificationEvaluator()
  .setLabelCol("label")

Now that training is complete and we have a trained decision tree model, we can generate predictions on the test set:

val predictionDF = dtModel.transform(testDF)

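Next, we compute the evaluation score. Note that BinaryClassificationEvaluator reports the area under the ROC curve (areaUnderROC) by default, not plain classification accuracy:

val areaUnderROC = evaluator.evaluate(predictionDF)
println("Area under ROC = " + areaUnderROC)

You should see a score of about 0.97:

Area under ROC =  0.9675436785432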
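
As an illustration of how the area under the ROC curve (the default metric of BinaryClassificationEvaluator) differs from plain accuracy, here is a minimal plain-Scala sketch that computes both on a handful of made-up (score, label) pairs. It does not use Spark. The object name, the helper names accuracy and areaUnderROC, and the data are all hypothetical, and the AUC uses one standard rank-based formulation (the fraction of positive/negative pairs in which the positive example scores higher), which is not necessarily how Spark computes it internally.

```scala
object AucVsAccuracy {
  // Hypothetical (score, label) pairs; score is the predicted probability of class 1.
  val scored: Seq[(Double, Int)] = Seq(
    (0.9, 1), (0.8, 1), (0.7, 0), (0.6, 1), (0.4, 0), (0.2, 0)
  )

  // Accuracy: fraction of examples whose thresholded score matches the label.
  def accuracy(data: Seq[(Double, Int)], threshold: Double = 0.5): Double = {
    val correct = data.count { case (score, label) =>
      (if (score >= threshold) 1 else 0) == label
    }
    correct.toDouble / data.size
  }

  // AUC as the probability that a random positive outscores a random negative.
  // Ties count as half a win. Assumes both classes are present in the data.
  def areaUnderROC(data: Seq[(Double, Int)]): Double = {
    val pos = data.collect { case (s, 1) => s }
    val neg = data.collect { case (s, 0) => s }
    val wins = for (p <- pos; n <- neg) yield {
      if (p > n) 1.0 else if (p == n) 0.5 else 0.0
    }
    wins.sum / (pos.size * neg.size)
  }

  def main(args: Array[String]): Unit = {
    println(s"Accuracy = ${accuracy(scored)}")
    println(s"Area under ROC = ${areaUnderROC(scored)}")
  }
}
```

On this toy data, accuracy at a 0.5 threshold is 5/6 while the area under the ROC curve is 8/9, so the two numbers should not be read interchangeably. Accuracy depends on the chosen threshold; the area under the ROC curve depends only on how well the scores rank positives above negatives.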

Finally, we stop the SparkSession by invoking the stop() method:

spark.stop()

We have managed to reach a score of about 0.97 on the test set with minimal effort. However, there are other performance metrics, such as precision, recall, and the F1 measure, which we will discuss in upcoming chapters. Also, if you're new to ML and haven't understood all the steps in this example, don't worry. We'll revisit each of these steps in later chapters with other examples.
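
As a brief preview of the other metrics mentioned above, the following plain-Scala sketch computes accuracy, precision, recall, and F1 from confusion-matrix counts. It is only an illustration under stated assumptions: the BinaryMetrics object, the Confusion case class, and the prediction/label pairs are hypothetical and are not part of the Spark example.

```scala
object BinaryMetrics {
  // Confusion-matrix counts for a binary classifier (hypothetical helper type).
  final case class Confusion(tp: Long, fp: Long, tn: Long, fn: Long) {
    def accuracy: Double  = (tp + tn).toDouble / (tp + fp + tn + fn)
    // Precision: of everything predicted positive, how much really is positive.
    def precision: Double = if (tp + fp == 0) 0.0 else tp.toDouble / (tp + fp)
    // Recall: of everything truly positive, how much was found.
    def recall: Double    = if (tp + fn == 0) 0.0 else tp.toDouble / (tp + fn)
    // F1: harmonic mean of precision and recall.
    def f1: Double = {
      val (p, r) = (precision, recall)
      if (p + r == 0.0) 0.0 else 2 * p * r / (p + r)
    }
  }

  // Tally the counts from (prediction, label) pairs, where 1 is the positive class.
  def confusion(pairs: Seq[(Int, Int)]): Confusion = Confusion(
    tp = pairs.count { case (p, l) => p == 1 && l == 1 },
    fp = pairs.count { case (p, l) => p == 1 && l == 0 },
    tn = pairs.count { case (p, l) => p == 0 && l == 0 },
    fn = pairs.count { case (p, l) => p == 0 && l == 1 }
  )

  def main(args: Array[String]): Unit = {
    // Made-up predictions and labels, for illustration only.
    val pairs = Seq((1, 1), (1, 1), (1, 0), (0, 1), (0, 0), (0, 0))
    val c = confusion(pairs)
    println(f"accuracy=${c.accuracy}%.3f precision=${c.precision}%.3f " +
      f"recall=${c.recall}%.3f f1=${c.f1}%.3f")
  }
}
```

Returning 0.0 when a denominator is zero is a convention chosen here for simplicity; other conventions (such as reporting the metric as undefined) are equally defensible.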
