
Evaluating the model

As you saw when running the trainer component of the sample project, there are various elements of model evaluation. For each model type, there are different metrics to look at when analyzing the performance of a model.

In binary classification models like the one found in the example project, the following properties are exposed in CalibratedBinaryClassificationMetrics, the object returned by the Evaluate method. First, however, we need to define the four prediction types in binary classification:

  • True negative: Properly classified as negative
  • True positive: Properly classified as positive
  • False negative: Improperly classified as negative
  • False positive: Improperly classified as positive

The first metric to understand is Accuracy. As the name implies, accuracy is one of the most commonly used metrics when evaluating a model. This metric is calculated simply as the ratio of correctly classified predictions to total classifications.
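As a concrete illustration, accuracy can be computed directly from the four prediction counts defined above. The counts here are invented for the example, and the snippet is plain Python rather than ML.NET code:

```python
def accuracy(tp, tn, fp, fn):
    """Correct predictions (true positives plus true negatives)
    divided by the total number of predictions."""
    return (tp + tn) / (tp + tn + fp + fn)

# Invented counts: 90 TP, 85 TN, 10 FP, 15 FN out of 200 predictions
print(accuracy(tp=90, tn=85, fp=10, fn=15))  # 0.875
```

Note that accuracy can be misleading on imbalanced datasets, which is why the remaining metrics in this section are also worth examining.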

The next metric to understand is Precision. Precision is defined as the proportion of true positives among all the positive predictions the model makes. For example, a precision of 1 means there were no false positives, an ideal scenario. A false positive is classifying something as positive when it should be classified as negative, as mentioned previously. A common example of a false positive is misclassifying a file as malicious when it is actually benign.
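In the same illustrative style, precision reduces to a one-line calculation over invented counts (plain Python, not ML.NET code):

```python
def precision(tp, fp):
    """True positives divided by all positive predictions."""
    return tp / (tp + fp)

# Invented counts: 90 true positives, 10 false positives
print(precision(tp=90, fp=10))  # 0.9
```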

The next metric to understand is Recall. Recall is the fraction of actual positive cases that the model correctly identifies. For example, a recall of 1 means there were no false negatives, another ideal scenario. A false negative is classifying something as negative when it should have been classified as positive.
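Recall follows the same pattern, this time dividing by the actual positives (true positives plus false negatives). Again, the counts are invented and the snippet is plain Python:

```python
def recall(tp, fn):
    """True positives divided by all actual positive cases."""
    return tp / (tp + fn)

# Invented counts: 90 true positives, 15 false negatives
print(round(recall(tp=90, fn=15), 3))  # 0.857
```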

The next metric to understand is the F-score (also known as the F1 score), which combines precision and recall by taking their harmonic mean, so both false positives and false negatives lower the result. F-scores give another perspective on the performance of the model compared to simply looking at accuracy. The range of values is between 0 and 1, with an ideal value of 1.
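The harmonic mean of precision and recall is a short formula; here it is as a plain Python sketch, using a hypothetical precision of 0.9 and recall of 90/105:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Hypothetical values: precision = 0.9, recall = 90/105 (about 0.857)
print(round(f1_score(0.9, 90 / 105), 3))  # 0.878
```

Because it is a harmonic rather than arithmetic mean, the F1 score is dragged down sharply when either precision or recall is low.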

Area Under the Curve, also referred to as AUC, is, as the name implies, the area under the ROC curve, which plots the true positive rate on the y-axis against the false positive rate on the x-axis. For classifiers such as the model that we trained earlier in this chapter, as you saw, this returns values between 0 and 1, where higher values indicate better separation between the classes.
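One way to understand AUC without plotting the curve is its rank-based equivalent: AUC is the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. A minimal Python sketch with invented scores (this is a conceptual illustration, not how ML.NET computes it internally):

```python
def auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs where the positive
    example is scored higher; ties count as half a win. This equals
    the area under the ROC curve."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos_scores
        for n in neg_scores
    )
    return wins / (len(pos_scores) * len(neg_scores))

# Invented model scores for 3 positive and 3 negative examples
pos = [0.9, 0.8, 0.6]
neg = [0.7, 0.3, 0.2]
print(round(auc(pos, neg), 3))  # 8 of 9 pairs ranked correctly: 0.889
```

A value of 0.5 would correspond to random guessing, since positives and negatives would win equally often.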

Lastly, Average Log Loss and Training Log Loss both further describe the performance of the model. Average log loss expresses the penalty for wrong results in a single number by measuring the difference between the true classification and the probability the model predicts for it. Training log loss represents the uncertainty of the model, comparing its predicted probabilities against the known values. As you train your model, aim for a low number (lower numbers are better).
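Average log loss can be sketched as the mean negative log-likelihood of the true labels under the model's predicted probabilities. The labels and probabilities below are invented, and the snippet is plain Python rather than ML.NET code:

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of the true labels (0 or 1)
    under the predicted probabilities; lower is better."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Invented labels and predicted probabilities
print(round(log_loss([1, 0, 1], [0.9, 0.1, 0.8]), 3))  # 0.145
```

Confident wrong predictions (for example, predicting 0.99 for a negative example) are penalized far more heavily than hesitant ones, which is what makes log loss a useful complement to accuracy.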

As for the other model types, we will take a deep dive into how to evaluate them in their respective chapters, where we will cover regression and clustering metrics.
