Evaluating the model

As you saw when running the trainer component of the sample project, model evaluation has several elements. For each model type, there are different metrics to examine when analyzing a model's performance.

In binary classification models like the one found in the example project, the following properties are exposed in CalibratedBinaryClassificationMetrics, which is populated after calling the Evaluate method (a short sketch follows the list below). However, first, we need to define the four prediction types in a binary classification:

  • True negative: Properly classified as negative
  • True positive: Properly classified as positive
  • False negative: Improperly classified as negative
  • False positive: Improperly classified as positive
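To make this concrete, here is a minimal sketch of how these metrics are obtained in ML.NET. The variable names mlContext, trainedModel, and testData are assumed to carry over from the training steps earlier in this chapter; the Evaluate call and the metric properties come from Microsoft.ML's BinaryClassification catalog:

    // Sketch only: mlContext (MLContext), trainedModel (ITransformer), and
    // testData (IDataView) are assumed to exist from the earlier training steps.
    // Score the held-out test set with the trained model.
    IDataView predictions = trainedModel.Transform(testData);

    // Evaluate computes the metrics discussed in this section; it returns
    // CalibratedBinaryClassificationMetrics because the model emits
    // calibrated probabilities.
    CalibratedBinaryClassificationMetrics metrics =
        mlContext.BinaryClassification.Evaluate(predictions);

    Console.WriteLine($"Accuracy:         {metrics.Accuracy:0.###}");
    Console.WriteLine($"Precision:        {metrics.PositivePrecision:0.###}");
    Console.WriteLine($"Recall:           {metrics.PositiveRecall:0.###}");
    Console.WriteLine($"F1 Score:         {metrics.F1Score:0.###}");
    Console.WriteLine($"Area Under Curve: {metrics.AreaUnderRocCurve:0.###}");
    Console.WriteLine($"Average Log Loss: {metrics.LogLoss:0.###}");

Each of the printed properties corresponds to one of the metrics discussed in the remainder of this section.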

The first metric to understand is Accuracy. As the name implies, accuracy is one of the most commonly used metrics when evaluating a model. This metric is calculated simply as the ratio of correctly classified predictions to the total number of predictions.
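In terms of the four prediction types defined above, with TP, TN, FP, and FN denoting the counts of true positives, true negatives, false positives, and false negatives, accuracy can be written as:

    Accuracy = (TP + TN) / (TP + TN + FP + FN)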

The next metric to understand is Precision. Precision is defined as the proportion of true positives among all the results the model classified as positive. For example, a precision of 1 means there were no false positives, an ideal scenario. A false positive is classifying something as positive when it should be classified as negative, as mentioned previously. A common example of a false positive is misclassifying a file as malicious when it is actually benign.
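Using the same counts as before, precision is calculated as:

    Precision = TP / (TP + FP)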

The next metric to understand is Recall. Recall is the fraction of actual positives that the model correctly identifies. For example, a recall of 1 means there were no false negatives, another ideal scenario. A false negative is classifying something as negative when it should have been classified as positive.
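Expressed with the same counts, recall is calculated as:

    Recall = TP / (TP + FN)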

The next metric to understand is the F-score (specifically, the F1 score), which combines precision and recall into a single number, balancing the impact of false positives and false negatives. F-scores give another perspective on the performance of the model compared to simply looking at accuracy. The range of values is between 0 and 1, with an ideal value of 1.
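The F1 score is the harmonic mean of precision and recall:

    F1 = 2 * (Precision * Recall) / (Precision + Recall)

Because the harmonic mean is dominated by the smaller of the two values, a model must do well on both precision and recall to achieve a high F1 score.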

Area Under the Curve, also referred to as AUC, is, as the name implies, the area under the ROC curve, which plots the true positive rate on the y-axis and the false positive rate on the x-axis. For classifiers such as the model that we trained earlier in this chapter, this returns a value between 0 and 1, with values closer to 1 indicating better separation between the two classes.

Lastly, Average Log Loss and Training Log Loss are both used to further explain the performance of the model. The average log loss effectively expresses the penalty for wrong results in a single number by measuring the difference between the true classification and the probability the model assigned to it. Training log loss represents the uncertainty of the model, comparing its predicted probabilities against the known values. As you train your model, aim for a low value (lower is better).
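For reference, the average log loss over N predictions follows the standard cross-entropy form:

    LogLoss = -(1/N) * Σ [ y_i * log(p_i) + (1 - y_i) * log(1 - p_i) ]

where y_i is the true label (0 or 1) and p_i is the probability the model assigned to the positive class. A confident correct prediction contributes close to 0, while a confident wrong prediction contributes a large penalty. Note that the base of the logarithm (natural or base 2) varies by library and only scales the result.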

As for the other model types, we will take a deep dive into how to evaluate them in their respective chapters, where we will cover regression and clustering metrics.