
Evaluating the model

As you saw when running the trainer component of the sample project, there are various elements of model evaluation. For each model type, there are different metrics to look at when analyzing the performance of a model.

In binary classification models like the one found in the example project, the following properties are exposed in CalibratedBinaryClassificationMetrics, the object returned by the Evaluate method. First, however, we need to define the four prediction types in binary classification:

  • True negative: Properly classified as negative
  • True positive: Properly classified as positive
  • False negative: Improperly classified as negative
  • False positive: Improperly classified as positive

The first metric to understand is Accuracy. As the name implies, accuracy is one of the most commonly used metrics when evaluating a model. This metric is calculated simply as the ratio of correctly classified predictions to total classifications.
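As a concrete illustration, accuracy can be computed directly from the four prediction counts defined above. The counts here are invented for the example, and the snippet is plain Python rather than ML.NET code:

```python
def accuracy(tp, tn, fp, fn):
    """Correct predictions (true positives plus true negatives)
    divided by the total number of predictions."""
    return (tp + tn) / (tp + tn + fp + fn)

# Invented counts: 90 TP, 85 TN, 10 FP, 15 FN out of 200 predictions
print(accuracy(tp=90, tn=85, fp=10, fn=15))  # 0.875
```

Note that accuracy can be misleading on imbalanced datasets, which is why the remaining metrics in this section are also worth examining.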

The next metric to understand is Precision. Precision is defined as the proportion of true positives among all the positive predictions the model makes. For example, a precision of 1 means there were no false positives, an ideal scenario. A false positive is classifying something as positive when it should be classified as negative, as mentioned previously. A common example of a false positive is misclassifying a file as malicious when it is actually benign.
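In the same illustrative style, precision reduces to a one-line calculation over invented counts (plain Python, not ML.NET code):

```python
def precision(tp, fp):
    """True positives divided by all positive predictions."""
    return tp / (tp + fp)

# Invented counts: 90 true positives, 10 false positives
print(precision(tp=90, fp=10))  # 0.9
```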

The next metric to understand is Recall. Recall is the fraction of actual positive cases that the model correctly identifies. For example, a recall of 1 means there were no false negatives, another ideal scenario. A false negative is classifying something as negative when it should have been classified as positive.
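Recall follows the same pattern, this time dividing by the actual positives (true positives plus false negatives). Again, the counts are invented and the snippet is plain Python:

```python
def recall(tp, fn):
    """True positives divided by all actual positive cases."""
    return tp / (tp + fn)

# Invented counts: 90 true positives, 15 false negatives
print(round(recall(tp=90, fn=15), 3))  # 0.857
```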

The next metric to understand is the F-score (also known as the F1 score), which combines precision and recall by taking their harmonic mean, so both false positives and false negatives lower the result. F-scores give another perspective on the performance of the model compared to simply looking at accuracy. The range of values is between 0 and 1, with an ideal value of 1.
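The harmonic mean of precision and recall is a short formula; here it is as a plain Python sketch, using a hypothetical precision of 0.9 and recall of 90/105:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Hypothetical values: precision = 0.9, recall = 90/105 (about 0.857)
print(round(f1_score(0.9, 90 / 105), 3))  # 0.878
```

Because it is a harmonic rather than arithmetic mean, the F1 score is dragged down sharply when either precision or recall is low.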

Area Under the Curve, also referred to as AUC, is, as the name implies, the area under the ROC curve, which plots the true positive rate on the y-axis against the false positive rate on the x-axis. For classifiers such as the model that we trained earlier in this chapter, as you saw, this returns values between 0 and 1, where higher values indicate better separation between the classes.
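One way to understand AUC without plotting the curve is its rank-based equivalent: AUC is the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. A minimal Python sketch with invented scores (this is a conceptual illustration, not how ML.NET computes it internally):

```python
def auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs where the positive
    example is scored higher; ties count as half a win. This equals
    the area under the ROC curve."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos_scores
        for n in neg_scores
    )
    return wins / (len(pos_scores) * len(neg_scores))

# Invented model scores for 3 positive and 3 negative examples
pos = [0.9, 0.8, 0.6]
neg = [0.7, 0.3, 0.2]
print(round(auc(pos, neg), 3))  # 8 of 9 pairs ranked correctly: 0.889
```

A value of 0.5 would correspond to random guessing, since positives and negatives would win equally often.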

Lastly, Average Log Loss and Training Log Loss both further describe the performance of the model. Average log loss expresses the penalty for wrong results in a single number by measuring the difference between the true classification and the probability the model predicts for it. Training log loss represents the uncertainty of the model, comparing its predicted probabilities against the known values. As you train your model, aim for a low number (lower numbers are better).
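Average log loss can be sketched as the mean negative log-likelihood of the true labels under the model's predicted probabilities. The labels and probabilities below are invented, and the snippet is plain Python rather than ML.NET code:

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of the true labels (0 or 1)
    under the predicted probabilities; lower is better."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Invented labels and predicted probabilities
print(round(log_loss([1, 0, 1], [0.9, 0.1, 0.8]), 3))  # 0.145
```

Confident wrong predictions (for example, predicting 0.99 for a negative example) are penalized far more heavily than hesitant ones, which is what makes log loss a useful complement to accuracy.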

As for the other model types, we will take a deep dive into how to evaluate them in their respective chapters, where we will cover regression and clustering metrics.
