Evaluating classification

Is our classifier doing well? Is it better than another one? In classification, we count how many times we classify an instance correctly and how many times incorrectly. Suppose there are two possible class labels, yes and no; then there are four possible outcomes, as shown in the following table:

                 Predicted yes     Predicted no
  Actual yes     True positive     False negative
  Actual no      False positive    True negative

The four outcomes are as follows:

  • True positive (hit): This indicates a yes instance correctly predicted as yes
  • True negative (correct rejection): This indicates a no instance correctly predicted as no
  • False positive (false alarm): This indicates a no instance predicted as yes
  • False negative (miss): This indicates a yes instance predicted as no
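
The four outcomes listed above can be tallied directly from paired lists of actual and predicted labels. The following is a minimal sketch in Python, assuming the labels are the strings "yes" and "no"; the function name `confusion_counts` and the sample data are hypothetical and chosen only for illustration.

```python
from collections import Counter


def confusion_counts(actual, predicted, positive="yes"):
    """Tally true/false positives and negatives for a two-class problem."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    counts = Counter()
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            counts["TP"] += 1   # hit
        elif a != positive and p != positive:
            counts["TN"] += 1   # correct rejection
        elif a != positive and p == positive:
            counts["FP"] += 1   # false alarm
        else:
            counts["FN"] += 1   # miss
    # Return all four keys, even those with a zero count.
    return {key: counts[key] for key in ("TP", "TN", "FP", "FN")}


# Hypothetical labels for six instances.
actual = ["yes", "yes", "no", "no", "no", "yes"]
predicted = ["yes", "no", "no", "yes", "no", "yes"]
print(confusion_counts(actual, predicted))
# {'TP': 2, 'TN': 2, 'FP': 1, 'FN': 1}
```

Every instance falls into exactly one of the four buckets, so the counts always sum to the number of instances.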

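The two basic performance measures of a classifier are defined in terms of the counts of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). The first is classification error, the fraction of instances classified incorrectly:

$$\text{error} = \frac{FP + FN}{TP + TN + FP + FN}$$

The second is classification accuracy, the fraction of instances classified correctly:

$$\text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN} = 1 - \text{error}$$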
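
As a rough illustration, both measures can be computed from the four outcome counts. This sketch assumes the standard definitions (error is the fraction of misclassified instances, accuracy is the fraction of correctly classified ones); the function names and the counts are hypothetical.

```python
def classification_error(tp, tn, fp, fn):
    """Fraction of instances that were classified incorrectly."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one instance is required")
    return (fp + fn) / total


def classification_accuracy(tp, tn, fp, fn):
    """Fraction of instances that were classified correctly."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one instance is required")
    return (tp + tn) / total


# Hypothetical counts for 100 instances.
tp, tn, fp, fn = 40, 45, 5, 10
print(classification_error(tp, tn, fp, fn))     # 0.15
print(classification_accuracy(tp, tn, fp, fn))  # 0.85
```

The two values always add up to 1, so reporting either one conveys the same information.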

The main problem with these two measures is that they are misleading when the classes are unbalanced. Classifying whether a credit card transaction is an abuse or not is a typical example: 99.99% of transactions are normal and only a tiny fraction are abuses. A classifier that labels every transaction as normal is 99.99% accurate, yet it never detects a single abuse, and those rare cases are exactly the ones we are interested in.
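
The effect is easy to reproduce with a small experiment. The sketch below assumes a hypothetical dataset of 10,000 transactions containing exactly one abuse, and a trivial classifier that predicts "normal" for everything; the data and label strings are made up for illustration.

```python
# Hypothetical dataset: 9,999 normal transactions and a single abuse.
actual = ["normal"] * 9999 + ["abuse"]

# Trivial classifier: every transaction is predicted to be normal.
predicted = ["normal"] * len(actual)

# Accuracy: fraction of transactions whose prediction matches the truth.
correct = sum(a == p for a, p in zip(actual, predicted))
accuracy = correct / len(actual)

# How many of the real abuses did the classifier actually flag?
abuses_total = actual.count("abuse")
abuses_caught = sum(
    a == "abuse" and p == "abuse" for a, p in zip(actual, predicted)
)

print(f"accuracy: {accuracy:.4f}")
print(f"abuses caught: {abuses_caught} of {abuses_total}")
# accuracy: 0.9999
# abuses caught: 0 of 1
```

The accuracy looks excellent even though the classifier is useless for the task it was built for.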
