
Evaluating classification

Is our classifier doing well? Is it better than another one? In classification, we count how many instances are classified correctly and how many incorrectly. Suppose there are two possible class labels, yes and no; then there are four possible outcomes, as shown in the following table:

                  Predicted yes      Predicted no
  Actual yes      True positive      False negative
  Actual no       False positive     True negative

The four outcomes are:

  • True positive (hit): This indicates a yes instance correctly predicted as yes
  • True negative (correct rejection): This indicates a no instance correctly predicted as no
  • False positive (false alarm): This indicates a no instance predicted as yes
  • False negative (miss): This indicates a yes instance predicted as no
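
As a minimal sketch of how these four outcomes can be counted, the following Python snippet compares a list of actual labels with a list of predicted labels. The function name `confusion_counts`, the `positive` parameter, and the six sample labels are assumptions made for illustration only.

```python
from collections import Counter


def confusion_counts(actual, predicted, positive="yes"):
    """Count the four outcomes of a two-class problem (hypothetical helper)."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            counts["TP"] += 1  # true positive (hit)
        elif a != positive and p != positive:
            counts["TN"] += 1  # true negative (correct rejection)
        elif a != positive and p == positive:
            counts["FP"] += 1  # false positive (false alarm)
        else:
            counts["FN"] += 1  # false negative (miss)
    return counts


# Made-up labels, used only to exercise the function.
actual = ["yes", "no", "no", "yes", "no", "yes"]
predicted = ["yes", "no", "yes", "no", "no", "yes"]

counts = confusion_counts(actual, predicted)
print(counts["TP"], counts["TN"], counts["FP"], counts["FN"])  # 2 2 1 1
```

The four counts always add up to the number of instances, which is a quick sanity check on any such tally.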

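The two basic performance measures of a classifier are classification error and classification accuracy. Let TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives. Classification error is the fraction of instances that are classified incorrectly:

$$\text{error} = \frac{FP + FN}{TP + TN + FP + FN}$$

Classification accuracy is the fraction of instances that are classified correctly, so it equals one minus the error:

$$\text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN} = 1 - \text{error}$$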
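
The two measures can be sketched in a few lines of Python, assuming the usual definitions: error is the share of misclassified instances and accuracy is the share of correctly classified ones. The function names and the counts below are hypothetical and serve only as an illustration.

```python
def classification_error(tp, tn, fp, fn):
    """Fraction of instances classified incorrectly."""
    total = tp + tn + fp + fn
    return (fp + fn) / total


def classification_accuracy(tp, tn, fp, fn):
    """Fraction of instances classified correctly."""
    total = tp + tn + fp + fn
    return (tp + tn) / total


# Hypothetical counts for 100 instances.
tp, tn, fp, fn = 40, 45, 5, 10

error = classification_error(tp, tn, fp, fn)
accuracy = classification_accuracy(tp, tn, fp, fn)
print(error)     # 0.15
print(accuracy)  # 0.85
```

Because every instance is either classified correctly or incorrectly, the two values always sum to one.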

The main problem with these two measures is that they cannot handle unbalanced classes. Deciding whether a credit card transaction is an abuse is an example of a problem with unbalanced classes: 99.99% of transactions are normal and only a tiny fraction are abuses. A classifier that labels every transaction as normal is 99.99% accurate, yet it never detects an abuse, and those rare cases are precisely the ones we are interested in.
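
The effect is easy to reproduce. The sketch below builds a made-up dataset of 10,000 transactions containing a single abuse (so 99.99% are normal) and evaluates a trivial classifier that predicts normal every time. The label strings and the dataset size are assumptions chosen to match the percentages in the text.

```python
# Hypothetical dataset: 1 abuse among 10,000 transactions (99.99% normal).
actual = ["abuse"] + ["normal"] * 9999

# Trivial classifier: every transaction is predicted to be normal.
predicted = ["normal"] * len(actual)

# Accuracy: share of predictions that match the actual label.
correct = sum(a == p for a, p in zip(actual, predicted))
accuracy = correct / len(actual)

# Number of abuses the classifier actually detects (true positives).
abuses_caught = sum(
    a == "abuse" and p == "abuse" for a, p in zip(actual, predicted)
)

print(accuracy)       # 0.9999
print(abuses_caught)  # 0
```

The accuracy looks excellent, but the count of detected abuses shows that the classifier is useless for the task it was meant to solve.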
