Evaluating classification

Is our classifier doing well? Is it better than another one? In classification, we count how many times we classify an instance correctly and how many times incorrectly. Suppose there are two possible class labels, yes and no; then there are four possible outcomes, as shown in the following table:

                    Predicted: yes     Predicted: no
  Actual: yes       True positive      False negative
  Actual: no        False positive     True negative

The four outcomes are as follows:

  • True positive (hit): This indicates a yes instance correctly predicted as yes
  • True negative (correct rejection): This indicates a no instance correctly predicted as no
  • False positive (false alarm): This indicates a no instance predicted as yes
  • False negative (miss): This indicates a yes instance predicted as no
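As a minimal sketch of how the four outcomes can be tallied, the following Python snippet compares actual and predicted labels pair by pair. The function name `confusion_counts`, the `positive` parameter, and the sample labels are illustrative assumptions, not part of the original text.

```python
def confusion_counts(actual, predicted, positive="yes"):
    """Count the four outcomes of a two-class problem (hypothetical helper)."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1          # hit: yes predicted as yes
        elif a != positive and p != positive:
            tn += 1          # correct rejection: no predicted as no
        elif a != positive and p == positive:
            fp += 1          # false alarm: no predicted as yes
        else:
            fn += 1          # miss: yes predicted as no
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}


# Made-up labels for illustration only
actual    = ["yes", "yes", "no", "no",  "no", "yes"]
predicted = ["yes", "no",  "no", "yes", "no", "yes"]

counts = confusion_counts(actual, predicted)
print(counts)
```

Every instance falls into exactly one of the four categories, so the counts always sum to the number of instances.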

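The two basic performance measures of a classifier are classification error and classification accuracy. In the following formulas, TP, TN, FP, and FN denote the number of true positives, true negatives, false positives, and false negatives, respectively. Classification error is the fraction of instances that are classified incorrectly:

$$\text{error} = \frac{FP + FN}{TP + TN + FP + FN}$$

Classification accuracy is the fraction of instances that are classified correctly:

$$\text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN} = 1 - \text{error}$$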
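Both measures follow directly from the four outcome counts. The sketch below assumes the counts are already known; the function names and the example numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def classification_error(tp, tn, fp, fn):
    """Fraction of instances classified incorrectly."""
    total = tp + tn + fp + fn
    return (fp + fn) / total


def classification_accuracy(tp, tn, fp, fn):
    """Fraction of instances classified correctly."""
    total = tp + tn + fp + fn
    return (tp + tn) / total


# Made-up counts: 100 instances, 10 of them misclassified
err = classification_error(tp=40, tn=50, fp=6, fn=4)
acc = classification_accuracy(tp=40, tn=50, fp=6, fn=4)
print(err, acc)
```

Because every instance is either right or wrong, the two values always add up to 1, so reporting one of them is enough.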

The main problem with these two measures is that they cannot handle unbalanced classes. Classifying whether a credit card transaction is an abuse or not is an example of such a problem: 99.99% of the transactions are normal and only a tiny percentage are abuses. A classifier that labels every transaction as normal is 99.99% accurate, yet it never detects a single abuse, and those rare cases are precisely the ones we are interested in.
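The effect can be reproduced with a small sketch. It assumes a made-up dataset of 10,000 transactions containing exactly one abuse (matching the 99.99% figure) and a trivial classifier that always answers "normal"; the label strings and variable names are illustrative.

```python
# Hypothetical dataset: 1 abuse among 10,000 transactions
actual = ["abuse"] + ["normal"] * 9999

# Trivial classifier: predicts "normal" for every transaction
predicted = ["normal"] * len(actual)

# Accuracy: fraction of predictions that match the actual label
correct = sum(1 for a, p in zip(actual, predicted) if a == p)
accuracy = correct / len(actual)

# True positives: abuses that were actually flagged as abuses
detected_abuses = sum(
    1 for a, p in zip(actual, predicted) if a == "abuse" and p == "abuse"
)

print(f"accuracy: {accuracy:.2%}")
print(f"abuses detected: {detected_abuses}")
```

The accuracy looks excellent while the number of detected abuses is zero, which is why accuracy alone says little about performance on the rare class.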
