Binary and multiclass classification

The first classifier we saw, the threshold classifier, was a simple binary classifier (the result is either one class or the other as a point is either above the threshold or it is not). The second classifier we used, the nearest neighbor classifier, was a naturally multiclass classifier (the output can be one of several classes).

It is often simpler to define a binary method than one that works on multiclass problems. However, we can reduce a multiclass problem to a series of binary decisions. This is what we did earlier with the Iris dataset, in a haphazard way: we observed that it was easy to separate one of the initial classes and then focused on the other two, reducing the problem to two binary decisions:

  • Is it an Iris Setosa (yes or no)?
  • If no, check whether it is an Iris Virginica (yes or no).

Of course, we want to leave this sort of reasoning to the computer. As usual, there are several solutions to this multiclass reduction.

The simplest is to use a series of "one versus the rest" classifiers. For each possible label ℓ, we build a classifier of the type "is this ℓ or something else?". When applying the rule, ideally exactly one of the classifiers would say "yes" and we would have our solution. Unfortunately, this does not always happen, so we have to decide how to deal with either multiple positive answers or no positive answers.
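The one-versus-the-rest idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the binary learner here (`train_binary`, a nearest-centroid scorer) and the toy data are hypothetical stand-ins, not the classifiers used earlier in the text. To handle the case of several "yes" answers or none, this sketch has each binary classifier return a confidence score and picks the label with the highest one, which is one common choice among several.

```python
from math import dist


def centroid(points):
    """Mean of a list of equal-length feature tuples."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def train_binary(features, is_positive):
    """Hypothetical binary learner (an assumption for this sketch).

    Returns a scoring function: a score above zero means "yes",
    and a larger score means a more confident "yes".
    """
    pos = centroid([f for f, p in zip(features, is_positive) if p])
    neg = centroid([f for f, p in zip(features, is_positive) if not p])
    return lambda x: dist(x, neg) - dist(x, pos)


def train_one_vs_rest(features, labels):
    """Build one "is this label or something else?" classifier per label."""
    return {
        label: train_binary(features, [l == label for l in labels])
        for label in sorted(set(labels))
    }


def predict_one_vs_rest(models, x):
    """Ask every binary classifier and keep the most confident answer.

    Taking the maximum score resolves both problem cases:
    several positive answers and no positive answer at all.
    """
    scores = {label: score(x) for label, score in models.items()}
    return max(scores, key=scores.get)


# Toy data (made up for illustration): three well-separated classes.
features = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (0.0, 5.0), (0.1, 5.2)]
labels = ["a", "a", "b", "b", "c", "c"]

models = train_one_vs_rest(features, labels)
prediction = predict_one_vs_rest(models, (5.0, 5.1))
```

Any binary method that can report how confident it is could be dropped in place of `train_binary`. A classifier that only answers yes or no would need a different tie-breaking rule.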

Alternatively, we can build a classification tree. Split the possible labels in two and build a classifier that asks "should this example go to the left or the right bin?" We can perform this splitting recursively until we obtain a single label. The preceding diagram depicts the tree of reasoning for the Iris dataset. Each diamond is a single binary classifier. It is easy to imagine we could make this tree larger and encompass more decisions. This means that any classifier that can be used for binary classification can also be adapted to handle any number of classes in a simple way.
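The recursive splitting described above can be sketched as follows. The names `train_left_right`, `build_label_tree`, and `predict_with_tree` are hypothetical, the binary learner is a simple nearest-centroid rule chosen only to keep the example short, and the labels are split in sorted order, which is an arbitrary choice. A real system would likely choose the split more carefully.

```python
from math import dist


def centroid(points):
    """Mean of a sequence of equal-length feature tuples."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def train_left_right(features, goes_left):
    """Hypothetical binary learner: "left bin or right bin?"."""
    left = centroid([f for f, g in zip(features, goes_left) if g])
    right = centroid([f for f, g in zip(features, goes_left) if not g])
    return lambda x: dist(x, left) <= dist(x, right)


def build_label_tree(features, labels):
    """Recursively split the label set in two until one label remains.

    A leaf is a plain label. An inner node is a tuple
    (decide, left_subtree, right_subtree). This sketch assumes
    that labels are never tuples themselves.
    """
    classes = sorted(set(labels))
    if len(classes) == 1:
        return classes[0]
    left_classes = set(classes[: len(classes) // 2])
    decide = train_left_right(features, [l in left_classes for l in labels])
    left = [(f, l) for f, l in zip(features, labels) if l in left_classes]
    right = [(f, l) for f, l in zip(features, labels) if l not in left_classes]
    return (
        decide,
        build_label_tree(*zip(*left)),
        build_label_tree(*zip(*right)),
    )


def predict_with_tree(tree, x):
    """Follow binary decisions from the root down to a single label."""
    while isinstance(tree, tuple):
        decide, left, right = tree
        tree = left if decide(x) else right
    return tree


# Toy data (made up for illustration): three well-separated classes.
features = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (0.0, 5.0), (0.1, 5.2)]
labels = ["a", "a", "b", "b", "c", "c"]

tree = build_label_tree(features, labels)
prediction = predict_with_tree(tree, (0.0, 4.8))
```

With three labels, the resulting tree has the same shape as the Iris reasoning: the root separates one class from the other two, and a second binary decision separates those two.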

There are many other possible ways of turning a binary method into a multiclass one. No single method is clearly better in all cases, but the choice normally does not make much of a difference to the final result.

Most classifiers are binary systems while many real-life problems are naturally multiclass. Several simple protocols reduce a multiclass problem to a series of binary decisions and allow us to apply the binary models to our multiclass problem.
