Binary and multiclass classification

The first classifier we saw, the threshold classifier, was a simple binary classifier (the result is either one class or the other as a point is either above the threshold or it is not). The second classifier we used, the nearest neighbor classifier, was a naturally multiclass classifier (the output can be one of several classes).

It is often easier to define a binary method than one that works directly on multiclass problems. However, we can reduce the multiclass problem to a series of binary decisions. This is what we did earlier with the Iris dataset in a haphazard way; we observed that it was easy to separate one of the initial classes and focused on the other two, reducing the problem to two binary decisions:

  • Is it an Iris Setosa (yes or no)?
  • If no, check whether it is an Iris Virginica (yes or no).
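
The two decisions above can be written down as a pair of hand-made binary tests. This is only a sketch: the function name `classify_iris`, the choice of petal length and petal width as the features, and both threshold values are illustrative assumptions, not values taken from this section.

```python
def classify_iris(petal_length, petal_width,
                  setosa_max_length=2.0, virginica_min_width=1.6):
    """Chain two yes/no decisions to reach one of three labels.

    Both thresholds are illustrative placeholders (in cm).
    """
    # Decision 1: is it an Iris Setosa (yes or no)?
    if petal_length < setosa_max_length:
        return "Iris Setosa"
    # Decision 2: if no, is it an Iris Virginica (yes or no)?
    if petal_width > virginica_min_width:
        return "Iris Virginica"
    # Two "no" answers leave only the remaining class.
    return "Iris Versicolor"


print(classify_iris(petal_length=1.4, petal_width=0.2))
print(classify_iris(petal_length=5.5, petal_width=2.1))
```

Each `if` is a binary classifier in miniature; the three-way answer comes only from the order in which they are asked.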

Of course, we want to leave this sort of reasoning to the computer. As usual, there are several solutions to this multiclass reduction.

The simplest is to use a series of "one versus the rest" classifiers. For each possible label ℓ, we build a classifier of the type "is this ℓ or something else?". Ideally, when we apply the rule, exactly one of the classifiers says "yes" and we have our solution. Unfortunately, this does not always happen, so we have to decide how to deal with either multiple positive answers or no positive answers.
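
A minimal sketch of the one-versus-the-rest idea follows. Everything in it is an assumption made for illustration: the toy binary learner (`train_binary`, which compares distances to two class centroids), the helper names, and the made-up two-dimensional data. To cope with several "yes" answers or none at all, this sketch simply picks the label whose classifier gives the highest score; that is one possible tie-breaking rule, not the only one.

```python
import math


def centroid(points):
    """Mean of a list of equal-length tuples."""
    return tuple(sum(c) / len(points) for c in zip(*points))


def train_binary(points, is_positive):
    """Toy binary learner (an assumption for this sketch).

    Returns a scoring function: a score above zero means "yes",
    and a larger score means a more confident "yes".
    """
    pos = centroid([p for p, flag in zip(points, is_positive) if flag])
    neg = centroid([p for p, flag in zip(points, is_positive) if not flag])
    return lambda x: math.dist(x, neg) - math.dist(x, pos)


def train_one_vs_rest(points, labels):
    """Build one "is this label or something else?" classifier per label."""
    return {
        label: train_binary(points, [y == label for y in labels])
        for label in sorted(set(labels))
    }


def predict_one_vs_rest(classifiers, x):
    """Pick the label whose classifier is most confident.

    Taking the highest score handles both awkward cases:
    several "yes" answers, or no "yes" answer at all.
    """
    return max(classifiers, key=lambda label: classifiers[label](x))


# Made-up data: three well-separated groups in two dimensions.
points = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (0.0, 5.0), (0.1, 5.2)]
labels = ["a", "a", "b", "b", "c", "c"]

classifiers = train_one_vs_rest(points, labels)
print(predict_one_vs_rest(classifiers, (0.1, 0.2)))
```

Any binary learner that can report a confidence score could be substituted for `train_binary` without changing the other two functions.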

Alternatively, we can build a classification tree. Split the possible labels in two and build a classifier that asks "should this example go to the left or the right bin?" We can perform this splitting recursively until we obtain a single label. The preceding diagram depicts the tree of reasoning for the Iris dataset. Each diamond is a single binary classifier. It is easy to imagine we could make this tree larger and encompass more decisions. This means that any classifier that can be used for binary classification can also be adapted to handle any number of classes in a simple way.
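
The recursive splitting can be sketched as below. The toy binary learner, the helper names, the made-up data, and the rule for splitting the labels (sorted order, cut in half) are all assumptions made for illustration; in practice, how the labels are grouped at each node is a design choice.

```python
import math


def centroid(points):
    """Mean of a list of equal-length tuples."""
    return tuple(sum(c) / len(points) for c in zip(*points))


def train_binary(points, is_positive):
    """Toy binary learner (an assumption for this sketch).

    Returns a function that answers True when x is closer to the
    centroid of the positive examples than to that of the negatives.
    """
    pos = centroid([p for p, flag in zip(points, is_positive) if flag])
    neg = centroid([p for p, flag in zip(points, is_positive) if not flag])
    return lambda x: math.dist(x, pos) < math.dist(x, neg)


def build_tree(points, labels):
    """Recursively split the label set in two until one label remains."""
    classes = sorted(set(labels))
    if len(classes) == 1:
        return classes[0]  # leaf: a single label
    left_classes = set(classes[: len(classes) // 2])
    # One binary classifier per node: "left bin or right bin?"
    goes_left = train_binary(points, [y in left_classes for y in labels])
    left = [(p, y) for p, y in zip(points, labels) if y in left_classes]
    right = [(p, y) for p, y in zip(points, labels) if y not in left_classes]
    return (
        goes_left,
        build_tree([p for p, _ in left], [y for _, y in left]),
        build_tree([p for p, _ in right], [y for _, y in right]),
    )


def predict_tree(tree, x):
    """Follow binary decisions down to a leaf (labels must not be tuples)."""
    while isinstance(tree, tuple):
        goes_left, left, right = tree
        tree = left if goes_left(x) else right
    return tree


# Made-up data: three well-separated groups in two dimensions.
points = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (0.0, 5.0), (0.1, 5.2)]
labels = ["a", "a", "b", "b", "c", "c"]

tree = build_tree(points, labels)
print(predict_tree(tree, (4.8, 5.1)))
```

Unlike the one-versus-the-rest scheme, the tree always ends at exactly one label, so there are no conflicting or missing answers to resolve; the price is that a mistake near the root cannot be corrected further down.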

There are many other possible ways of turning a binary method into a multiclass one. There is no single method that is clearly better in all cases. However, which one you use normally does not make much of a difference to the final result.

Most classifiers are binary systems while many real-life problems are naturally multiclass. Several simple protocols reduce a multiclass problem to a series of binary decisions and allow us to apply the binary models to our multiclass problem.
