
Taking class imbalances into account

Class imbalance is a major problem when it comes to classification. The following diagram depicts the class densities of the five severity classes:

Figure 2.4: Class densities of the five severity classes

As the preceding chart shows, nearly 73% of the training data belongs to Class 0, which stands for no diabetic retinopathy. If we simply labeled every data point as Class 0, we would get 73% accuracy. That is not acceptable when patient health is at stake. We would rather have a test report a health condition that the patient does not have (a false positive) than have a test miss a condition that the patient does have (a false negative). A 73% accuracy means nothing if the model has merely learned to classify every point as Class 0.
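
To make the point concrete, here is a minimal sketch of a classifier that always predicts Class 0. Only the 73% share of Class 0 comes from the text; the counts for Classes 1 to 4 are made-up illustrative values, and the helper names `accuracy` and `recall_per_class` are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall_per_class(y_true, y_pred, num_classes):
    """For each class, the fraction of its true samples that were detected."""
    recalls = {}
    for c in range(num_classes):
        total = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls[c] = hits / total if total else float("nan")
    return recalls


# Illustrative labels: 73 of 100 samples are Class 0 (from the text);
# the split of the other 27 samples is an assumption for this example.
y_true = [0] * 73 + [1] * 7 + [2] * 15 + [3] * 3 + [4] * 2

# A degenerate "model" that labels every sample as Class 0.
y_pred = [0] * len(y_true)

baseline_accuracy = accuracy(y_true, y_pred)
baseline_recall = recall_per_class(y_true, y_pred, num_classes=5)

print("accuracy:", baseline_accuracy)
print("recall per class:", baseline_recall)
```

The accuracy looks respectable, yet the recall for every class with actual retinopathy is zero: all of those patients are false negatives.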

Detecting the higher severity classes is more important than doing well on the no-severity class. The problem with classification models trained with the log loss (cross-entropy) cost function is that they favor the majority class. The cross-entropy error is derived from the maximum likelihood principle, and every sample contributes to it equally, so the total loss is dominated by the majority class and the model tends to assign higher probability to that class. We can do one of two things:

  • Discard data from the classes with more samples (downsampling), or upsample the low-frequency classes, so that the distribution of samples among the classes is uniform.
  • Assign each class a weight in the loss function in inverse proportion to its density. This ensures that the low-frequency classes impose a higher penalty on the cost function when the model fails to classify them correctly.
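
The second option can be sketched as a class-weighted cross-entropy, in which each sample's log loss is multiplied by the weight of its true class. This is a minimal illustration rather than the training code itself; the function name `weighted_cross_entropy` and the example weights are assumptions.

```python
import math


def weighted_cross_entropy(probs, labels, class_weights):
    """Mean over samples of -w[y] * log(p[y]), where y is the true class.

    probs:         list of per-sample probability lists (one entry per class)
    labels:        list of true class indices
    class_weights: mapping from class index to its weight
    """
    total = 0.0
    for p, y in zip(probs, labels):
        total += -class_weights[y] * math.log(p[y])
    return total / len(labels)


# Two samples, each assigned probability 0.5 for its true class.
probs = [[0.5, 0.5], [0.5, 0.5]]
labels = [0, 1]

# Equal weights reproduce the ordinary cross-entropy.
plain_loss = weighted_cross_entropy(probs, labels, {0: 1.0, 1: 1.0})

# Hypothetical weights: an error on the rare class 1 costs four times as much.
weighted_loss = weighted_cross_entropy(probs, labels, {0: 1.0, 1: 4.0})

print(plain_loss, weighted_loss)
```

With equal weights both samples contribute the same penalty; with the larger weight on class 1, the same prediction error on the rare class dominates the loss.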

We will work with the second scheme, since it doesn't require us to generate more data or throw away existing data. If we take the class weights to be proportional to the inverse of the class frequencies, we get the following class weights:

 

We will use these weights while training the classification network.
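
As a sketch of how inverse-frequency weights can be computed from the training labels, the following uses one common normalization, w_c = N / (K * n_c), where N is the number of samples, K the number of classes, and n_c the count of class c. The normalization, the function name `inverse_frequency_weights`, and the label counts are assumptions for illustration, so the numbers it produces are not the weights referred to above.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return {class: N / (K * n_c)}, so rarer classes get larger weights."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * k) for c, k in sorted(counts.items())}


# Illustrative labels only: 73% Class 0 as in the text, other counts assumed.
train_labels = [0] * 73 + [1] * 7 + [2] * 15 + [3] * 3 + [4] * 2

class_counts = Counter(train_labels)
class_weights = inverse_frequency_weights(train_labels)

for c, w in class_weights.items():
    print(f"class {c}: count={class_counts[c]}, weight={w:.4f}")
```

Under this normalization, every class contributes the same total weight (count times weight) to the loss, which is what offsets the dominance of the majority class.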
