
Error measures

In general, when working with a supervised scenario, we define a non-negative error measure $e_m$ which takes two arguments (the expected and the predicted output) and allows us to compute a total error value over the whole dataset (made up of n samples):

$$E = \sum_{i=1}^{n} e_m\left(\hat{y}_i, y_i\right) \quad \text{with} \quad e_m \geq 0$$
This value is also implicitly dependent on the specific hypothesis H through the parameter set; therefore, optimizing the error implies finding an optimal hypothesis (considering the hardness of many optimization problems, this is usually not the absolute best one, but an acceptable approximation). In many cases, it's useful to consider the mean square error (MSE):

$$MSE = \frac{1}{n} \sum_{i=1}^{n} \left(\hat{y}_i - y_i\right)^2$$
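As a concrete reference, here is a minimal NumPy sketch (the function names and sample values are illustrative, not taken from the text) that computes the total error for a generic per-sample measure and the MSE:

```python
import numpy as np

def total_error(y_true, y_pred, error_measure):
    # Sum a non-negative per-sample error measure over the whole dataset
    return sum(error_measure(t, p) for t, p in zip(y_true, y_pred))

def mse(y_true, y_pred):
    # Mean square error: average squared difference between
    # expected and predicted outputs
    return np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)

y_true = np.array([1.0, 0.5, -0.2, 2.0])
y_pred = np.array([0.9, 0.7, -0.1, 1.5])

print(total_error(y_true, y_pred, lambda t, p: (t - p) ** 2))  # 0.31
print(mse(y_true, y_pred))                                     # 0.0775
```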
Its initial value represents a starting point over the surface of an n-variable function. A generic training algorithm has to find the global minimum, or a point quite close to it (there's always a tolerance, to avoid an excessive number of iterations and the consequent risk of overfitting). This measure is also called a loss function, because its value must be minimized through an optimization process. When it's easier to work with a quantity that must be maximized (such as a likelihood), the corresponding loss function is obtained by inverting it, for example by taking its negative or its reciprocal.
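To illustrate how such a generic training algorithm proceeds, here is a sketch of plain gradient descent with a tolerance-based stopping criterion (the learning rate, the tolerance value, and the simple quadratic loss are illustrative assumptions, not taken from the text):

```python
import numpy as np

def gradient_descent(grad, theta0, learning_rate=0.1, tol=1e-6, max_iter=1000):
    # Move against the gradient of the loss until the update becomes
    # smaller than the tolerance (or max_iter is reached)
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        step = learning_rate * grad(theta)
        theta -= step
        if np.linalg.norm(step) < tol:
            break
    return theta

# Example: minimize the convex loss L(theta) = ||theta - c||^2,
# whose gradient is 2 * (theta - c) and whose minimum is at c
c = np.array([1.0, -2.0])
print(gradient_descent(lambda t: 2.0 * (t - c), theta0=[0.0, 0.0]))
# -> approximately [ 1. -2.]
```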

Another useful loss function is called the zero-one loss, and it's particularly effective for binary classification (also with a one-vs-rest multiclass strategy):

$$L_{0/1}\left(\hat{y}_i, y_i\right) = \begin{cases} 0 & \text{if } \hat{y}_i = y_i \\ 1 & \text{if } \hat{y}_i \neq y_i \end{cases}$$

This function is implicitly an indicator, and it can easily be adopted in loss functions based on the probability of misclassification.
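A minimal sketch of this loss (the sample labels are illustrative; averaging the indicator over the dataset yields the empirical misclassification probability):

```python
import numpy as np

def zero_one_loss(y_true, y_pred):
    # Indicator: 1 for each misclassified sample, 0 otherwise
    return (np.asarray(y_true) != np.asarray(y_pred)).astype(int)

y_true = np.array([0, 1, 1, 0, 1])
y_pred = np.array([0, 1, 0, 0, 0])

print(zero_one_loss(y_true, y_pred))         # [0 0 1 0 1]
print(zero_one_loss(y_true, y_pred).mean())  # 0.4 (misclassification rate)
```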

A helpful interpretation of a generic (and continuous) loss function can be expressed in terms of potential energy, identifying the loss value at a given parameter vector with the energy of that configuration:

$$E_{pot}\left(\bar{\theta}\right) = L\left(\bar{\theta}\right)$$
The predictor is like a ball on a rough surface: starting from a random point where the energy (that is, the error) is usually rather high, it must move until it reaches a stable equilibrium point where its energy (relative to the global minimum) is null. In the following figure, there's a schematic representation of some different situations:

[Figure: a schematic energy landscape showing the starting point, a slope leading to the global minimum, a ridge, and a flat valley corresponding to a local minimum]
Just like in the physical situation, the starting point is stable without any external perturbation, so an initial kinetic energy must be provided to start the process. However, if this energy is too strong, the ball cannot stop in the global minimum after descending the slope: the residual kinetic energy can be enough to overcome the ridge and reach the valley on the right. If there are no other energy sources, the ball gets trapped in that flat valley and cannot move anymore. Many techniques have been engineered to solve this problem and avoid local minima; one of the best-known, momentum, borrows directly from this analogy (a sketch follows). However, every situation must always be carefully analyzed to understand what level of residual energy (or error) is acceptable, or whether it's better to adopt a different strategy. We're going to discuss some of these techniques in the next chapters.
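As an illustration of how the kinetic-energy analogy translates into an algorithm, here is a sketch of gradient descent with a momentum term (a standard technique, though the hyperparameter values and the test function below are illustrative assumptions). The velocity accumulates past gradients, much like the residual kinetic energy of the rolling ball, which in nonconvex landscapes can help the parameters roll through small ridges instead of stopping at the first stationary point:

```python
import numpy as np

def momentum_descent(grad, theta0, learning_rate=0.05, momentum=0.9,
                     tol=1e-6, max_iter=10000):
    # Heavy-ball update: the velocity term accumulates past gradients,
    # playing the role of the kinetic energy in the physical analogy
    theta = np.asarray(theta0, dtype=float)
    velocity = np.zeros_like(theta)
    for _ in range(max_iter):
        velocity = momentum * velocity - learning_rate * grad(theta)
        theta += velocity
        if np.linalg.norm(velocity) < tol:
            break
    return theta

# Same convex example as before: L(theta) = ||theta - c||^2
c = np.array([1.0, -2.0])
print(momentum_descent(lambda t: 2.0 * (t - c), theta0=[0.0, 0.0]))
# -> approximately [ 1. -2.]
```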
