
SVM

Now we are ready to understand SVMs. An SVM is an algorithm that can be used for both classification and regression. Given a set of examples, it builds a model that assigns some observations to one category and the rest to a second category. It is a non-probabilistic linear classifier, and the key assumption here is that the training data is linearly separable. All the observations or training data are represented as vectors mapped into a space, and the SVM tries to separate them with a margin that is as wide as possible:

Let's say there are two classes, A and B, as shown in the preceding plot.
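Before going further, here is a minimal sketch of training such a maximum-margin classifier in Python with scikit-learn; the tiny two-class dataset, the large value of C (to mimic a hard margin), and the test points are all made up purely for illustration:

import numpy as np
from sklearn.svm import SVC

# Hypothetical, linearly separable toy data: class A = +1, class B = -1
X = np.array([[1.0, 2.0], [2.0, 3.0], [2.0, 1.0],   # class A
              [6.0, 5.0], [7.0, 7.0], [8.0, 6.0]])  # class B
y = np.array([1, 1, 1, -1, -1, -1])

# A linear kernel fits the maximum-margin hyperplane w.x + b = 0;
# a large C approximates a hard margin for separable data
clf = SVC(kernel='linear', C=1e6)
clf.fit(X, y)

w, b = clf.coef_[0], clf.intercept_[0]
print("w =", w, "b =", b)
print(clf.predict([[3.0, 2.0], [7.0, 6.0]]))  # expected: [ 1 -1]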

And from the preceding section, we have learned the following:

g(x) = w.x + b

Where:

  • w: Weight vector that decides the orientation of the hyperplane
  • b: Bias term that decides the position of the hyperplane in n-dimensional space

The preceding equation is also called a linear discriminant function. If there is a vector x1 that lies on the positive side of the hyperplane, the equation becomes the following:

g(x1) = w.x1 + b > 0

The equation will become the following:

g(x1) < 0

If x1 lies on the negative side of the hyperplane.

What if g(x1) = 0? Can you guess where x1 would be? Well, yes, it would lie on the hyperplane itself. Since our goal is to find out the class of the vector, we can say the following:

g(x1) > 0 => x1 belongs to Class A

g(x1) < 0 => x1 belongs to Class B
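As a quick illustration of this decision rule, the following sketch computes g(x) = w.x + b and reads the class off its sign; the values of w, b, and the sample points are hypothetical and chosen only for demonstration:

import numpy as np

w = np.array([1.0, -1.0])   # hypothetical weight vector (orientation)
b = -0.5                    # hypothetical bias term (position)

def g(x):
    # linear discriminant function g(x) = w.x + b
    return np.dot(w, x) + b

for x in [np.array([3.0, 1.0]), np.array([0.0, 2.0])]:
    if g(x) > 0:
        label = "Class A"
    elif g(x) < 0:
        label = "Class B"
    else:
        label = "on the hyperplane"
    print(x, "g(x) =", g(x), "->", label)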

Here, it's evident that we can determine the classification by using the previous equation. But can you see the issue with it? Let's say the boundary line looks like the one in the following plot:

Even in the preceding scenario, we are able to classify the feature vectors. But is it desirable? What can be seen here is that the boundary line, or the classifier, is close to Class B. This introduces a large bias in favor of Class A while penalizing Class B. As a result, any disturbance in the vectors close to the boundary might cause them to cross over and be assigned to Class A, which might not be correct. Hence, our goal is to find an optimal classifier that has the widest possible margin, as shown in the following plot:

Through SVM, we are attempting to find a boundary or hyperplane such that the distance from each of the feature vectors to the boundary is maximized, so that any slight noise or disturbance won't change the classification. So, in this scenario, if we introduce yi, which is the class label of xi, we get the following:

yi= ± 1

yi(w.xi + b) will always be greater than 0, that is, yi(w.xi + b) > 0. This is because when xi ∈ class A, w.xi + b > 0 and yi = +1, so the whole term is positive. Similarly, when xi ∈ class B, w.xi + b < 0 and yi = -1, which again makes the term positive.
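The following short check, reusing the hypothetical w and b from the earlier sketch, confirms that yi(w.xi + b) stays positive for points on either side of the hyperplane:

import numpy as np

w, b = np.array([1.0, -1.0]), -0.5
samples = [(np.array([3.0, 1.0]), +1),   # xi in class A, so yi = +1
           (np.array([0.0, 2.0]), -1)]   # xi in class B, so yi = -1

for xi, yi in samples:
    print(xi, yi * (np.dot(w, xi) + b) > 0)   # True in both cases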

So, now if we have to redesign it, we say the following:

w.xi + b ≥ γ, where γ is a measure of the distance of xi from the hyperplane.

And if there is a hyperplane w.x + b = 0, then the distance of point x from the preceding hyperplane is as follows:

(w.x + b) / ||w||

Hence, as mentioned previously:

(w.x + b) / ||w|| ≥ γ

w.x + b ≥ γ.||w||

On performing proper scaling, we can say the following:

w.x + b ≥ 1 (since γ.||w|| = 1)

This implies that if a classification is to be made based on the previous result, it follows that:

w.x + b ≥ 1 if x ∈ class A and

w.x + b ≤ -1 if x ∈ class B
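To make the distance formula and the scaled conditions concrete, here is a small sketch; the hyperplane parameters and the two points are hypothetical, chosen so that each point sits at least a unit of scaled distance from the boundary:

import numpy as np

w, b = np.array([2.0, 1.0]), -4.0          # hypothetical hyperplane w.x + b = 0
norm_w = np.linalg.norm(w)

def distance(x):
    # signed distance of x from the hyperplane: (w.x + b) / ||w||
    return (np.dot(w, x) + b) / norm_w

x_a, x_b = np.array([3.0, 1.0]), np.array([1.0, 0.0])
print(distance(x_a))                        # positive -> class A side
print(distance(x_b))                        # negative -> class B side
print(np.dot(w, x_a) + b >= 1)              # scaled condition for class A: True
print(np.dot(w, x_b) + b <= -1)             # scaled condition for class B: True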

And now, again, if we bring in the class label yi here, the equation becomes the following:

yi (w.xi + b) ≥ 1

But, if yi (w.xi + b) = 1, xi is a support vector. Next, we will learn what a support vector is.
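As a preview, the following sketch uses scikit-learn to pull out the support vectors of the toy dataset from the earlier example and verifies that each of them satisfies yi(w.xi + b) ≈ 1; the dataset and the large C are, as before, illustrative assumptions:

import numpy as np
from sklearn.svm import SVC

X = np.array([[1.0, 2.0], [2.0, 3.0], [2.0, 1.0],
              [6.0, 5.0], [7.0, 7.0], [8.0, 6.0]])
y = np.array([1, 1, 1, -1, -1, -1])

clf = SVC(kernel='linear', C=1e6).fit(X, y)

# The points for which yi*(w.xi + b) = 1 are the support vectors
print(clf.support_vectors_)
margins = y[clf.support_] * clf.decision_function(clf.support_vectors_)
print(np.round(margins, 3))   # each value is approximately 1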
