
SVM

Now we are ready to understand SVMs. An SVM is an algorithm that can be used for both classification and regression. Given a set of examples, it builds a model that assigns some observations to one category and the rest to a second category. It is a non-probabilistic linear classifier; the training data being linearly separable is the key assumption here. Every observation, or training example, is represented as a vector mapped into a space, and the SVM tries to separate the classes with a margin that is as wide as possible:

Let's say there are two classes, A and B, as shown in the preceding plot.

And from the preceding section, we have learned the following:

g(x) = w·x + b

Where:

  • w: Weight vector that decides the orientation of the hyperplane
  • b: Bias term that decides the position of the hyperplane in n-dimensional space

The preceding equation is also called a linear discriminant function. If there is a vector x1 that lies on the positive side of the hyperplane, the equation becomes the following:

g(x1) = w·x1 + b > 0

Similarly, if x1 lies on the negative side of the hyperplane, the equation becomes the following:

g(x1) = w·x1 + b < 0

What if g(x1) = 0? Can you guess where x1 would be? Well, yes, it would lie on the hyperplane itself. Remember that our goal is to find out the class of the vector.

So, if g(x1) > 0 => x1 belongs to Class A, and if g(x1) < 0 => x1 belongs to Class B.
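As a quick illustration of this decision rule, here is a minimal sketch in NumPy. The weight vector w, bias b, and sample points are made-up values for illustration, not taken from the text; the point is simply that the sign of g(x) = w·x + b decides the class:

import numpy as np

# Hypothetical hyperplane parameters (illustrative values only)
w = np.array([2.0, 1.0])   # weight vector: orientation of the hyperplane
b = -4.0                   # bias term: position of the hyperplane

def g(x):
    # Linear discriminant function g(x) = w.x + b
    return np.dot(w, x) + b

def classify(x):
    # Assign a point to Class A (g > 0) or Class B (g < 0)
    value = g(x)
    if value > 0:
        return 'Class A'
    if value < 0:
        return 'Class B'
    return 'On the hyperplane'

print(classify(np.array([3.0, 2.0])))   # g = 4.0  -> Class A
print(classify(np.array([1.0, 1.0])))   # g = -1.0 -> Class B
print(classify(np.array([1.0, 2.0])))   # g = 0.0  -> on the hyperplane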

Here, it's evident that we can arrive at the classification by using the previous equation. But can you see the issue with it? Let's say the boundary line looks like the one in the following plot:

Even in the preceding scenario, we are able to classify the feature vectors. But is it desirable? What can be seen here is that the boundary line, or classifier, sits close to Class B. This introduces a large bias in favor of Class A while penalizing Class B: any slight disturbance in the vectors close to the boundary might cause them to cross over and be assigned to Class A, which might not be correct. Hence, our goal is to find an optimal classifier with the widest possible margin, as shown in the following plot:

Through SVM, we attempt to create a boundary or hyperplane such that the distance from each of the feature vectors to the boundary is maximized, so that any slight noise or disturbance won't cause a change in classification. Now, in this scenario, if we bring in yi, the class label belonging to xi, we get the following:

yi = ±1

yi(w·xi + b) will always be greater than 0, that is, yi(w·xi + b) > 0. This is because when xi ∈ Class A, w·xi + b > 0 and yi = +1, so the whole term is positive. Similarly, when xi ∈ Class B, w·xi + b < 0 and yi = -1, which again makes the term positive.
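As a small sanity check of this observation, the following sketch (reusing the same hypothetical w and b as before) confirms that yi(w·xi + b) comes out positive for correctly classified points on either side of the hyperplane:

import numpy as np

w = np.array([2.0, 1.0])   # hypothetical weight vector
b = -4.0                   # hypothetical bias term

# A few labelled points: yi = +1 for Class A, yi = -1 for Class B
points = [
    (np.array([3.0, 2.0]), +1),   # Class A side: w.x + b = 4.0
    (np.array([4.0, 1.0]), +1),   # Class A side: w.x + b = 5.0
    (np.array([1.0, 1.0]), -1),   # Class B side: w.x + b = -1.0
    (np.array([0.5, 0.5]), -1),   # Class B side: w.x + b = -2.5
]

for xi, yi in points:
    print(yi * (np.dot(w, xi) + b))   # positive for every correctly classified point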

So, if we now strengthen this condition, we say the following:

w·xi + b > γ, where γ is a measure of the distance of xi from the hyperplane.

And if there is a hyperplane w·x + b = 0, then the distance of a point x from this hyperplane is as follows:

(w·x + b) / ||w||
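As a quick check of this formula, the following sketch (again with made-up values for w, b, and the points) computes the signed distance (w·x + b) / ||w|| of a point from the hyperplane w·x + b = 0:

import numpy as np

w = np.array([2.0, 1.0])   # hypothetical hyperplane parameters
b = -4.0

def distance_from_hyperplane(x):
    # Signed distance of point x from the hyperplane w.x + b = 0
    return (np.dot(w, x) + b) / np.linalg.norm(w)

print(distance_from_hyperplane(np.array([3.0, 2.0])))   # positive: Class A side
print(distance_from_hyperplane(np.array([1.0, 1.0])))   # negative: Class B side
print(distance_from_hyperplane(np.array([1.0, 2.0])))   # 0.0: on the hyperplane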

Hence, as mentioned previously:

(w·x + b) / ||w|| ≥ γ

w·x + b ≥ γ·||w||

By scaling w and b appropriately, we can say the following:

w·x + b ≥ 1 (choosing the scaling so that γ·||w|| = 1)

This implies that if a classification is to be made based on the previous result, it follows that:

w·x + b ≥ 1 if x ∈ Class A and

w·x + b ≤ -1 if x ∈ Class B

And now, if we again bring in the class label yi, the equation becomes the following:

yi(w·xi + b) ≥ 1

But if yi(w·xi + b) = 1, then xi is a support vector. Next, we will learn what a support vector is.
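To see these constraints in practice, here is a minimal sketch using scikit-learn's SVC with a linear kernel on a tiny made-up, linearly separable dataset (the points and the large C value are assumptions for illustration; a large C approximates the hard-margin case discussed in this section). The last lines verify that yi(w·xi + b) is approximately 1 for the support vectors and at least 1 for every other point:

import numpy as np
from sklearn.svm import SVC

# Tiny made-up, linearly separable dataset: Class A (+1) vs Class B (-1)
X = np.array([[3.0, 3.0], [4.0, 3.0], [3.5, 4.0],    # Class A
              [1.0, 1.0], [0.5, 1.5], [1.5, 0.5]])   # Class B
y = np.array([1, 1, 1, -1, -1, -1])

# A large C approximates the hard-margin SVM described above
clf = SVC(kernel='linear', C=1e6)
clf.fit(X, y)

w = clf.coef_[0]          # learned weight vector
b = clf.intercept_[0]     # learned bias term

print("Support vectors:\n", clf.support_vectors_)

# yi(w.xi + b) is ~1 for the support vectors and >= 1 for all other points
margins = y * (X @ w + b)
print("Margins:", np.round(margins, 3))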
