Linear models for classification

In this section, we are going to go through logistic regression, which is one of the most widely used algorithms for classification.

What's logistic regression? The simple definition of logistic regression is that it's a type of classification algorithm involving a linear discriminant.

We are going to clarify this definition in two points:

  1. Unlike linear regression, logistic regression doesn't try to estimate/predict the value of a numeric variable given a set of input features. Instead, the output of the logistic regression algorithm is the probability that a given sample/observation belongs to a specific class. In simpler words, let's assume that we have a binary classification problem, in which the output variable has only two classes, for example, diseased or not diseased. If the probability that a certain sample belongs to the diseased class is P0, then the probability that it belongs to the not diseased class is P1 = 1 - P0. Thus, the output of the logistic regression algorithm is always between 0 and 1 (see the short sketch after Figure 5).

  2. As you probably know, there are a lot of learning algorithms for regression and classification, and each one makes its own assumptions about the data samples. The ability to choose the learning algorithm that fits your data will come gradually with practice and a good understanding of the subject. The central assumption of the logistic regression algorithm is that our input/feature space can be separated into two regions (one for each class) by a linear surface, which is a line if we only have two features, a plane if we have three, and so on. The position and orientation of this boundary are determined by your data. If your data can be separated into regions corresponding to each class by such a linear surface, it is said to be linearly separable. The following figure illustrates this assumption. In Figure 5, we have three dimensions (input features) and two possible classes: diseased (red) and not diseased (blue). The dividing plane that separates the two regions from each other is called a linear discriminant, because it is linear and it helps the model discriminate between samples belonging to different classes:

Figure 5: Linear decision surface separating two classes
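
To make the two points above concrete, here is a minimal sketch (not taken from the book) of how logistic regression turns the score of a linear discriminant into a class probability. The feature values, weights, and bias below are made-up numbers used purely for illustration:

```python
import numpy as np

def sigmoid(z):
    """Squash a real-valued score into the (0, 1) range."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical weights and bias for a model with three input features
weights = np.array([0.8, -1.2, 0.5])
bias = -0.3

# A single sample/observation with three feature values (illustrative only)
sample = np.array([1.5, 0.4, 2.1])

# Linear discriminant: a weighted sum of the features plus the bias
score = np.dot(weights, sample) + bias

# Probability that this sample belongs to the diseased class
p_diseased = sigmoid(score)
p_not_diseased = 1 - p_diseased  # the two probabilities always sum to 1

print(f"P(diseased)     = {p_diseased:.3f}")
print(f"P(not diseased) = {p_not_diseased:.3f}")
```

Because the sigmoid maps any real-valued score into the (0, 1) range, the two class probabilities always sum to 1, which is exactly the behavior described in the first point.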

If your data samples aren't linearly separable, you can often make them so by transforming your data into a higher-dimensional space, that is, by adding more features, as the sketch below illustrates.
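
The following sketch (again, not from the book) illustrates that idea with scikit-learn's LogisticRegression on synthetic 2D data where one class lies inside a circle and the other outside; the dataset and the degree-2 feature expansion are assumptions made purely for this example:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import PolynomialFeatures

# Synthetic 2D data: class 1 lies inside a circle, class 0 outside it,
# so no straight line can separate the two classes in the original space.
rng = np.random.RandomState(42)
X = rng.uniform(-1, 1, size=(200, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 < 0.5).astype(int)

# A plain linear discriminant struggles with this data
linear_model = LogisticRegression().fit(X, y)
print("Accuracy in the original 2D space:", linear_model.score(X, y))

# Adding squared and interaction terms lifts the data into a
# higher-dimensional space where a linear surface can separate the classes
poly = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly.fit_transform(X)
expanded_model = LogisticRegression().fit(X_poly, y)
print("Accuracy with the added features:  ", expanded_model.score(X_poly, y))
```

In the expanded space, the learned boundary is still linear in the five features (x1, x2, x1^2, x1*x2, x2^2), even though it corresponds to a circle when projected back into the original two-dimensional feature space.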
