
Training the classifier

Creating a logistic regression classifier involves pretty much the same steps as setting up k-NN:

In [14]: lr = cv2.ml.LogisticRegression_create()

We then have to specify the desired training method. Here, we can choose cv2.ml.LogisticRegression_BATCH or cv2.ml.LogisticRegression_MINI_BATCH. For now, all we need to know is that we want to update the model after every data point, which can be achieved with the following code:

In [15]: lr.setTrainMethod(cv2.ml.LogisticRegression_MINI_BATCH)
... lr.setMiniBatchSize(1)
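
To see what these two training methods mean conceptually, here is a minimal NumPy sketch (plain Python, not OpenCV; the toy data and learning rate are assumptions for illustration). BATCH computes the gradient over the entire training set before each weight update, whereas MINI_BATCH with a batch size of 1 updates the weights after every single sample:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(42)
X = np.hstack([rng.normal(size=(6, 4)), np.ones((6, 1))])  # 4 features + bias column
y = rng.integers(0, 2, size=6).astype(np.float64)          # binary labels
w = np.zeros(5)                                             # initial weights
alpha = 0.1                                                 # learning rate (assumed)

# BATCH: a single update per pass, averaging the gradient over all samples
grad = X.T @ (sigmoid(X @ w) - y) / len(y)
w_batch = w - alpha * grad

# MINI_BATCH with size 1: one update per sample, in order
w_mini = w.copy()
for xi, yi in zip(X, y):
    w_mini -= alpha * (sigmoid(xi @ w_mini) - yi) * xi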

We also want to specify the number of iterations the algorithm should run before it terminates:

In [16]: lr.setIterations(100)
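
For the train call below to succeed, X_train and y_train need to exist already; they come from the data-loading step earlier in the chapter. A minimal sketch of what such a two-class setup might look like (the Iris choice and the split parameters here are assumptions, not part of this section):

import numpy as np
from sklearn import datasets, model_selection

iris = datasets.load_iris()
idx = iris.target != 2                    # keep only the first two classes
X = iris.data[idx].astype(np.float32)     # OpenCV expects float32 samples
y = iris.target[idx].astype(np.float32)   # ... and float32 responses here
X_train, X_test, y_train, y_test = model_selection.train_test_split(
    X, y, random_state=42, train_size=0.9)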

We can then call the train method of the object (in the exact same way as we did earlier), which will return True upon success:

In [17]: lr.train(X_train, cv2.ml.ROW_SAMPLE, y_train)
Out[17]: True

As we just saw, the goal of the training phase is to find a set of weights that best transform the feature values into an output label. A single data point is given by its four feature values (f0, f1, f2, f3). Since we have four features, we should also get four weights, so that x = w0 f0 + w1 f1 + w2 f2 + w3 f3, and ŷ = σ(x). However, as discussed previously, the algorithm adds an extra weight that acts as an offset or bias, so that x = w0 f0 + w1 f1 + w2 f2 + w3 f3 + w4. We can retrieve these weights as follows:

In [18]: lr.get_learnt_thetas()
Out[18]: array([[-0.04109113, -0.01968078, -0.16216497, 0.28704911, 0.11945518]], dtype=float32)

This means that the input to the logistic function is x = -0.0411 f0 - 0.0197 f1 - 0.162 f2 + 0.287 f3 + 0.119. Then, when we feed in a new data point (f0, f1, f2, f3) that belongs to class 1, the output ŷ = σ(x) should be close to 1. But how well does that actually work?
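
Before answering that, we can sanity-check the weighted sum by hand; a minimal sketch, assuming NumPy is imported as np and following the text's convention that the last theta is the bias (the test point below is hypothetical):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

w = lr.get_learnt_thetas().ravel()    # [w0, w1, w2, w3, w4]
f = np.array([5.9, 3.0, 4.2, 1.5])    # a hypothetical data point (f0..f3)
x = np.dot(w[:4], f) + w[4]           # weighted sum plus the bias term
y_hat = sigmoid(x)                    # near 1 -> class 1, near 0 -> class 0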
