Hill climbing and descent

We will go back to our earlier example, the loss hill. We want to find a way to select a set of theta parameters that minimizes our loss function, L. As we've already established, we need to climb or descend the hill and understand where we are with respect to our neighboring points without having to compute everything. To do that, we need to be able to measure the slope of the curve with respect to the theta parameters. Going back to our house example, we want to know how much an incremental change in the cost per square foot improves the estimate. Once we know that, we can start taking directional steps toward the best estimate. If you make a bad guess, you can turn around and go in exactly the other direction. We can either climb or descend the hill depending on our metric, which allows us to optimize the parameters of a function that we want to learn irrespective of how the function itself performs. This is a layer of abstraction. This optimization process is called gradient descent, and it underpins many of the machine learning algorithms that we will discuss in this book.
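To make the idea of measuring a slope and stepping against it concrete, here is a minimal sketch on a one-parameter problem. The loss function, the starting point, and the step size are all hypothetical choices made for illustration; they are not taken from the house example.

```python
def loss(theta):
    # Hypothetical one-parameter loss whose minimum sits at theta = 3.
    return (theta - 3.0) ** 2


def slope(f, theta, eps=1e-6):
    # Estimate the slope at theta from two nearby points
    # (a central finite difference), without evaluating the whole curve.
    return (f(theta + eps) - f(theta - eps)) / (2 * eps)


theta = 0.0      # deliberately poor initial guess
step_size = 0.1  # assumed step size for this toy problem

for _ in range(100):
    # A positive slope means the loss rises to the right, so step left,
    # and vice versa.
    theta -= step_size * slope(loss, theta)

print(round(theta, 4))
```

Each step only needs the slope at the current point, which is why the procedure works without mapping out the whole hill first.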

The following code shows a simple example of how we can compute the gradient with respect to theta for a small input matrix. This example is actually a simplified snippet of the learning component of logistic regression:

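import numpy as np

seed = 42

# Toy data: five samples with three features each, plus binary labels
X = np.random.RandomState(seed).rand(5, 3).round(4)
y = np.array([1, 1, 0, 1, 0])

# The sigmoid (logistic) function maps any real number into (0, 1)
h = (lambda X: 1. / (1. + np.exp(-X)))

theta = np.zeros(3)  # parameters to learn, initialized to zero
lam = 0.05           # learning rate (step size)

def iteration(theta):
    y_hat = h(X.dot(theta))        # current predictions
    residuals = y - y_hat
    gradient = X.T.dot(residuals)
    theta += gradient * lam        # in-place update, so the caller's theta changes too
    print("y hat: %r" % y_hat.round(3).tolist())
    print("Gradient: %r" % gradient.round(3).tolist())
    print("New theta: %r\n" % theta.round(3).tolist())

iteration(theta)
iteration(theta)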

At the very top, we create a random X and a small y vector of labels, which is not part of the algorithm. Here, h is the sigmoid function, also called the logistic function. The word logistic comes from logistic progression. This is a necessary transformation that is applied in logistic regression. Just understand that we have to apply it; it's part of the function. Next, we initialize our theta vector, with respect to which we're going to compute our gradient, as zeros. Those are our parameters. Now, for each iteration, we're going to get our y_hat, which is our estimated y, if you recall. We get that by taking the dot product of our X matrix with our theta parameters and pushing the result through the logistic function, h.
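As a small illustration of what the logistic function does to that dot product, the sketch below uses a hypothetical 3 x 2 matrix (X_demo and theta_zero are made-up names, not the X and theta from the example). With every parameter at zero, the dot product is zero for every row, and the sigmoid of zero is exactly 0.5, which is why the first "y hat" line in the output is all 0.5.

```python
import numpy as np


def sigmoid(z):
    # Logistic function: squashes any real value into the interval (0, 1).
    return 1.0 / (1.0 + np.exp(-z))


# Hypothetical inputs, chosen only for illustration.
X_demo = np.array([[0.2, 0.7],
                   [0.9, 0.1],
                   [0.5, 0.5]])
theta_zero = np.zeros(2)

# With all-zero parameters, every row's dot product is 0, so every
# estimate is sigmoid(0) = 0.5.
y_hat_zero = sigmoid(X_demo.dot(theta_zero))
print(y_hat_zero.tolist())

# Large positive inputs approach 1, large negative inputs approach 0.
extremes = sigmoid(np.array([-10.0, 10.0]))
print(extremes.round(4).tolist())
```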

Now, we want to compute the gradient from the residuals and the input matrix, X, of our predictors. The residuals are simply y minus y_hat. How do we get the gradient? The gradient is just the dot product between the transpose of the input matrix, X.T, and those residuals. Because the residuals are y minus y_hat, this vector points in the direction that increases the log-likelihood, which is the same direction that decreases the loss. We will use that gradient to determine which direction we need to step in. The way we do that is by adding the gradient, scaled by lam, to our theta vector. The lam value (short for lambda) regulates how quickly we step up or down that gradient, so it's our learning rate. Think of it as a step size: going back to that dark room example, if it's too large, it's easy to overstep the lowest point. But if it's too small, you're going to spend forever inching around the room. It's a bit of a balancing act, but it allows us to regulate the pace at which we update our theta values and descend our gradient. Again, this algorithm is something we will cover in the next chapter.
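One way to convince yourself that this dot product really is a slope is to compare it against a numerical estimate. The following is a sketch under stated assumptions: the log_likelihood helper, the finite-difference check, and the test point theta_chk are illustrative additions, not part of the book's snippet.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def log_likelihood(theta, X, y):
    # Bernoulli log-likelihood of the labels under the logistic model.
    p = sigmoid(X.dot(theta))
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))


def analytic_gradient(theta, X, y):
    # Same expression as in the example: X transposed times the residuals.
    return X.T.dot(y - sigmoid(X.dot(theta)))


def numeric_gradient(theta, X, y, eps=1e-6):
    # Central finite difference, nudging one parameter at a time.
    grad = np.zeros_like(theta)
    for j in range(theta.size):
        bump = np.zeros_like(theta)
        bump[j] = eps
        grad[j] = (log_likelihood(theta + bump, X, y)
                   - log_likelihood(theta - bump, X, y)) / (2 * eps)
    return grad


X_chk = np.random.RandomState(42).rand(5, 3)
y_chk = np.array([1, 1, 0, 1, 0])
theta_chk = np.array([0.1, -0.2, 0.3])  # arbitrary test point

max_diff = np.max(np.abs(analytic_gradient(theta_chk, X_chk, y_chk)
                         - numeric_gradient(theta_chk, X_chk, y_chk)))
print(max_diff)
```

The two estimates should agree to several decimal places, which confirms that adding this vector to theta moves toward higher log-likelihood and therefore lower loss.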

We get the output of the preceding code as follows:

y hat: [0.5, 0.5, 0.5, 0.5, 0.5]
Gradient: [0.395, 0.024, 0.538]
New theta: [0.02, 0.001, 0.027]

y hat: [0.507, 0.504, 0.505, 0.51, 0.505]
Gradient: [0.378, 0.012, 0.518]
New theta: [0.039, 0.002, 0.053]

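This example demonstrates how our gradient, or slope, changes as we adjust our coefficients, and vice versa: between the two iterations, theta moves in the direction of the gradient, and the gradient itself shrinks slightly (its first component drops from 0.395 to 0.378).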
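To see where repeated steps lead, the sketch below runs the same update for more iterations and compares the loss before and after. The log_loss helper and the choice of 200 iterations are assumptions made for illustration; the data, starting theta, and lam match the example.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def log_loss(theta, X, y):
    # Negative log-likelihood: the quantity the updates are reducing.
    p = sigmoid(X.dot(theta))
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))


X = np.random.RandomState(42).rand(5, 3).round(4)
y = np.array([1, 1, 0, 1, 0])
theta = np.zeros(3)
lam = 0.05

# With theta at zero every estimate is 0.5, so the loss starts at 5 * ln(2).
initial_loss = log_loss(theta, X, y)

for _ in range(200):  # assumed iteration count
    gradient = X.T.dot(y - sigmoid(X.dot(theta)))
    theta += lam * gradient

final_loss = log_loss(theta, X, y)
print("Initial loss: %.4f" % initial_loss)
print("Final loss: %.4f" % final_loss)
```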

In the next section, we will see how to evaluate our models and learn the cryptic train_test_split.
