官术网_书友最值得收藏!

Hill climbing and descent

We will go back to our example—the lost hill that we looked at. We want to find a way to select a set of theta parameters that is going to minimize our loss function, L. As we've already established, we need to climb or descend the hill, and understand where we are with respect to our neighboring points without having to compute everything. To do that, we need to be able to measure the slope of the curve with respect to the theta parameters. So, going back to our house example, as mentioned before, we want to know how much correct the incremental value of cost per square foot makes. Once we know that, we can start taking directional steps toward finding the best estimate. So, if you make a bad guess, you can turn around and go in exactly the other direction. So, we can either climb or descend the hill depending on our metric, which allows us to optimize the parameters of a function that we want to learn irrespective of how the function itself performs. This is a layer of abstraction. This optimization process is called gradient descent, and it supports many of the machine learning algorithms that we will discuss in this book.

The following code shows a simple example of how we can measure the gradient of a matrix with respect to theta. This example is actually a simplified snippet of the learning component of logistic regression:

import numpy as np

seed = (42)

X = np.random.RandomState(seed).rand(5, 3).round(4)

y = np.array([1, 1, 0, 1, 0])

h = (lambda X: 1. / (1. + np.exp(-X)))

theta = np.zeros(3)

lam = 0.05

def iteration(theta):

y_hat = h(X.dot(theta))

residuals = y - y_hat

gradient = X.T.dot(residuals)
theta += gradient * lam
print("y hat: %r" % y_hat.round(3).tolist())
print("Gradient: %r" % gradient.round(3).tolist())
print("New theta: %r\n" % theta.round(3).tolist())

iteration(theta)
iteration(theta)

At the very top, we randomly initialize X and y, which is not part of the algorithm. So, x here is the sigmoid function, also called the logistic function. The word logistic comes from logistic progression. This is a necessary transformation that is applied in logistic regression. Just understand that we have to apply that; it's part of the function. So, we initialize our theta vector, with respect to which we're going to compute our gradient as zeros. Again, all of them are zeros. Those are our parameters. Now, for each iteration, we're going to get our , which is our estimated y, if you recall. We get that by taking the dot product of our X matrix against our theta parameters, pushed through that logistic function, h, which is our .

Now, we want to compute the gradient of that dot product between the residuals and the input matrix, X, of our predictors. The way we compute our residuals is simply y minus , which gives the residuals. Now, we have our . How do we get the gradient? The gradient is just the dot product between the input matrix, X, and those residuals. We will use that gradient to determine which direction we need to step in. The way we do that is we add the gradient to our theta vector. Lambda regulates how quickly we step up or down that gradient. So, it's our learning rate. If you think of it as a step size—going back to that dark room example—if it's too large, it's easy to overstep the lowest point. But if it's too small, you're going to spend forever inching around the room. So, it's a bit of a balancing act, but it allows us to regulate the pace at which we update our theta values and descend our gradient. Again, this algorithm is something we will cover in the next chapter.

We get the output of the preceding code as follows:

y hat: [0.5, 0.5, 0.5, 0.5, 0.5]
Gradient: [0.395, 0.024, 0.538]
New theta: [0.02, 0.001, 0.027]

y hat: [0.507, 0.504, 0.505, 0.51, 0.505]
Gradient: [0.378, 0.012, 0.518]
New theta: [0.039, 0.002, 0.053]

This example demonstrates how our gradient or slope actually changes as we adjust our coefficients and vice versa.

In the next section, we will see how to evaluate our models and learn the cryptic train_test_split.

主站蜘蛛池模板: 隆尧县| 常州市| 安福县| 海原县| 中卫市| 惠安县| 历史| 独山县| 灵川县| 池州市| 大姚县| 赣榆县| 定结县| 上虞市| 上高县| 乃东县| 泾源县| 洛川县| 新晃| 民乐县| 康平县| 芜湖县| 益阳市| 南部县| 和硕县| 长兴县| 三门县| 鹤山市| 花垣县| 磐安县| 泗洪县| 承德市| 鄂托克前旗| 应城市| 普兰店市| 含山县| 旅游| 宜州市| 邳州市| 宜川县| 吴江市|