
A bidimensional example

Let's consider a small dataset built by adding some uniform noise to points lying on a segment bounded between -6 and 6. The original equation is y = x + 2 + n, where n is a noise term.
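The data-generation step isn't shown in this excerpt; a minimal sketch, assuming the names nb_samples, X, and Y used by the snippets below (the sample count and noise amplitude are assumptions, not values from the text), could be:

```python
import numpy as np

np.random.seed(1000)

# Hypothetical setup: sample points on [-6, 6] and add zero-mean uniform noise
# so that Y = X + 2 + n, matching the description in the text
nb_samples = 200
X = np.random.uniform(-6.0, 6.0, size=nb_samples)
Y = X + 2.0 + np.random.uniform(-1.5, 1.5, size=nb_samples)
```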

In the following figure, there's a plot of the dataset together with a candidate regression function.

As we're working on a plane, the regressor we're looking for is a function of only two parameters (the intercept and the slope):

ỹ = α + βx

In order to fit our model, we must find the best values of α and β, and to do that we choose an ordinary least squares approach. The loss function to minimize is:

L(α, β) = (1/2) Σᵢ (α + βxᵢ - yᵢ)²

With an analytic approach, in order to find the global minimum, we must impose both partial derivatives equal to zero:

∂L/∂α = 0 and ∂L/∂β = 0

So the loss function can be defined as follows (for simplicity, it accepts a single vector v containing both variables):

import numpy as np

def loss(v):
    # v[0] is the intercept (alpha), v[1] is the slope (beta)
    e = 0.0
    for i in range(nb_samples):
        e += np.square(v[0] + v[1]*X[i] - Y[i])
    return 0.5 * e

And the gradient can be defined as:

def gradient(v):
    g = np.zeros(shape=2)
    for i in range(nb_samples):
        # Partial derivatives with respect to the intercept and the slope
        g[0] += (v[0] + v[1]*X[i] - Y[i])
        g[1] += ((v[0] + v[1]*X[i] - Y[i]) * X[i])
    return g
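The two loops can also be expressed in vectorized NumPy form, which is equivalent but faster on large datasets. A sketch, assuming X and Y are NumPy arrays (the names loss_v and gradient_v are mine, not from the text):

```python
import numpy as np

# Hypothetical dataset matching the description in the text
nb_samples = 200
X = np.random.uniform(-6.0, 6.0, size=nb_samples)
Y = X + 2.0 + np.random.uniform(-1.5, 1.5, size=nb_samples)

def loss_v(v):
    # All residuals at once: r_i = v[0] + v[1]*x_i - y_i
    r = v[0] + v[1] * X - Y
    return 0.5 * np.sum(r ** 2)

def gradient_v(v):
    r = v[0] + v[1] * X - Y
    # Gradient components w.r.t. the intercept and the slope
    return np.array([np.sum(r), np.sum(r * X)])
```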

The optimization problem can now be solved using SciPy:

from scipy.optimize import minimize

>>> minimize(fun=loss, x0=[0.0, 0.0], jac=gradient, method='L-BFGS-B')
fun: 9.7283268345966025
hess_inv: <2x2 LbfgsInvHessProduct with dtype=float64>
jac: array([ 7.28577538e-06, -2.35647522e-05])
message: 'CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH'
nfev: 8
nit: 7
status: 0
success: True
x: array([ 2.00497209, 1.00822552])

As expected, the regression denoised our dataset, recovering the original equation: y = x + 2.
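Since the problem is linear least squares, the same coefficients can also be recovered in closed form through the normal equations, for example with np.linalg.lstsq. A sketch, using a hypothetical dataset built as described above:

```python
import numpy as np

np.random.seed(1000)

# Hypothetical dataset matching the description in the text
nb_samples = 200
X = np.random.uniform(-6.0, 6.0, size=nb_samples)
Y = X + 2.0 + np.random.uniform(-1.5, 1.5, size=nb_samples)

# Design matrix with a column of ones for the intercept
A = np.vstack([np.ones(nb_samples), X]).T
coeffs, _, _, _ = np.linalg.lstsq(A, Y, rcond=None)
print(coeffs)  # close to [2.0, 1.0]
```

This avoids iterative optimization altogether; L-BFGS-B remains useful when the loss has no closed-form minimizer.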
