
Skews

Now let's look at how the data for the house prices are distributed:

// plotHist plots a normalized histogram of the values in a, using 10 bins.
func plotHist(a []float64) (*plot.Plot, error) {
	h, err := plotter.NewHist(plotter.Values(a), 10)
	if err != nil {
		return nil, err
	}
	p, err := plot.New()
	if err != nil {
		return nil, err
	}

	h.Normalize(1)
	p.Add(h)
	return p, nil
}

This section is added to the main function:

hist, err := plotHist(YsBack)
mHandleErr(err)
hist.Title.Text = "Histogram of House Prices"
mHandleErr(hist.Save(25*vg.Centimeter, 25*vg.Centimeter, "hist.png"))

This produces the following diagram:

Histogram of House Prices

As can be noted, the histogram of the prices is a little skewed. Fortunately, we can fix that by applying a function that adds 1 to the value and then takes the natural logarithm of the result, that is, log(1 + x). The standard library provides a function for this: math.Log1p. So, we add the following to our main function:

for i := range YsBack {
	YsBack[i] = math.Log1p(YsBack[i])
}
hist2, err := plotHist(YsBack)
mHandleErr(err)
hist2.Title.Text = "Histogram of House Prices (Processed)"
mHandleErr(hist2.Save(25*vg.Centimeter, 25*vg.Centimeter, "hist2.png"))

This produces the following diagram:

Histogram of House Prices (Processed)

Ahh! This looks better. We did this for all the Ys. What about the Xs? To do that, we will have to iterate through each column of Xs, find out whether it is skewed, and if it is, apply the transformation function.

This is what we add to the main function:

it, err := native.MatrixF64(Xs)
mHandleErr(err)
for i, isCat := range datahints {
	if isCat {
		continue
	}
	skewness := skew(it, i)
	if skewness > 0.75 {
		log1pCol(it, i)
	}
}

native.MatrixF64 takes a *tensor.Dense and converts it into a native Go slice of slices ([][]float64). The underlying backing data is shared rather than copied, so if one were to write it[0][0] = 1000, the actual matrix itself would change too. This allows us to perform transformations without additional allocations. For this project, it may not be as important; however, for larger projects, this will come in very handy.
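The sharing described above can be illustrated with a minimal sketch that uses only the standard library. It is not how the native package is implemented; rowsView is a hypothetical helper, introduced here only to show that slicing a flat backing array into rows copies no data, so a write through the rows is visible in the backing array.

```go
package main

import "fmt"

// rowsView is a hypothetical helper (not part of any library): it slices a
// flat, row-major backing array into rows without copying any data.
func rowsView(backing []float64, rows, cols int) [][]float64 {
	view := make([][]float64, rows)
	for r := range view {
		view[r] = backing[r*cols : (r+1)*cols]
	}
	return view
}

func main() {
	backing := []float64{1, 2, 3, 4, 5, 6} // a 2x3 matrix in row-major order
	it := rowsView(backing, 2, 3)

	it[0][0] = 1000         // write through the view
	fmt.Println(backing[0]) // the backing array sees the write
}
```

The only allocation is the outer slice of row headers; the float64 values themselves are never duplicated.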

This also allows us to write the functions to check and mutate the matrix:

// skew returns the skewness of a column/variable
func skew(it [][]float64, col int) float64 {
	// collect the values of the requested column, one per row
	a := make([]float64, 0, len(it))
	for _, row := range it {
		a = append(a, row[col])
	}
	return stat.Skew(a, nil)
}

// log1pCol applies the log1p transformation on a column
func log1pCol(it [][]float64, col int) {
	for i := range it {
		it[i][col] = math.Log1p(it[i][col])
	}
}