
Skews

Now let's look at how the data for the house prices are distributed:

// plotHist returns a normalized 10-bin histogram of the values in a.
func plotHist(a []float64) (*plot.Plot, error) {
	h, err := plotter.NewHist(plotter.Values(a), 10)
	if err != nil {
		return nil, err
	}
	p, err := plot.New()
	if err != nil {
		return nil, err
	}

	h.Normalize(1)
	p.Add(h)
	return p, nil
}

This section is added to the main function:

hist, err := plotHist(YsBack)
mHandleErr(err)
hist.Title.Text = "Histogram of House Prices"
mHandleErr(hist.Save(25*vg.Centimeter, 25*vg.Centimeter, "hist.png"))

This produces the following diagram:

Histogram of House prices

As can be noted, the histogram of the prices is a little skewed. Fortunately, we can fix that by applying a function that adds 1 to the value and then takes its natural logarithm. The standard library provides a function for this: math.Log1p. So, we add the following to our main function:

for i := range YsBack {
	YsBack[i] = math.Log1p(YsBack[i])
}
hist2, err := plotHist(YsBack)
mHandleErr(err)
hist2.Title.Text = "Histogram of House Prices (Processed)"
mHandleErr(hist2.Save(25*vg.Centimeter, 25*vg.Centimeter, "hist2.png"))

This produces the following diagram:

Histogram of House Prices (Processed)

Ahh! This looks better. We did this for all the Ys. What about the Xs? To handle them, we will have to iterate through each column of Xs, find out whether it is skewed, and if it is, apply the same transformation function.
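Before applying the same transformation to the Xs, it may help to see what math.Log1p does to individual values. The following is a minimal sketch, not part of the project code: the helper names toLog and fromLog and the sample prices are made up for illustration. It shows that a value of 0 stays at 0 (a plain logarithm would give negative infinity), and that math.Expm1 maps a transformed value back to the original scale.

```go
package main

import (
	"fmt"
	"math"
)

// toLog is a hypothetical helper: it computes ln(1 + price).
func toLog(price float64) float64 { return math.Log1p(price) }

// fromLog is a hypothetical helper: it computes e^v - 1, the inverse of toLog.
func fromLog(v float64) float64 { return math.Expm1(v) }

func main() {
	// Illustrative prices only; these are not taken from the data set.
	for _, p := range []float64{0, 50000, 200000, 750000} {
		l := toLog(p)
		fmt.Printf("price=%8.0f  log1p=%.4f  back=%.0f\n", p, l, fromLog(l))
	}
}
```

Large prices are pulled much closer together on the log scale, which is what shortens the long right tail of the histogram.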

This is what we add to the main function:

it, err := native.MatrixF64(Xs)
mHandleErr(err)
for i, isCat := range datahints {
	if isCat {
		continue
	}
	skewness := skew(it, i)
	if skewness > 0.75 {
		log1pCol(it, i)
	}
}

native.MatrixF64 takes a *tensor.Dense and converts it into a native Go slice of slices ([][]float64). The underlying backing data doesn't change, so if one were to write it[0][0] = 1000, the actual matrix itself would change too. This allows us to perform transformations without copying the data. For this project, it may not be as important; however, for larger projects, this will come in very handy.
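The sharing behaviour described above can be illustrated with the standard library alone. The sketch below is not how the native package is implemented; rowsView is a hypothetical helper that assumes a flat, row-major backing array and cuts it into row slices that point at the same memory.

```go
package main

import "fmt"

// rowsView is a hypothetical helper: it slices a flat, row-major backing
// array into rows. Only the slice headers are allocated; no values are copied.
func rowsView(data []float64, rows, cols int) [][]float64 {
	view := make([][]float64, rows)
	for r := range view {
		view[r] = data[r*cols : (r+1)*cols]
	}
	return view
}

func main() {
	backing := []float64{1, 2, 3, 4, 5, 6} // a 2x3 matrix, row-major
	it := rowsView(backing, 2, 3)

	it[0][0] = 1000 // write through the view

	// The backing array sees the write because both share the same memory.
	fmt.Println(backing[0] == 1000)
}
```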

This also allows us to write the functions to check and mutate the matrix:

// skew returns the skewness of a column/variable
func skew(it [][]float64, col int) float64 {
	// collect only the values of the requested column, one per row
	a := make([]float64, 0, len(it))
	for _, row := range it {
		a = append(a, row[col])
	}
	return stat.Skew(a, nil)
}

// log1pCol applies the log1p transformation on a column
func log1pCol(it [][]float64, col int) {
	for i := range it {
		it[i][col] = math.Log1p(it[i][col])
	}
}
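To see the check-then-transform idea on a single column without any third-party packages, here is a minimal sketch. It assumes the plain population skewness m3 / m2^1.5; stat.Skew applies a sample-size correction, so its numbers may differ somewhat. The names skewness and log1pAll and the sample column are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// skewness returns the population skewness m3 / m2^1.5 of xs.
// It assumes xs holds at least two distinct values.
func skewness(xs []float64) float64 {
	n := float64(len(xs))
	var mean float64
	for _, x := range xs {
		mean += x
	}
	mean /= n

	var m2, m3 float64 // second and third central moments
	for _, x := range xs {
		d := x - mean
		m2 += d * d
		m3 += d * d * d
	}
	m2 /= n
	m3 /= n
	return m3 / math.Pow(m2, 1.5)
}

// log1pAll returns a copy of xs with math.Log1p applied to every value.
func log1pAll(xs []float64) []float64 {
	out := make([]float64, len(xs))
	for i, x := range xs {
		out[i] = math.Log1p(x)
	}
	return out
}

func main() {
	// A made-up, strongly right-skewed column.
	col := []float64{1, 2, 4, 8, 16, 32, 64, 128, 256, 512}
	before := skewness(col)
	after := skewness(log1pAll(col))
	fmt.Printf("before=%.2f after=%.2f\n", before, after)
}
```

On this column the skewness starts above the 0.75 threshold used in the main function and drops below it after the transformation.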