The conditional expectation functions

Instead, let's do what we originally set out to do: explore the CEFs of the variables. Fortunately, we already have the necessary data structures (in other words, the index), so writing the function to find the CEF is relatively easy.

The following is the code block:

// CEF returns, for each distinct value of the variable in column col,
// the mean of the Ys whose rows hold that value.
func CEF(Ys []float64, col int, index []map[string][]int) map[string]float64 {
	retVal := make(map[string]float64)
	for k, v := range index[col] {
		var mean float64
		for _, i := range v {
			mean += Ys[i]
		}
		mean /= float64(len(v))
		retVal[k] = mean
	}
	return retVal
}

This function finds the conditionally expected house price when a variable is held fixed at each of its values. We could explore all the variables, but for the purposes of this chapter, I shall share the exploration of only one, the YearBuilt variable, as an example.
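To make the computation concrete, here is a minimal sketch that runs CEF on made-up data. The prices and the single-column index are hypothetical and are not taken from the housing dataset. CEF itself is repeated from the text only so that the example compiles on its own.

```go
package main

import "fmt"

// CEF is reproduced from the text so that this sketch is self-contained.
func CEF(Ys []float64, col int, index []map[string][]int) map[string]float64 {
	retVal := make(map[string]float64)
	for k, v := range index[col] {
		var mean float64
		for _, i := range v {
			mean += Ys[i]
		}
		mean /= float64(len(v))
		retVal[k] = mean
	}
	return retVal
}

func main() {
	// Hypothetical toy data: four house prices and one indexed column.
	// Rows 0 and 1 hold the value "1950"; rows 2 and 3 hold "2000".
	ys := []float64{100, 200, 300, 500}
	index := []map[string][]int{
		{"1950": {0, 1}, "2000": {2, 3}},
	}

	cef := CEF(ys, 0, index)
	fmt.Println(cef["1950"], cef["2000"]) // prints: 150 400
}
```

Each entry of the result is simply the mean price over the rows that share a value: (100+200)/2 for "1950" and (300+500)/2 for "2000".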

Now, YearBuilt is an interesting variable to dive into. It is effectively a categorical variable (a value such as 1950.5 makes no sense), but it is also fully orderable (1945 is smaller than 1950). There are also many distinct values of YearBuilt, so instead of printing the CEF, we shall plot it with the following function:

// plotCEF plots the CEF. This is a simple plot with only the CEF.
// More advanced plots can also be drawn to expose more nuance in the data.
// It needs sort, strconv, and gonum.org/v1/plot with its plotter and
// plotutil subpackages. Note: in newer gonum/plot releases, plot.New
// returns only *plot.Plot, so the error check on it would be dropped.
func plotCEF(m map[string]float64) (*plot.Plot, error) {
	// collect the keys and sort them (lexically) for a stable order
	ordered := make([]string, 0, len(m))
	for k := range m {
		ordered = append(ordered, k)
	}
	sort.Strings(ordered)

	p, err := plot.New()
	if err != nil {
		return nil, err
	}

	points := make(plotter.XYs, len(ordered))
	for i, val := range ordered {
		// if val can be converted into a float, we'll use it
		// otherwise, we'll stick with using the index
		points[i].X = float64(i)
		if x, err := strconv.ParseFloat(val, 64); err == nil {
			points[i].X = x
		}

		points[i].Y = m[val]
	}
	if err := plotutil.AddLinePoints(p, "CEF", points); err != nil {
		return nil, err
	}
	return p, nil
}
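One detail worth keeping in mind: sort.Strings orders keys lexically. That agrees with numeric order for four-digit years such as YearBuilt, but not in general ("10" sorts before "9"), so a line plot over other numeric variables may connect points out of order. The following is a minimal sketch of one way to order the keys numerically. The sortedKeys helper is hypothetical and not part of the book's code.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
)

// sortedKeys is a hypothetical helper (not from the book). It returns the
// keys of m in numeric order when every key parses as a float, and in
// lexical order otherwise.
func sortedKeys(m map[string]float64) []string {
	keys := make([]string, 0, len(m))
	nums := make(map[string]float64, len(m))
	numeric := true
	for k := range m {
		keys = append(keys, k)
		x, err := strconv.ParseFloat(k, 64)
		if err != nil {
			numeric = false
			continue
		}
		nums[k] = x
	}
	if numeric {
		// compare by parsed value rather than by string
		sort.Slice(keys, func(i, j int) bool {
			return nums[keys[i]] < nums[keys[j]]
		})
	} else {
		sort.Strings(keys)
	}
	return keys
}

func main() {
	m := map[string]float64{"9": 1, "10": 2, "100": 3}
	fmt.Println(sortedKeys(m)) // prints: [9 10 100]
}
```

Under this assumption, the slice returned by sortedKeys could stand in for the ordered slice inside plotCEF. Falling back to lexical order keeps the behavior unchanged for variables that are genuinely non-numeric.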

Our ever-growing main function now has this appended to it:

ofInterest := 19 // variable of interest is in column 19
cef := CEF(YsBack, ofInterest, indices)
plt, err := plotCEF(cef)
mHandleErr(err)
plt.Title.Text = fmt.Sprintf("CEF for %v", hdr[ofInterest])
plt.X.Label.Text = hdr[ofInterest]
plt.Y.Label.Text = "Conditionally Expected House Price"
mHandleErr(plt.Save(25*vg.Centimeter, 25*vg.Centimeter, "CEF.png"))

Running the program yields the following chart:

Conditional expectation function for YearBuilt

Upon inspecting the chart, I must confess that I was a little surprised. I'm not particularly familiar with real estate, but my initial instinct was that older houses would cost more. Houses, in my mind, age like fine wine: the older the house, the more expensive it would be. Clearly, this is not the case. Oh well, live and learn.

The CEF exploration should be done for as many variables as possible. I am omitting the rest merely for the sake of brevity in this book.
