
Analyzing standard deviation and variance on a histogram

Let's write some code here and play with standard deviation and variance. If you pull up the StdDevVariance.ipynb IPython Notebook file, you can follow along with me here. Please do, because there's an activity at the end that I want you to try. What we're going to do here is just like the previous example, so begin with the following code:

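%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt

# 10,000 samples from a normal distribution: mean 100.0, standard deviation 20.0
incomes = np.random.normal(100.0, 20.0, 10000)

# Plot the data as a histogram with 50 bins
plt.hist(incomes, 50)
plt.show()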

We use matplotlib to plot a histogram of some normally distributed random data, and we call it incomes. We're saying it's going to be centered around 100 (hopefully that's an hourly rate or something and not annual, or some weird denomination), with a standard deviation of 20 and 10,000 data points.

Let's go ahead and generate that by executing the histogram code above and plotting it as shown in the following graph:

We have 10,000 data points centered around 100. With a normal distribution and a standard deviation of 20, which is a measure of the spread of this data, you can see that the most common occurrence is around 100, and as we get further and further from that, things become less and less likely. One standard deviation of 20 on either side of the mean puts us at around 80 and around 120. You can see in the histogram that this is the point where things start to fall off sharply, so we can say that things beyond that standard deviation boundary are unusual.
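If you want to check those numbers rather than eyeball them on the histogram, here is a minimal sketch. It assumes the same np.random.normal call as above; the fixed seed and the variable names std_dev, variance, and fraction_within are my own additions for illustration, not part of the notebook.

```python
import numpy as np

# Assumption: a fixed seed so the run is repeatable; the notebook does not set one
np.random.seed(0)
incomes = np.random.normal(100.0, 20.0, 10000)

mean = incomes.mean()
std_dev = incomes.std()    # standard deviation of the sample
variance = incomes.var()   # variance is the standard deviation squared

# Fraction of data points within one standard deviation of the mean
within_one_sd = np.abs(incomes - mean) <= std_dev
fraction_within = within_one_sd.mean()

print("mean:", mean)
print("standard deviation:", std_dev)
print("variance:", variance)
print("fraction within one standard deviation:", fraction_within)
```

The standard deviation should come out close to the 20 we asked for, and the variance close to 400. For normally distributed data, roughly 68 percent of the points land between 80 and 120, so the remaining third or so sit beyond that boundary.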
