- Hands-On Data Science and Python Machine Learning
- Frank Kane
- 2021-07-15 17:15:11
Population variance versus sample variance
There is a little nuance to standard deviation and variance, and that's when you're talking about population versus sample variance. If you're working with a complete set of data, a complete set of observations, then you do exactly what I told you. You just take the average of all the squared deviations from the mean and that's your variance.
However, if you're sampling your data, that is, if you're taking a subset of the data just to make computing easier, you have to do something a little bit different. Instead of dividing by the number of data points in the sample, n, you divide by n - 1. Let's look at an example.
We'll use the sample data we were just studying for people standing in a line. We took the sum of the squared deviations from the mean and divided by 5, that is, the number of data points that we had, to get the population variance of 5.04.
σ² = (11.56 + 0.16 + 0.36 + 0.16 + 12.96) / 5 = 5.04
If we were to look at the sample variance, which is designated by S², it is found by dividing the same sum of squared deviations by 4, that is, (n - 1). This gives us the sample variance, which comes out to 6.3.
S² = (11.56 + 0.16 + 0.36 + 0.16 + 12.96) / 4 = 6.3
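The two results can be checked with a few lines of Python. This is only a minimal sketch: the raw values in data are an assumption (they are not given in this section), chosen because their mean is 4.4 and they reproduce the squared deviations listed above. The variable names are illustrative. The standard library's statistics.pvariance and statistics.variance are used as a cross-check on the manual calculation.

```python
import statistics

# Assumed raw data: not stated in this section, but these values give
# the squared deviations 11.56, 0.16, 0.36, 0.16 and 12.96 (mean = 4.4).
data = [1, 4, 5, 4, 8]

n = len(data)
mean = sum(data) / n

# Squared deviation of each data point from the mean
squared_deviations = [(x - mean) ** 2 for x in data]
sum_sq = sum(squared_deviations)

# Population variance: divide by the number of data points, n
population_variance = sum_sq / n

# Sample variance: divide by n - 1 instead
sample_variance = sum_sq / (n - 1)

print("population variance:", round(population_variance, 2))
print("sample variance:", round(sample_variance, 2))

# Cross-check with the standard library
print("statistics.pvariance:", statistics.pvariance(data))
print("statistics.variance:", statistics.variance(data))
```

The only difference between the two calculations is the divisor, so the sample variance is always the larger of the two for the same data.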
So again, if this was some sort of sample that we took from a larger dataset, you would divide by n - 1. If it was a complete dataset, you would divide by the actual number of data points, n. Okay, that's how we calculate population and sample variance, but what's the actual logic behind it?