- Hands-On Data Science and Python Machine Learning
- Frank Kane
- 2021-07-15 17:15:11
Identifying outliers with standard deviation
Here's a histogram of the actual data we were looking at in the preceding example for calculating variance.

Now we see that the number 4 occurred twice in our dataset, and then we had one 1, one 5, and one 8.
The standard deviation is commonly used to identify outliers in a dataset. A value within one standard deviation of the mean, which is 4.4 here, is considered fairly typical for a normal distribution. However, you can see in the preceding diagram that the numbers 1 and 8 lie outside that range. If I take 4.4 plus or minus 2.24, I end up with a range of about 2.16 to 6.64 (roughly 2 to 7), and 1 and 8 both fall outside it. So we can say mathematically that 1 and 8 are outliers; we don't have to guess and eyeball it. There is still a judgment call, though, as to how many standard deviations from the mean a data point must be before you consider it an outlier.
So that's something you'll see standard deviation used for in the real world.
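The check described above can be sketched in a few lines of Python. This is only an illustration, not code from the book: the helper name `find_outliers` and its `num_std` parameter are made up for the example, the order of the values in `data` is assumed (the text only says the dataset holds one 1, two 4s, one 5, and one 8), and the population standard deviation is used because it reproduces the 2.24 quoted above.

```python
import statistics

def find_outliers(values, num_std=1.0):
    """Return values lying more than num_std population standard
    deviations away from the mean (hypothetical helper)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    lower = mean - num_std * std
    upper = mean + num_std * std
    return [v for v in values if v < lower or v > upper]

# The five values from the histogram; their order is an assumption.
data = [1, 4, 5, 4, 8]

mean = statistics.fmean(data)
std = statistics.pstdev(data)
print(f"mean = {mean:.2f}, standard deviation = {std:.2f}")

# One standard deviation is the threshold used in the text.
print(find_outliers(data))               # [1, 8]

# A stricter threshold of two standard deviations flags nothing here.
print(find_outliers(data, num_std=2.0))  # []
```

Changing `num_std` is where the judgment call shows up: with a threshold of one standard deviation, 1 and 8 are flagged, while with two standard deviations nothing in this small dataset counts as an outlier.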