- Python Data Mining Quick Start Guide
- Nathan Greeneltch
- 2021-06-24 15:19:47
Basic summary statistics
Practitioners of descriptive analytics rely on a set of four summary statistics to quickly understand a dataset. With practice, you will strengthen your intuition about each of these measurements, and computing them is a good first step for most problems that you will face. The four summary statistics are described as follows:
- Location: The location, or center, of the data; this can be measured by the mean (average), median, or mode. The median is the value that splits the sorted data in half, with 50% of the points on either side, and the mode is the most frequently occurring value, which corresponds to the peak of the distribution.
- Spread: How widely the data is spread around the center; this can be measured with the variance, which is the average squared distance of each data point from the mean, or with the standard deviation, which is the square root of the variance and is therefore expressed in the same units as the data.
- Shape: A description of how the bulk of the distribution sits in relation to the mean. This is usually expressed as the skew direction. With negative skew, the long tail points toward lower values and the mean is typically pulled below the median. You can refer to the following diagram for a negative skew example. In the case of positive skew, the tail simply points in the opposite direction.
- Correlation: A measurement of the dependency of one variable on another. The most common measure is the Pearson correlation coefficient, usually denoted "r", which ranges from -1 (a full negative correlation) to +1 (a full positive correlation). A value of 0 signifies no linear correlation.
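The four statistics above can be computed in a few lines. What follows is a minimal sketch that uses only Python's standard library, not the book's own code. The toy data (`scores`, `hours`, `marks`) and the helpers `skewness` and `pearson_r` are hypothetical names introduced for illustration, and the spread and skew values use the population (divide by n) definitions, which is an assumption; sample-based estimates divide by n - 1 instead.

```python
import math
import statistics


def skewness(values):
    """Population skewness: the mean of the cubed z-scores (illustrative helper)."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return sum(((v - mu) / sigma) ** 3 for v in values) / len(values)


def pearson_r(xs, ys):
    """Pearson correlation coefficient r between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical toy data: one low outlier gives `scores` a negative skew.
scores = [2, 7, 8, 8, 9, 9, 9, 10]
hours = [1, 2, 3, 4, 5]
marks = [52, 58, 61, 70, 74]

summary = {
    # Location
    "mean": statistics.fmean(scores),
    "median": statistics.median(scores),
    "mode": statistics.mode(scores),
    # Spread (population definitions)
    "variance": statistics.pvariance(scores),
    "std": statistics.pstdev(scores),
    # Shape
    "skew": skewness(scores),
    # Correlation between two different variables
    "r": pearson_r(hours, marks),
}

for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

In this toy data the mean (7.75) falls below the median (8.5), which is the pattern to expect with a negative skew, and the standard deviation is the square root of the variance. In practice, pandas offers equivalent methods such as `describe`, `skew`, and `corr` on a DataFrame, although their defaults use sample-based rather than population-based estimates.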
Take a look at the following diagram for a visualization of the points described in this section:
