Basic summary statistics

Practitioners in the field of descriptive analytics use a set of four summary statistics to quickly understand a dataset. With practice, you should be able to strengthen your intuition about each of these statistical measurements. Computing them is a good first step for most problem statements that you will face. The four summary statistics are described as follows:

  • Location: The location or center of the data; this can be measured by the mean (average), median, or mode. The median is the middle value, with 50% of the data falling below it and 50% above it, and the mode is the most frequently occurring value, that is, the peak of the distribution.
  • Spread: How the data is spread around the center; this can be measured with variance, which is the average squared distance of each data point from the mean, or standard deviation, which is the square root of the variance.
  • Shape: A description of how symmetric the distribution is around its center. This is usually expressed as the skew direction: with negative skew, the longer tail points to the left and the mean typically falls below the median. You can refer to the following diagram for a negative skew example. In the case of positive skew, the tail simply points in the opposite direction.
  • Correlation: A measure of how strongly one variable depends on another. The most common measure is the Pearson correlation coefficient, usually denoted "r", which captures linear association and ranges from -1 (a full negative correlation) to +1 (a full positive correlation). A value of 0 signifies no linear correlation.
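
The four statistics above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a definitive implementation: the function name `summarize`, the sample lists `xs` and `ys`, and the choice of population (rather than sample) formulas are all assumptions made for the example.

```python
import math
import statistics


def summarize(xs, ys):
    """Return location, spread, shape, and correlation for two equal-length lists.

    Hypothetical helper for illustration; uses population formulas (divide by n).
    """
    n = len(xs)

    # Location: mean, median, and mode of xs
    mean_x = statistics.mean(xs)
    median_x = statistics.median(xs)
    mode_x = statistics.mode(xs)

    # Spread: variance is the average squared distance from the mean,
    # and standard deviation is its square root
    var_x = statistics.pvariance(xs)
    std_x = math.sqrt(var_x)

    # Shape: skewness as the third standardized moment
    # (negative means a longer left tail, positive a longer right tail)
    skew_x = sum((x - mean_x) ** 3 for x in xs) / (n * std_x ** 3)

    # Correlation: Pearson r = covariance / (std_x * std_y)
    mean_y = statistics.mean(ys)
    std_y = statistics.pstdev(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    r = cov / (std_x * std_y)

    return {
        "mean": mean_x,
        "median": median_x,
        "mode": mode_x,
        "variance": var_x,
        "std": std_x,
        "skew": skew_x,
        "r": r,
    }


# Made-up sample data: ys is an exact linear function of xs
xs = [2, 4, 4, 4, 5, 5, 7, 9]
ys = [2 * x + 1 for x in xs]

summary = summarize(xs, ys)
for name, value in summary.items():
    print(f"{name}: {value}")
```

For this sample, the mean is 5, the median is 4.5, the mode is 4, and the standard deviation is 2. The skew is positive (about 0.66) because the values 7 and 9 stretch the right tail, and r is 1 because `ys` is a perfect increasing linear function of `xs`.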

Take a look at the following diagram for a visualization of the points described in this section:
