The statistical approach versus the machine learning approach

In 2001, Leo Breiman published a paper titled Statistical Modeling: The Two Cultures (http://projecteuclid.org/euclid.ss/1009213726) that contrasted two approaches: the statistical approach, which focuses on validating and explaining the underlying process that generated the data, and the machine learning approach, which is more concerned with the results.

Roughly put, a classic statistical analysis follows steps such as the following:

  1. A hypothesis called the null hypothesis is stated. This null hypothesis usually states that the observation is due to randomness.
  2. The p-value is then calculated: the probability, assuming the null hypothesis is true, of obtaining a result at least as extreme as the one observed.
  3. If that probability is below a certain threshold (usually 0.05), then the null hypothesis is rejected, which means that the observation is unlikely to be a random fluke.

p > 0.05 does not imply that the null hypothesis is true. It only means that you cannot reject it, because the probability of the observation happening by chance is not small enough.
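The three steps above can be sketched with a small coin-flip example. This is only an illustration under stated assumptions: the null hypothesis is that the coin is fair, the function name two_sided_p_value and the sample counts are hypothetical, and the p-value is computed exactly from the binomial distribution using only the Python standard library.

```python
from math import comb


def two_sided_p_value(heads: int, flips: int) -> float:
    """Exact two-sided p-value under the null hypothesis of a fair coin.

    Sums the probability of every outcome that lies at least as far
    from the expected number of heads as the observed outcome does.
    """
    expected = flips / 2
    observed_gap = abs(heads - expected)
    # Under the null hypothesis, each of the 2**flips sequences is equally likely.
    extreme_sequences = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(k - expected) >= observed_gap
    )
    return extreme_sequences / 2 ** flips


ALPHA = 0.05  # the usual rejection threshold

# Hypothetical observations: number of heads in 100 flips.
for heads in (61, 55):
    p = two_sided_p_value(heads, 100)
    decision = "reject" if p < ALPHA else "cannot reject"
    print(f"{heads} heads: p = {p:.4f}, {decision} the null hypothesis")
```

Note that the second case does not show the coin is fair. It only shows that 55 heads out of 100 is not surprising enough to reject that assumption.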

This methodology is geared toward explaining the phenomenon and discovering the factors that influence it. The goal is to build a somewhat static and fully known model that fits the observations as well as possible and can therefore predict future patterns, behaviors, and observations.

In the machine learning approach, as used in predictive analytics, an explicit representation of the model is not the focus. The goal is to build the best model for prediction, and the model builds itself from the observations. The internals of the model are not explicit, which is why such a model is often called a black box.

By removing the need to model the data explicitly, the ML approach has greater potential for prediction. ML focuses on making the most accurate predictions possible by minimizing a model's prediction error, at the expense of explainability.
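To make the idea of minimizing prediction error concrete, here is a minimal sketch, not a method taken from Breiman's paper. It assumes synthetic data (a noisy sine curve), a simple k-nearest-neighbors predictor, and hypothetical names such as knn_predict and candidate_ks. No formula for the data is ever written down by the model; the only criterion for choosing k is the error measured on held-out observations.

```python
import math
import random


def knn_predict(train, x, k):
    """Predict y at x as the mean y of the k training points nearest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def mean_squared_error(train, holdout, k):
    """Average squared prediction error on observations not used for fitting."""
    errors = [(knn_predict(train, x, k) - y) ** 2 for x, y in holdout]
    return sum(errors) / len(errors)


# Synthetic observations (an assumption for illustration): a noisy sine curve.
rng = random.Random(0)
data = []
for _ in range(200):
    x = rng.uniform(0, 6)
    data.append((x, math.sin(x) + rng.gauss(0, 0.2)))

train, holdout = data[:150], data[150:]

# Keep whichever setting predicts the held-out data best.
candidate_ks = [1, 3, 5, 10, 25, 50]
scores = {k: mean_squared_error(train, holdout, k) for k in candidate_ks}
best_k = min(scores, key=scores.get)
print(f"best k = {best_k}, held-out MSE = {scores[best_k]:.4f}")
```

The selected model may predict well, but it offers no explicit description of how x influences y, which is the trade-off against explainability described above.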
