
Chapter 4. Unsupervised Learning

Labeling a set of observations for classification or regression can be a daunting task, especially when the feature set is large. In some cases, labeled observations are either unavailable or impossible to create. In an attempt to extract hidden associations or structures from observations, the data scientist relies on unsupervised learning techniques to detect patterns or similarities in data.

The goal of unsupervised learning is to discover regularities and irregularities in a set of observations. These techniques are also applied to reduce the solution or feature space.

There are numerous unsupervised algorithms; some are better suited to handling dependent features, while others generate affinity groups in the case of hidden features [4:1]. In this chapter, you will learn three of the most common unsupervised learning algorithms:

  • K-means: Clustering observed features
  • Expectation-Maximization (EM): Clustering observed and latent features
  • Function approximation
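To make the clustering idea concrete before the full treatment later in the chapter, here is a minimal, illustrative K-means sketch in Scala for one-dimensional observations. It is not the book's implementation; the object name `KMeansSketch` and the parameters `k` and `maxIters` are hypothetical, and real data would be multi-dimensional.

```scala
// Minimal K-means sketch (illustrative only, not the chapter's full
// implementation). Points are one-dimensional Doubles for brevity.
object KMeansSketch {
  // Alternate between assigning points to their nearest centroid and
  // recomputing each centroid as the mean of its assigned points,
  // until the centroids stop moving or maxIters is reached.
  def cluster(points: Vector[Double], k: Int, maxIters: Int = 100): Vector[Double] = {
    require(k > 0 && points.distinct.size >= k, "need at least k distinct points")

    var centroids = points.distinct.take(k)   // naive initialization
    var converged = false
    var iter = 0

    while (!converged && iter < maxIters) {
      // Assignment step: group points by index of their closest centroid
      val groups = points.groupBy { p =>
        centroids.indices.minBy(i => math.abs(p - centroids(i)))
      }
      // Update step: move each centroid to the mean of its group
      val updated = centroids.indices.map { i =>
        groups.get(i).map(g => g.sum / g.size).getOrElse(centroids(i))
      }.toVector

      converged = updated == centroids
      centroids = updated
      iter += 1
    }
    centroids.sorted
  }
}
```

Applied to two well-separated groups such as `Vector(1.0, 1.1, 0.9, 10.0, 10.2, 9.8)` with `k = 2`, the sketch converges to centroids near 1.0 and 10.0 within a few iterations.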

Any of these algorithms can be applied to technical analysis or fundamental analysis. The fundamental analysis of financial ratios and the technical analysis of price movements are described in the Technical analysis section under Finances 101 in the Appendix. The K-means algorithm is fully implemented in Scala, while the EM and principal components analysis implementations leverage the Apache Commons Math library.

The chapter concludes with a brief overview of dimension reduction techniques for non-linear models.
