Data analysis packages for Scala

By data analysis packages, we mean software designed for analyzing data in some way. A simple statistical regression would be an example. Software implementing machine-learning algorithms would be another example.
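To make the regression example concrete, here is a minimal sketch of an ordinary least squares fit of a straight line, written in plain Scala with no external libraries. The object name `SimpleRegression` and the sample data are illustrative assumptions, not part of any package discussed in this section.

```scala
// Hypothetical helper for illustration: fits y = intercept + slope * x
// by ordinary least squares, using only the Scala standard library.
object SimpleRegression {

  /** Returns (intercept, slope) for the least squares line through the points. */
  def fit(xs: Seq[Double], ys: Seq[Double]): (Double, Double) = {
    require(xs.length == ys.length, "xs and ys must have the same length")
    require(xs.length >= 2, "at least two points are needed")

    val n = xs.length
    val meanX = xs.sum / n
    val meanY = ys.sum / n

    // Sum of cross-deviations and sum of squared deviations of x.
    val sxy = xs.zip(ys).map { case (x, y) => (x - meanX) * (y - meanY) }.sum
    val sxx = xs.map(x => (x - meanX) * (x - meanX)).sum
    require(sxx != 0.0, "x values must not all be identical")

    val slope = sxy / sxx
    val intercept = meanY - slope * meanX
    (intercept, slope)
  }

  def main(args: Array[String]): Unit = {
    // Made-up sample data lying exactly on the line y = 1 + 2x.
    val xs = Seq(1.0, 2.0, 3.0, 4.0)
    val ys = Seq(3.0, 5.0, 7.0, 9.0)
    val (intercept, slope) = fit(xs, ys)
    println(s"intercept = $intercept, slope = $slope")
  }
}
```

A data analysis package would typically wrap this kind of calculation with diagnostics such as residuals and goodness-of-fit measures, which are left out here for brevity.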

Saddle

Saddle is Scala's answer to R and Python's pandas package. It supports reading structured data in a variety of formats, including CSV and HDF5. The data can be loaded into frames and then manipulated much as you would in R or pandas. You can perform statistical analysis with the methods Saddle provides, or build your own on top of its data structures. Saddle is examined in detail in a separate chapter dedicated to it. It can be found at the following website:

https://saddle.github.io/
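As a rough illustration of the frame-based style described above, the following sketch builds two vectors, combines them into a frame, and computes simple statistics. It assumes the Saddle 1.3.x API (`org.saddle` package, with `Vec`, `Frame`, and the `mean` and `sum` operations on vectors) and that the Saddle core library is on the classpath; the column names and values are made up, and other Saddle versions may differ.

```scala
import org.saddle._

// Illustrative sketch only; assumes Saddle 1.3.x is available as a dependency.
object SaddleSketch {
  // Made-up sample data.
  val heights: Vec[Double] = Vec(1.60, 1.75, 1.82)
  val weights: Vec[Double] = Vec(55.0, 72.5, 80.0)

  // Combine the vectors into a frame with named columns.
  val people: Frame[Int, String, Double] =
    Frame("height" -> heights, "weight" -> weights)

  def main(args: Array[String]): Unit = {
    println(people)
    println(s"mean height = ${heights.mean}")
    println(s"total weight = ${weights.sum}")
  }
}
```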

MLlib

MLlib is the machine learning library of the Apache Spark platform. It can be accessed from Scala as well as from Java and Python. It supports basic statistical methods for data analysis, various regression and classification methods, clustering via k-means, dimensionality reduction, and optimization methods. The number of algorithms in the library is constantly growing. MLlib can be found at the following website:

http://spark.apache.org/mllib/
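The k-means clustering mentioned above can be sketched as follows, using MLlib's RDD-based API (`org.apache.spark.mllib`). This is a minimal example under stated assumptions: Spark and MLlib are on the classpath, Spark runs in local mode, and the object name `KMeansSketch`, the sample points, and the parameter values (two clusters, 20 iterations, a fixed seed) are chosen purely for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.linalg.{Vector, Vectors}

// Illustrative sketch only; assumes Spark with MLlib is available as a dependency.
object KMeansSketch {

  // Made-up sample data: two well-separated groups of 2D points.
  val samples: Seq[Vector] = Seq(
    Vectors.dense(0.0, 0.0), Vectors.dense(0.1, 0.2), Vectors.dense(0.2, 0.1),
    Vectors.dense(9.0, 9.0), Vectors.dense(9.1, 9.2), Vectors.dense(9.2, 9.1)
  )

  /** Clusters the samples into k groups and returns one cluster index per sample. */
  def clusterAssignments(sc: SparkContext, k: Int): Seq[Int] = {
    val points = sc.parallelize(samples).cache()
    val model = new KMeans()
      .setK(k)
      .setMaxIterations(20)
      .setSeed(42L) // fixed seed so repeated runs initialise the same way
      .run(points)
    samples.map(p => model.predict(p))
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("KMeansSketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      clusterAssignments(sc, k = 2).zip(samples).foreach {
        case (cluster, point) => println(s"$point -> cluster $cluster")
      }
    } finally {
      sc.stop() // always release the local Spark resources
    }
  }
}
```

On a real cluster the master URL would normally be supplied when the job is submitted rather than hard-coded, and the data would be loaded from distributed storage instead of an in-memory sequence.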