  • Python Data Science Essentials
  • Alberto Boschetti Luca Massaron
  • 2021-08-13 15:19:37

LIBSVM data examples

LIBSVM Data (http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/) is a page that gathers datasets from many other collections. It is maintained by Chih-Jen Lin, one of the authors of LIBSVM, a library for support vector machines (Chih-Chung Chang and Chih-Jen Lin, LIBSVM: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2:27:1--27:27, 2011). The page offers regression, binary classification, and multilabel classification datasets, all stored in the LIBSVM format. The repository is particularly useful if you wish to experiment with support vector machine algorithms, and, again, the data is free to download and use.
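
To make the LIBSVM format less abstract, here is a minimal sketch of how one line of such a file can be read. It assumes the common layout of one example per line, a label followed by space-separated index:value pairs for the nonzero features only. The function name parse_libsvm_line and the sample line are made up for illustration; the sample is not taken from any real dataset.

```python
def parse_libsvm_line(line):
    """Parse one 'label index:value ...' line into (label, {index: value}).

    Hypothetical helper for illustration only; it ignores optional
    extras such as comments or query ids.
    """
    parts = line.split()
    label = float(parts[0])           # the first token is the response
    features = {}
    for item in parts[1:]:            # the remaining tokens are index:value pairs
        index, value = item.split(':')
        features[int(index)] = float(value)
    return label, features


# A made-up example: label -1 with three nonzero features.
label, features = parse_libsvm_line("-1 3:1 11:1 14:0.5")
print(label, features)
```

Because only nonzero features are written out, the format suits sparse data well, which is why loaders typically return a sparse matrix rather than a dense array.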

If you want to load a dataset, first go to the page where you can view the data in your browser. For our example, visit http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary/a1a and note down the address (a1a is a dataset that originally comes from the UC Irvine Machine Learning Repository, another open data repository). Then, you can download the data directly using that address:

In: import urllib.request  # Python 3 replacement for urllib2
url = 'http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary/a1a'
a1a = urllib.request.urlopen(url)  # file-like object opened in binary mode

In: from sklearn.datasets import load_svmlight_file
X_train, y_train = load_svmlight_file(a1a)  # parses the LIBSVM format
print(X_train.shape, y_train.shape)

Out: (1605, 119) (1605,)

The function returns two objects: the training examples as a SciPy sparse matrix and a NumPy array of responses.
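
If you want to inspect these two objects without downloading anything, the following sketch may help. It feeds load_svmlight_file a small in-memory buffer instead of a URL; the three examples in raw are made up for illustration and are not taken from a1a.

```python
from io import BytesIO

from sklearn.datasets import load_svmlight_file

# Three made-up examples in LIBSVM format (not taken from a1a).
raw = b"-1 1:1 4:1\n1 2:1\n-1 3:1 4:1\n"

# The loader accepts any file-like object opened in binary mode.
X, y = load_svmlight_file(BytesIO(raw))

print(X.shape, y.shape)   # number of examples and features, number of responses
print(X.nnz)              # how many nonzero entries the sparse matrix stores

# For small data, a dense copy is easier to read on screen.
X_dense = X.toarray()
print(X_dense)
print(y)
```

With real datasets, converting to a dense array can use a lot of memory, so it is usually better to keep the sparse matrix and pass it directly to estimators that support sparse input.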
