
LIBSVM data examples

LIBSVM Data (http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/) is a page that gathers data from many other collections. It is maintained by Chih-Jen Lin, one of the authors of LIBSVM, a library for support vector machines (Chih-Chung Chang and Chih-Jen Lin, LIBSVM : a library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2:27:1--27:27, 2011). The page offers regression, binary classification, and multilabel classification datasets, all stored in the LIBSVM format. This repository is quite useful if you wish to experiment with support vector machine algorithms, and, again, the data is free for you to download and use.
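To make the LIBSVM format more concrete: each line holds one example, starting with its label and followed by index:value pairs for the nonzero features only (indices usually start from 1). The following is a minimal sketch of how such a line can be read; the helper name parse_libsvm_line and the two sample lines are made up for illustration and are not taken from the a1a dataset or from any library:

```python
def parse_libsvm_line(line):
    """Parse one LIBSVM-format line into (label, {feature_index: value})."""
    parts = line.split()
    label = float(parts[0])              # first token is the response
    features = {}
    for item in parts[1:]:               # remaining tokens are index:value pairs
        index, value = item.split(":")
        features[int(index)] = float(value)
    return label, features

# Two hypothetical examples written in the LIBSVM format
sample = ["-1 3:1 11:1 14:1", "+1 5:0.5 7:2"]
parsed = [parse_libsvm_line(line) for line in sample]
for label, features in parsed:
    print(label, features)
```

Because only nonzero features are written out, the format is compact for sparse data, which is why loaders typically return a sparse matrix rather than a dense array.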

To load a dataset, first go to the web page where you can view the data in your browser. For our example, visit http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary/a1a and note down the address (a1a is a dataset that originally comes from the UC Irvine Machine Learning Repository, another open source data repository). Then, you can download the data directly using that address:

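In: from urllib.request import urlopen
url = 'http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary/a1a'
# Open the remote file; the response is a binary file-like object
a1a = urlopen(url)

In: from sklearn.datasets import load_svmlight_file
# Parse the LIBSVM-format stream into a sparse matrix and a label array
X_train, y_train = load_svmlight_file(a1a)
print(X_train.shape, y_train.shape)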

Out: (1605, 119) (1605,)

In return, you will get two objects: a set of training examples in sparse matrix format and an array of responses.
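To see what these two objects look like without downloading anything, the same loader can be pointed at an in-memory buffer. This is only a sketch: the two-line sample in raw is made up for illustration, and zero_based=False is passed on the assumption that the feature indices start from 1:

```python
from io import BytesIO
from sklearn.datasets import load_svmlight_file

# Hypothetical two-example dataset in LIBSVM format (not part of a1a)
raw = b"-1 3:1 11:1 14:1\n+1 5:0.5 7:2\n"

# The loader accepts a binary file-like object as well as a path
X, y = load_svmlight_file(BytesIO(raw), zero_based=False)

print(X.shape, X.nnz)   # matrix dimensions and number of stored nonzero values
print(y)                # array of responses, one per example

# Convert to a dense array only when the data is small enough to fit in memory
X_dense = X.toarray()
print(X_dense[0])
```

Keeping the matrix sparse is usually preferable for datasets like these, since most feature values are zero and many scikit-learn estimators accept sparse input directly.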
