Unsupervised machine learning

Unsupervised learning is a type of machine learning used to group related data objects and find hidden patterns by drawing inferences from unlabeled datasets, that is, training sets consisting of input data without labels.

Let's see a real-life example. Suppose you have a large collection of non-pirated and totally legal MP3 files in a crowded and massive folder on your hard drive. Now, what if you could build a predictive model that helps you automatically group together similar songs and organize them into your favorite categories, such as country, rap, and rock?

Grouping songs this way means assigning each MP3 to a group, so that it lands in the respective playlist, without any supervision. In classification, by contrast, we assume that we are given a training dataset of correctly labeled data. Unfortunately, we do not always have that luxury when we collect data in the real world.

For example, suppose we would like to divide a huge collection of music into interesting playlists. How can we possibly group together songs if we do not have direct access to their metadata? One possible approach is a mixture of various ML techniques, but clustering is often at the heart of the solution:

Figure 7: Clustering data samples at a glance

In other words, the main objective of unsupervised learning algorithms is to explore unknown or hidden patterns in unlabeled input data. Unsupervised learning also encompasses other techniques that describe the key features of the data in an exploratory way. When no labels are available, clustering techniques are widely used to group data points based on certain similarity measures.
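To make the idea of grouping by similarity concrete, here is a minimal sketch in plain Python. It uses k-means, which is only one of many clustering algorithms and is chosen here as an assumption, since the text does not name a specific one. The song features (a scaled tempo and an energy score) and the helper names euclidean and kmeans are hypothetical, introduced purely for illustration.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, iterations=20, seed=0):
    """Group points into k clusters; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct input points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: euclidean(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return centroids, labels


# Hypothetical per-song features: (tempo in BPM / 100, energy in [0, 1]).
# No genre labels are supplied anywhere.
songs = [(0.70, 0.20), (0.72, 0.25), (0.68, 0.22),
         (1.40, 0.90), (1.45, 0.85), (1.38, 0.95)]

centroids, labels = kmeans(songs, k=2)
for song, label in zip(songs, labels):
    print(song, "-> playlist", label)
```

The first three songs end up in one playlist and the last three in the other, even though the algorithm never sees a label. Which playlist receives index 0 or 1 depends on the random initialization, and deciding what each group means (say, slow versus energetic tracks) is left to the person inspecting the result.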
