
Unsupervised machine learning

Unsupervised learning is a type of machine learning used for grouping related data objects and finding hidden patterns by drawing inferences from unlabeled datasets, that is, training sets consisting of input data without labels.

Let's see a real-life example. Suppose you have a large collection of non-pirated and totally legal MP3 files in a crowded and massive folder on your hard drive. Now, what if you could build a predictive model that helps you automatically group together similar songs and organize them into your favorite categories, such as country, rap, and rock?

This is the act of assigning each item to a group, so that every MP3 ends up in its respective playlist, without any supervision. Classification, by contrast, assumes that you are given a training dataset of correctly labeled data. Unfortunately, we do not always have that luxury when we collect data in the real world.

For example, suppose we would like to divide a huge collection of music into interesting playlists. How can we possibly group together songs if we do not have direct access to their metadata? One possible approach is a mixture of various ML techniques, but clustering is often at the heart of the solution:

Figure 7: Clustering data samples at a glance

In other words, the main objective of unsupervised learning algorithms is to explore unknown or hidden patterns in unlabeled input data. Unsupervised learning also encompasses other techniques that explain the key features of the data in an exploratory way. When labels are missing, clustering techniques are widely used to group data points based on certain similarity measures.
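
To make the idea of grouping by similarity concrete, here is a minimal sketch of k-means, one common clustering algorithm (the text above does not name a specific one, so this choice is an assumption). The song features, their values, and the function name `kmeans` are hypothetical and only serve the illustration; a real system would first extract features from the audio files.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Group points (tuples of floats) into k clusters.

    Returns (centroids, labels), where labels[i] is the cluster index of points[i].
    """
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins the nearest centroid
        # (squared Euclidean distance is the similarity measure here).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, labels

# Hypothetical, made-up features per song: (tempo in BPM / 100, share of bass energy).
songs = [
    (0.70, 0.20), (0.75, 0.25), (0.72, 0.22),   # slower, lighter tracks
    (1.40, 0.80), (1.45, 0.85), (1.38, 0.78),   # faster, bass-heavy tracks
]
centroids, labels = kmeans(songs, k=2)
print(labels)
```

No labels such as "country" or "rock" are supplied; the algorithm only sees the feature values. On this toy data the first three songs end up sharing one cluster index and the last three the other, although which index each group receives depends on the random initialization.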
