Unsupervised learning
The second class of machine learning algorithms is unsupervised learning. Here, we don't label the data beforehand; instead, we let the algorithm come to its own conclusions. One of the most common, and perhaps simplest, examples of unsupervised learning is clustering, a technique that attempts to separate the data into subsets of similar samples.
To illustrate this, let's view the spam-or-not-spam email classification as an unsupervised learning problem. In the supervised case, for each email, we had a set of features and a label (spam or not spam). Here, we'll use the same set of features, but the emails will not be labeled. Instead, we'll ask the algorithm, when given the set of features, to put each sample in one of two separate groups (or clusters). The algorithm will then try to group the samples in such a way that the intra-cluster similarity (that is, the similarity between samples in the same cluster) is high and the similarity between samples in different clusters is low. Different clustering algorithms use different metrics to measure similarity. Some more advanced algorithms don't even require you to specify the number of clusters.
The following graph shows how a set of points can be clustered into three subsets:

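To make this concrete, here is a minimal sketch of such a clustering using scikit-learn's KMeans, one of the simplest clustering algorithms (and one where we do have to specify the number of clusters). The synthetic two-dimensional points stand in for real feature vectors and are purely an illustrative assumption:

```python
import numpy as np
from sklearn.cluster import KMeans

# Three synthetic groups of 2D points (stand-ins for real feature vectors)
rng = np.random.default_rng(seed=42)
points = np.vstack([
    rng.normal(loc=(0, 0), scale=0.5, size=(50, 2)),
    rng.normal(loc=(4, 4), scale=0.5, size=(50, 2)),
    rng.normal(loc=(0, 4), scale=0.5, size=(50, 2)),
])

# Ask k-means for three clusters; it assigns each point to the nearest centroid
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans.fit_predict(points)

print(labels[:10])              # cluster index (0, 1, or 2) for the first ten points
print(kmeans.cluster_centers_)  # the three learned cluster centers
```

Note that the algorithm never sees any labels; the cluster indices it returns are arbitrary identifiers, discovered purely from the geometry of the data.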
Deep learning also uses unsupervised techniques, albeit ones different from clustering. In natural language processing (NLP), we use unsupervised (or semi-supervised, depending on who you ask) algorithms for vector representations of words. The most popular way to do this is called word2vec. For each word, we take its surrounding words (its context) in the text and feed them to a simple neural network. The network produces a numerical vector that contains a lot of information about the word (derived from its context). We then use these vectors instead of the words for various NLP tasks, such as sentiment analysis or machine translation. We'll describe word2vec in Chapter 7, Recurrent Neural Networks and Language Models.
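As a preview, here is a minimal sketch of training word vectors with the gensim library's Word2Vec class (using the gensim 4.x API); the toy corpus and the parameter values are illustrative assumptions, not the book's example:

```python
from gensim.models import Word2Vec

# A toy corpus: each sentence is a list of tokens (real corpora are far larger)
sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "rug"],
    ["cats", "and", "dogs", "are", "animals"],
]

# sg=0 selects the CBOW variant: a word is predicted from its context window,
# matching the description above
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=0, epochs=50)

vector = model.wv["cat"]     # the learned 50-dimensional vector for "cat"
print(vector.shape)          # (50,)

# Words whose vectors are closest to "cat" (noisy on such a tiny corpus)
print(model.wv.most_similar("cat", topn=3))
```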
Another interesting application of unsupervised learning is in generative models, as opposed to discriminative models. We will train a generative model on a large amount of data from a certain domain, such as images or text, and the model will try to generate new data similar to the data it was trained on. For example, a generative model can colorize black-and-white images, change facial expressions in images, or even synthesize images based on a text description. In Chapter 6, Generating Images with GANs and Variational Autoencoders, we'll look at two of the most popular generative techniques, Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs).
The following image depicts one such technique:

In the preceding image, you can see how the authors of StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks, Han Zhang, Tao Xu, Hongsheng Li, Shaoting Zhang, Xiaogang Wang, Xiaolei Huang, and Dimitris Metaxas, use a combination of supervised learning and unsupervised GANs to produce photo-realistic images based on text descriptions.
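To give a flavor of the adversarial idea ahead of Chapter 6, here is a minimal GAN sketch in PyTorch that learns to imitate a one-dimensional Gaussian distribution; the network sizes, learning rate, and target distribution are illustrative assumptions, far simpler than an image model such as StackGAN:

```python
import torch
import torch.nn as nn

# Generator: maps random noise to fake samples
G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
# Discriminator: outputs the probability that a sample is real
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())

loss_fn = nn.BCELoss()
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)

for step in range(2000):
    # Real data: samples from N(4, 1), the distribution we want to imitate
    real = torch.randn(64, 1) + 4.0
    fake = G(torch.randn(64, 8))

    # Discriminator step: label real samples 1, generated samples 0
    opt_d.zero_grad()
    d_loss = loss_fn(D(real), torch.ones(64, 1)) + \
             loss_fn(D(fake.detach()), torch.zeros(64, 1))
    d_loss.backward()
    opt_d.step()

    # Generator step: try to make the discriminator output 1 on fakes
    opt_g.zero_grad()
    g_loss = loss_fn(D(G(torch.randn(64, 8))), torch.ones(64, 1))
    g_loss.backward()
    opt_g.step()

# The mean of generated samples should drift toward 4.0 as training progresses
print(G(torch.randn(1000, 8)).mean().item())
```

The two networks play a game: the discriminator learns to tell real samples from generated ones, while the generator learns to fool it, and in the process the generator's output distribution moves toward the real one.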