官术网_书友最值得收藏!

  • Python Deep Learning
  • Ivan Vasilev Daniel Slater Gianmario Spacagna Peter Roelants Valentino Zocca
  • 496字
  • 2021-07-02 14:31:01

Unsupervised learning

The second class of machine learning algorithms is unsupervised learning. Here, we don't label the data beforehand, but instead we let the algorithm come to its conclusion. One of the most common, and perhaps simplest, examples of unsupervised learning is clustering. This is a technique that attempts to separate the data into subsets.

To illustrate this, let's view the spam-or-not-spam email classification as an unsupervised learning problem. In the supervised case, for each email, we had a set of features and a label (spam or not spam). Here, we'll use the same set of features, but the emails will not be labeled. Instead, we'll ask the algorithm, when given the set of features, to put each sample in one of two separate groups (or clusters). Then the algorithm will try to combine the samples in such a way that the intraclass similarity (which is the similarity between samples in the same cluster) is high and the similarity between different clusters is low. Different clustering algorithms use different metrics to measure similarity. For some more advanced algorithms, you don't have to specify the number of clusters.

In the following graph, we show how a set of points can be classified to form three subsets:

Deep learning also uses unsupervised techniques, albeit different than clustering. In natural language processing (NLP), we use unsupervised (or semi-supervised, depending on who you ask) algorithms for vector representations of words. The most popular way to do this is called word2vec. For each word, we use its surrounding words (or its context) in the text and feed them to a simple neural network. The network produces a numerical vector, which contains a lot of information for the word (derived by the context). We then use these vectors instead of the words for various NLP tasks, such as sentiment analysis or machine translation. We’ll describe word2vec in Chapter 7Recurrent Neural Networks and Language Models.

Another interesting application of unsupervised learning is in generative models, as opposed to discriminative models. We will train a generative model with a large amount of data of a certain domain, such as images or text, and the model will try to generate new data similar to the one we used for training. For example, a generative model can colorize black and white images, change facial expressions in images, or even synthesize images based on a text description. In Chapter 6Generating Images with GANs and Variational Autoencoders, we'll look at two of the most popular generative techniques, Variational Autoencoders and Generative Adversarial Networks (GANs).

The following depicts the techniques:

In the preceding image, you can see how the authors of StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks, Han Zhang, Tao Xu, Hongsheng Li, Shaoting Zhang, Xiaogang Wang, Xiaolei Huang, and Dimitris Metaxas, use a combination of supervised learning and unsupervised GANs to produce photo-realistic images based on text descriptions.

主站蜘蛛池模板: 阳江市| 山东省| 河间市| 芜湖县| 青浦区| 祁连县| 凉山| 临西县| 石首市| 出国| 临颍县| 武清区| 钟山县| 临邑县| 从化市| 班戈县| 蚌埠市| 陆丰市| 田林县| 巫溪县| 浠水县| 宜州市| 吴堡县| 荃湾区| 邯郸市| 常宁市| 蒙山县| 南郑县| 玛曲县| 韩城市| 许昌县| 许昌县| 郓城县| 资源县| 东阳市| 青浦区| 屏东县| 西林县| 温州市| 安龙县| 临猗县|