書名： Python Deep Learning
作者名： Ivan Vasilev Daniel Slater Gianmario Spacagna Peter Roelants Valentino Zocca
本章字數： 789字
更新時間： 2021-07-02 14:31:00

Supervised learning

Supervised learning algorithms are a class of machine learning algorithms that use previously-labeled data to learn its features, so they can classify similar but unlabeled data. Let's use an example to understand this concept better.

Let's assume that a user receives a large amount of emails every day, some of which are important business emails and some of which are unsolicited junk emails, also known as spam. A supervised machine algorithm will be presented with a large body of emails that have already been labeled by a teacher as spam or not spam (this is called training data). For each sample, the machine will try to predict whether the email is spam or not, and it will compare the prediction with the original target label. If the prediction differs from the target, the machine will adjust its internal parameters in such a way that the next time it encounters this sample it will classify it correctly. Conversely, if the prediction was correct, the parameters will stay the same. The more training data we feed to the algorithm, the better it becomes (this rule has caveats, as we'll see next).

In the example we used, the emails had only two classes (spam or not spam), but the same principles apply for tasks with arbitrary numbers of classes. For example, we could train the software on a set of labeled emails where the classes are Personal, Business/Work, Social, or Spam.

In fact, Gmail, the free email service by Google, allows the user to select up to five categories, which are labeled as the following:

Primary: Includes person-to-person conversations
Social: Includes messages from social networks and media-sharing sites
Promotions: Includes marketing emails, offers, and discounts
Updates: Includes bills, bank statements, and receipts
Forums: Includes messages from online groups and mailing lists

In some cases, the outcome may not necessarily be discrete, and we may not have a finite number of classes to classify our data into. For example, we may try to predict the life expectancy of a group of people based on their predetermined health parameters. In this case, the outcome is a continuous function, that is, the number years the person is expected to live, and we don't talk about classification but rather regression.

One way to think of supervised learning is to imagine we are building a function, f, defined over a dataset, which comprises information organized by features. In the case of email classification, the features can be specific words that may appear more frequently than others in spam emails. The use of explicit sex-related words will most likely identify a spam email rather than a business/work email. On the contrary, words such as meeting, business, or presentation are more likely to describe a work email. If we have access to metadata, we may also use the sender's information as a feature. Each email will then have an associated set of features, and each feature will have a value (in this case, how many times the specific word is present in the email body). The machine learning algorithm will then seek to map those values to a discrete range that represents the set of classes, or a real value in the case of regression. The definition of the f function is as follows:

In later chapters, we'll see several examples of either classification or regression problems. One such problem we'll discuss is the classification of handwritten digits (the famous Modified National Institute of Standards and Technology, or MNIST, database). When given a set of images representing 0 to 9, the machine learning algorithm will try to classify each image in one of the 10 classes, wherein each class corresponds to one of the 10 digits. Each image is 28x28 (= 784) pixels in size. If we think of each pixel as one feature, then the algorithm will use a 784-dimensional feature space to classify the digits.

The following screenshot depicts the handwritten digits from the MNIST dataset:

Example of handwritten digits from the MNIST dataset

In the next sections, we'll talk about some of the most popular classical supervised algorithms. The following is by no means an exhaustive list or a thorough description of each machine learning method. We can refer to the book Python Machine Learning by Sebastian Raschka (https://www.packtpub.com/big-data-and-business-intelligence/python-machine-learning). It's a simple review meant to provide the reader with a flavor of the different techniques. Also, at the end of this chapter in the Neural networks section, we'll introduce neural networks and we'll talk about how deep learning differs from the classical machine learning techniques.

官术网_书友最值得收藏!

Python Deep Learning

Supervised learning