Title: Hands-On Deep Learning with Apache Spark
Author: Guglielmo Iozzia
Chapter word count: 1154
Last updated: 2021-07-02 13:34:24

Introducing DL

DL is a subset of ML that can solve particularly hard and large-scale problems in areas such as Natural Language Processing (NLP) and image classification. The term DL is sometimes used interchangeably with ML and AI, but the three are not the same: both ML and DL are subsets of AI. AI is the broader concept, ML is one way of implementing AI, and DL is one way of implementing ML that relies on neural network-based algorithms:

Figure 2.1

AI is considered the ability of a machine (any computer-controlled device or robot) to perform tasks that are typically associated with humans. The concept was introduced in the 1950s, with the goal of reducing the need for human interaction by having the machine do the work. It is mainly applied to the development of systems that perform tasks typically requiring human intellectual processes and/or the ability to learn from past experience.

ML is an approach that's used to implement AI. It is a field of computer science that gives computer systems the ability to learn from data without being explicitly programmed. Basically, it uses algorithms to find patterns in data and then uses a model that recognizes those patterns to make predictions on new data. The following diagram shows the typical process that's used to train and build a model:

Figure 2.2
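
The train-then-predict workflow described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the data is made up, and the "model" is a 1-nearest-neighbor lookup chosen for brevity, not a technique prescribed by the book.

```python
# Hypothetical toy data (not from the book): (feature, label) pairs.
data = [(1.0, "low"), (1.5, "low"), (2.0, "low"), (2.5, "low"),
        (6.0, "high"), (6.5, "high"), (7.0, "high"), (7.5, "high")]

# 1. Split the labeled data into a training set and a held-out test set.
train = data[0::2]  # every other example, starting at index 0
test = data[1::2]   # the remaining examples


def predict(model, x):
    """Return the label of the training example whose feature is nearest to x."""
    nearest = min(model, key=lambda example: abs(example[0] - x))
    return nearest[1]


# 2. "Train": this toy model simply memorizes the training examples.
model = list(train)

# 3. Evaluate the model on data it has not seen during training.
correct = sum(predict(model, x) == label for x, label in test)
accuracy = correct / len(test)

# 4. Use the model to make a prediction on new, unlabeled data.
new_prediction = predict(model, 6.8)

print(accuracy, new_prediction)
```

A real pipeline would add data cleaning, feature engineering, and model tuning around these four steps, but the overall shape (split, train, evaluate, predict) stays the same.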

ML can be classified into three types:

Supervised learning algorithms, which use labeled data
Unsupervised learning algorithms, which find patterns, starting from unlabeled data
Semi-supervised learning, which uses a mix of the two (labeled and unlabeled data)

At the time of writing, supervised learning is the most common type of ML algorithm. Supervised learning can be divided into two groups – regression and classification problems.

The following graph shows a simple regression problem:

Figure 2.3

As you can see, there is one input (or feature), Size, and one output (or target), Price. The data points are used to fit a curve, which can then be used to predict the price of a property from its size.
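
As a rough illustration of this kind of regression, the following sketch fits a straight line by ordinary least squares, treating Size as the input and Price as the value to predict. The numbers and the `fit_line` helper are hypothetical, and a straight line is an assumption here; the curve in the figure may have been fitted differently.

```python
# Hypothetical data (not from the book): size in square meters, price in thousands.
sizes = [50.0, 70.0, 90.0, 110.0, 130.0]
prices = [150.0, 200.0, 260.0, 310.0, 360.0]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(sizes, prices)

# Predict the price of a property whose size was not in the training data.
predicted_price = slope * 100.0 + intercept
print(slope, intercept, predicted_price)
```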

The following graph shows an example of supervised classification:

Figure 2.4

The dataset is labeled with benign (circles) and malignant (crosses) tumors for breast cancer patients. A supervised classification algorithm attempts, by fitting a line through the data, to separate the tumors into two classes. Future data would then be classified as benign or malignant depending on which side of that line it falls. The case in the preceding graph has only two discrete outputs, but there are cases where there could be more than two classes as well.
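
One simple way to learn such a separating line is the perceptron algorithm. The sketch below is a minimal illustration, assuming two made-up numeric features per tumor and labels of -1 (benign) and +1 (malignant); neither the data nor the choice of algorithm comes from the book.

```python
# Hypothetical, linearly separable data: two features per example.
points = [(1, 1), (2, 1), (1, 2), (2, 2),   # benign
          (4, 4), (5, 4), (4, 5), (5, 5)]   # malignant
labels = [-1, -1, -1, -1, 1, 1, 1, 1]


def predict(weights, bias, point):
    """Classify a point by the side of the line w . x + b = 0 it falls on."""
    score = sum(w * x for w, x in zip(weights, point)) + bias
    return 1 if score > 0 else -1


def train_perceptron(points, labels, max_epochs=1000):
    """Adjust the line each time a training example is misclassified."""
    weights = [0.0] * len(points[0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for point, label in zip(points, labels):
            if predict(weights, bias, point) != label:
                weights = [w + label * x for w, x in zip(weights, point)]
                bias += label
                mistakes += 1
        if mistakes == 0:  # every training example is on the correct side
            break
    return weights, bias


weights, bias = train_perceptron(points, labels)

# Future data is classified according to the learned line.
print(predict(weights, bias, (1.5, 1.5)), predict(weights, bias, (4.5, 4.5)))
```

The perceptron only converges when the classes can be separated by a straight line, which holds for this toy data by construction.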

In supervised learning, labeled datasets help the algorithm determine what the correct answer is. In unsupervised learning, by contrast, the algorithm is provided with an unlabeled dataset and must uncover structures and patterns in the data by itself. In the following graphs (the graph on the right can be found at https://leonardoaraujosantos.gitbooks.io/artificial-inteligence/content/Images/supervised_unsupervised.png), no information is provided about the meaning of each data point. We ask the algorithm to find a structure in the data in a way that is independent of supervision. An unsupervised learning algorithm could find that there are two distinct clusters and then perform straight-line classification between them:

Figure 2.5
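
To make the idea of finding clusters in unlabeled data concrete, here is a minimal k-means sketch with two clusters. The points, the starting centroids, and the `kmeans` helper are all hypothetical; k-means is just one common clustering technique and is not named in the text above.

```python
# Hypothetical unlabeled data: no class information is attached to any point.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0),
          (8.0, 8.0), (8.5, 9.0), (9.0, 8.0)]


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of the points assigned to it."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [squared_distance(p, c) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        centroids = [
            tuple(sum(coords) / len(cluster) for coords in zip(*cluster))
            if cluster else centroid  # keep the old centroid if a cluster is empty
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


# Assumption: start from the first and last points as initial centroids.
centroids, clusters = kmeans(points, [points[0], points[-1]])
print(centroids)
print(clusters)
```

Once the two groups are found, a line between the two centroids could serve as the boundary for classifying new points.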

DL is the name for multilayered neural networks, which are networks that are composed of several hidden layers of nodes between the input and output. DL is a refinement of Artificial Neural Networks (ANNs), which emulate (even if not closely) how the human brain learns and solves problems. ANNs consist of an interconnected group of artificial neurons, loosely modeled on the way neurons are connected in the human brain. The following diagram represents the general model of ANNs:

Figure 2.6

A neuron is the atomic unit of an ANN. It receives a given number of inputs (x_i), executes a computation on them, and finally sends the output to other neurons in the same network. The weights (w_i), or parameters, represent the strength of each input connection and can assume positive or negative values. The net input can be calculated as follows:

y_in = x₁ × w₁ + x₂ × w₂ + x₃ × w₃ + … + x_n × w_n

The output can be calculated by applying the activation function over the net input:

y = f(y_in)

The activation function allows an ANN to model complex non-linear patterns that simpler models may not represent correctly.
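
The two formulas above translate almost directly into code. The sketch below computes the net input as a weighted sum and then applies an activation function; the sigmoid is used purely as an example, since the text does not name a specific activation, and the input and weight values are made up.

```python
import math


def net_input(inputs, weights):
    """y_in = x_1 * w_1 + x_2 * w_2 + ... + x_n * w_n"""
    return sum(x * w for x, w in zip(inputs, weights))


def sigmoid(z):
    """One common non-linear activation function (an assumption here)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical inputs and weights; weights may be positive or negative.
inputs = [1.0, 2.0, 3.0]
weights = [0.5, -0.25, 0.1]

y_in = net_input(inputs, weights)  # weighted sum of the inputs
y = sigmoid(y_in)                  # y = f(y_in)
print(y_in, y)
```

Without the non-linear `sigmoid` step, stacking neurons would still produce only a linear function of the inputs, which is why the activation matters.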

The following diagram represents a neural network:

Figure 2.7

The first layer is the input layer, where features are fed into the network. The last one is the output layer. Any layer in between is a hidden layer. The term DL is used because of the multiple levels of hidden layers that these networks use to resolve complex non-linear problems. At each layer, every node receives input data and the associated weights, and outputs a confidence score to the nodes of the next layer. This process continues until the output layer is reached, where the error of the score is calculated. The errors are then sent back through the network and the weights are adjusted to improve the model (this is called backpropagation, and it happens as part of an optimization process called gradient descent, which we will discuss in Chapter 6, Recurrent Neural Networks). There are many variations of neural networks, and we will look at them in the next section.
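
The forward pass, error calculation, and weight adjustment described above can be sketched for a very small network. Everything specific in this example is an assumption made for illustration: two inputs, one hidden layer with two nodes, a single output node, sigmoid activations, a squared-error measure, no bias terms, and hand-picked starting weights and learning rate.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """Propagate the input through the hidden layer to the output node."""
    hidden = [sigmoid(sum(xi * wi for xi, wi in zip(x, ws))) for ws in w_hidden]
    output = sigmoid(sum(h * w for h, w in zip(hidden, w_out)))
    return hidden, output


def loss(x, target, w_hidden, w_out):
    """Squared error of the network output for one example."""
    _, output = forward(x, w_hidden, w_out)
    return 0.5 * (output - target) ** 2


def train_step(x, target, w_hidden, w_out, lr):
    """One backpropagation pass followed by one gradient descent update."""
    hidden, output = forward(x, w_hidden, w_out)
    # Error signal at the output node (derivative of the loss w.r.t. its net input).
    delta_out = (output - target) * output * (1.0 - output)
    # Send the error back to each hidden node through its outgoing weight.
    delta_hidden = [delta_out * w * h * (1.0 - h) for w, h in zip(w_out, hidden)]
    # Move every weight a small step against its gradient.
    new_w_out = [w - lr * delta_out * h for w, h in zip(w_out, hidden)]
    new_w_hidden = [[w - lr * d * xi for w, xi in zip(ws, x)]
                    for ws, d in zip(w_hidden, delta_hidden)]
    return new_w_hidden, new_w_out


# Hypothetical training example and starting weights.
x, target = [1.0, 0.0], 1.0
w_hidden = [[0.1, -0.2], [0.3, 0.4]]
w_out = [0.2, -0.1]

initial_loss = loss(x, target, w_hidden, w_out)
for _ in range(100):
    w_hidden, w_out = train_step(x, target, w_hidden, w_out, lr=0.5)
final_loss = loss(x, target, w_hidden, w_out)

print(initial_loss, final_loss)
```

Real DL frameworks compute these gradients automatically and process many examples at once, but the loop has the same three parts: forward pass, error, and weight update.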

Before moving on, a final observation. You're probably wondering why most of the concepts behind AI, ML, and DL have been around for decades, but have only been hyped up in the past 4 or 5 years. There are several factors that accelerated their implementation and made it possible to move them from theory to real-world applications:

Cheaper computation: In the last few decades, hardware has been a constraining factor for AI/ML/DL. Recent advances in both hardware (coupled with improved tools and software frameworks) and new computational models (including those around GPUs) have accelerated AI/ML/DL adoption.
Greater data availability: AI/ML/DL needs a huge amount of data to learn. The digital transformation of society is providing tons of raw material to move forward quickly. Big data now comes from diverse sources such as IoT sensors, social and mobile computing, smart cars, healthcare devices, and many others that are or will be used to train models.
Cheaper storage: The increased amount of available data means that more space is needed for storage. Advances in hardware, cost reduction, and improved performance have made the implementation of new storage systems possible, all without the typical limitations of relational databases.
More advanced algorithms: Less expensive computation and storage enable the development and training of more advanced algorithms that also have impressive accuracy when solving specific problems such as image classification and fraud detection.
More, and bigger, investments: Last but not least, investment in AI is no longer confined to universities or research institutes, but comes from many other entities, such as tech giants, governments, start-ups, and large enterprises across almost every business area.
