
Supervised learning algorithms

Supervised algorithms rely on human knowledge, in the form of labeled examples, to complete their tasks. Let's say we have a dataset related to loan repayment that contains several demographic indicators, as well as whether each loan was paid back:

The Paid column, which tells us whether a loan was paid back, is called the target; it is what we would like to predict. The data that contains information about the applicant's background is known as the features of the dataset. In supervised learning, algorithms learn to predict the target from the features. In other words, which indicators give a high probability that an applicant will pay back a loan? Mathematically, this process looks as follows:

y = f(x) + ε

Here, we are saying that our label y is a function of the input features x, plus some amount of error ε that is caused naturally by the dataset. We know that a certain set of features will likely produce a certain outcome. In supervised learning, we set up an algorithm to learn the function f that correctly maps a set of features to an outcome.
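To make the idea of a label being a function of the features plus some error more concrete, here is a minimal sketch that is not taken from the book. It assumes a made-up linear relationship (`true_function`), generates noisy labels from it, and then recovers an estimate of that function with ordinary least squares. All names and numbers here are hypothetical and chosen only for illustration.

```python
import random


def true_function(x):
    # Hypothetical "hidden" relationship between a feature and its label
    return 2.0 * x + 1.0


def make_dataset(n, noise_std, seed=0):
    # Labels are the true function of the feature plus random Gaussian error
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [true_function(x) + rng.gauss(0.0, noise_std) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    # Ordinary least squares for a single feature: estimate slope and intercept
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


xs, ys = make_dataset(n=200, noise_std=0.5)
slope, intercept = fit_line(xs, ys)
print(f"learned function: y = {slope:.2f} * x + {intercept:.2f}")
```

The learned slope and intercept should land close to the values used in `true_function`, but not exactly on them, because the error term prevents a perfect recovery of the underlying function.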

To illustrate how supervised learning works, we are going to use a famous toy dataset in the machine learning field, the Iris dataset. It has four features: Sepal Length, Sepal Width, Petal Length, and Petal Width. In this dataset, our target variable (sometimes called a label) is Name. The dataset is available in the GitHub repository that corresponds with this chapter:

import pandas as pd

# Load the Iris dataset into a DataFrame
data = pd.read_csv("iris.csv")
# Preview the first five rows
data.head()

The preceding code generates the following output:

Now that we have our data ready to go, let's jump into some supervised learning!
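Before fitting anything, it can help to separate the features from the target explicitly. The following is a small sketch rather than the chapter's own code: it uses four well-known Iris rows embedded as text in place of `iris.csv`, and the header names (`SepalLength`, `SepalWidth`, `PetalLength`, `PetalWidth`, `Name`) are assumptions that may not match the file in the repository.

```python
import io

import pandas as pd

# A few Iris rows as a stand-in for iris.csv; the header names are assumed
csv_text = """SepalLength,SepalWidth,PetalLength,PetalWidth,Name
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
"""
sample = pd.read_csv(io.StringIO(csv_text))

# The target is the column we want to predict; everything else is a feature
target_column = "Name"
features = sample.drop(columns=[target_column])
target = sample[target_column]

print(features.columns.tolist())
print(target.unique())
```

With the real file, the same two lines that build `features` and `target` should apply to the loaded DataFrame, provided `target_column` is set to whatever the label column is actually called.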
