Supervised learning

Supervised learning is the simplest and best-known machine learning task. It relies on a set of predefined examples in which the category of each input is already known, as shown in the following diagram:

The preceding diagram shows a typical supervised learning workflow. An actor (for example, a data scientist or data engineer) performs Extract, Transform, Load (ETL) and the necessary feature engineering (including feature extraction, selection, and so on) to obtain data with features and labels that can be fed into the model. The actor then splits the data into training, validation (also called development), and test sets. The training set is used to train an ML model, the validation set is used to check the trained model for overfitting and to tune regularization, and the test set (that is, unseen data) is used to evaluate the model's performance.

If the performance is not satisfactory, the actor can tune the model further through hyperparameter optimization to find the best configuration. Finally, the actor deploys the best model in a production-ready environment. The following diagram summarizes these steps in a nutshell:
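The train, validate, tune, and test loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the workflow's actual implementation: the toy dataset, the k-nearest-neighbors model, the candidate values of `k`, and the helper names `knn_predict` and `accuracy` are all hypothetical.

```python
import random
from collections import Counter

def knn_predict(train, x, k):
    """Predict the majority label among the k training points nearest to x."""
    neighbors = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return Counter(label for _, label in neighbors).most_common(1)[0][0]

def accuracy(train, data, k):
    """Fraction of points in `data` whose label is predicted correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in data)
    return hits / len(data)

# Hypothetical labeled data: one numeric feature, label 1 when the
# feature exceeds 5, with roughly 10% of the labels flipped as noise.
rng = random.Random(0)
data = []
for _ in range(200):
    x = rng.uniform(0, 10)
    y = int(x > 5)
    if rng.random() < 0.1:
        y = 1 - y
    data.append((x, y))

# The points are generated in random order, so slicing gives a random split.
train, val, test = data[:120], data[120:140], data[140:]

# Hyperparameter optimization: choose k using the validation set only.
candidates = [1, 3, 5, 9, 15]
best_k = max(candidates, key=lambda k: accuracy(train, val, k))

# Final evaluation on the held-out test set, touched exactly once.
test_accuracy = accuracy(train, test, best_k)
print(f"best k = {best_k}, test accuracy = {test_accuracy:.2f}")
```

The test set plays no part in choosing `k`, so the final accuracy remains an estimate of performance on unseen data.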

Over the whole life cycle, several actors (for example, a data engineer, a data scientist, or an ML engineer) might be involved, performing each step independently or collaboratively. Supervised learning covers classification and regression tasks. Classification predicts which class a data point belongs to, that is, a discrete label for the class attribute. Regression, on the other hand, predicts a continuous value, that is, a numeric prediction of the target attribute.
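To make the distinction concrete, the sketch below fits one regression model and one classifier on the same feature. Everything here is assumed for illustration: the toy numbers, the labels, and the helper names `fit_line`, `fit_centroids`, and `classify`. Least squares and nearest centroid are used only because they are short to write.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x

def fit_centroids(xs, labels):
    """Compute the mean feature value of each class."""
    sums, counts = {}, {}
    for x, label in zip(xs, labels):
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def classify(centroids, x):
    """Return the class whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))

# Hypothetical feature shared by both tasks.
hours = [1.0, 2.0, 3.0, 4.0]

# Regression: the target is a continuous value.
scores = [52.0, 54.0, 56.0, 58.0]
slope, intercept = fit_line(hours, scores)
predicted_score = slope * 5.0 + intercept

# Classification: the target is a discrete label.
labels = ["low", "low", "high", "high"]
centroids = fit_centroids(hours, labels)
predicted_label = classify(centroids, 3.6)

print(predicted_score, predicted_label)
```

The regression model returns a number that can fall anywhere on a continuous scale, whereas the classifier can only return one of the labels it saw during training.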

In supervised learning, the input dataset is split randomly into three sets before training, for example, 60% for the training set, 10% for the validation set, and the remaining 30% for the test set.
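A random 60/10/30 split like this one can be sketched with the Python standard library alone. The function name `split_dataset`, its parameters, and the toy examples are assumptions made for illustration. The fixed seed is there only to make the shuffle reproducible.

```python
import random

def split_dataset(examples, train_frac=0.6, val_frac=0.1, seed=42):
    """Shuffle the examples and split them into train, validation, and test sets.

    Whatever remains after the training and validation fractions
    goes to the test set.
    """
    shuffled = list(examples)  # copy so the caller's data is left untouched
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

# Hypothetical dataset of 100 (feature, label) pairs.
examples = [(i, i % 2) for i in range(100)]
train_set, val_set, test_set = split_dataset(examples)
print(len(train_set), len(val_set), len(test_set))
```

Shuffling before slicing matters when the source data is ordered (for example, sorted by label or by date), because an unshuffled split would then give the three sets different distributions.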
