官术网_书友最值得收藏!

Introduction

In the previous chapter, you saw how to build a binary classifier using the famous Logistic Regression algorithm. A binary classifier can only take two different values for its response variables, such as 0 and 1 or yes and no. A multiclass classification task is just an extension. Its response variable can have more than two different values.

In the data science industry, quite often you will face multiclass classification problems. For example, if you were working for Netflix or any other streaming platform, you would have to build a model that could predict the user rating for a movie based on key attributes such as genre, duration, or cast. A potential list of rating values may be: Hate it, Dislike it, Neutral, Like it, Love it. The objective of the model would be to predict the right rating from those five possible values.

Multiclass classification doesn't always mean the response variable will be text. In some datasets, the target variable may be encoded into a numerical form. Taking the same example as discussed, the rating may be coded from 1 to 5: 1 for Hate it, 2 for Dislike it, 3 for Neutral, and so on. So, it is important to understand the meaning of this response variable first before jumping to the conclusion that this is a regression problem.

In the next section, we will be looking at training our first Random Forest classifier.

主站蜘蛛池模板: 黔江区| 兴安盟| 沙田区| 应城市| 大安市| 阿瓦提县| 葵青区| 汝阳县| 桃园市| 河源市| 肇州县| 三亚市| 贵德县| 闵行区| 县级市| 正蓝旗| 阿拉善盟| 新营市| 稷山县| 文成县| 蓬莱市| 同仁县| 堆龙德庆县| 绥棱县| 金湖县| 文安县| 漾濞| 平陆县| 香港 | 洪洞县| 阿克陶县| 察隅县| 息烽县| 简阳市| 方城县| 兴城市| 德钦县| 邵阳市| 闻喜县| 桂阳县| 公主岭市|