RL, supervised learning, and unsupervised learning

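What is the difference between reinforcement learning (RL), supervised learning, and unsupervised learning? All three involve developing rules about an unknown environment from data, but they differ in the kind of data they use: labeled examples, unlabeled observations, or feedback gathered by interacting with the environment. The following is a simple diagram charting the different terms: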

Take a look at the following definitions of each term: 

  • Supervised learning trains an algorithm on labeled training data, uses the trained model to generate predictions for held-out test data whose labels are hidden from it, and then compares those predictions to the actual labels. The goal of supervised learning is to predict class labels (classification) or numerical values (regression) for unseen data.
  • Unsupervised learning looks for similarities between different observations of unlabeled data. An unsupervised learning algorithm looks for observations that fit together along axes of similarity. The goal of unsupervised learning is to group together similar observations based on relevant criteria. 
  • RL seeks to optimize a variable, usually a cumulative reward, under a set of constraints. The learner in RL, called an agent, looks for an optimal path to a goal by acting in its environment and observing the results. The goal of RL is therefore to find a set of actions, mapped to a set of states, that leads to the best possible outcome in a situation that we have limited information about.
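
The first two definitions can be made concrete with a small sketch. The data, the choice of a 1-nearest-neighbor classifier, and the choice of k-means clustering are all assumptions made for illustration, not methods prescribed by the text; the helper names (`sq_dist`, `predict`, `kmeans`) are hypothetical.

```python
# Toy data, invented for illustration: (features, label) pairs.
train = [((1.0, 1.2), "low"), ((0.8, 1.0), "low"),
         ((4.0, 4.2), "high"), ((4.3, 3.9), "high")]
test = [((1.1, 0.9), "low"), ((3.8, 4.1), "high")]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# --- Supervised: learn from labeled data, then score on held-out labels ---
def predict(point):
    """Return the label of the closest training observation (1-NN)."""
    _, label = min(train, key=lambda pair: sq_dist(pair[0], point))
    return label


predictions = [predict(x) for x, _ in test]
accuracy = sum(p == y for p, (_, y) in zip(predictions, test)) / len(test)


# --- Unsupervised: group the same points with the labels thrown away ---
points = [x for x, _ in train + test]


def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then
    move each centroid to the mean of its assigned points."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            idx = min(range(len(centroids)),
                      key=lambda i: sq_dist(centroids[i], p))
            clusters[idx].append(p)
        centroids = [
            tuple(sum(c) / len(members) for c in zip(*members))
            if members else old  # keep an empty cluster's centroid in place
            for members, old in zip(clusters, centroids)
        ]
    return centroids, clusters


centroids, clusters = kmeans(points, centroids=[points[0], points[-1]])
```

The supervised half needs the labels both to train and to measure accuracy, while the unsupervised half recovers two groups from the feature values alone and never learns what the groups are called.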

The primary difference between these three learning methods is in the type of question being asked: 

  • Supervised learning works well for classification and regression problems (for example, whether a customer will buy a product or how much they might spend).
  • Unsupervised learning works well for problems dealing with association (for example, what products customers might buy together) and anomaly detection.
  • RL works best when there is a specific value to be optimized and a function for optimizing it that can be discovered within the problem (for example, how to maximize the number of times a user clicks on links or downloads apps based on the advertisements that we show them).
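
The advertising example in the last item can be sketched as a multi-armed bandit, a simplified RL setting with a single state. The epsilon-greedy strategy, the click probabilities, and the function name `run_bandit` are assumptions chosen for illustration; a real ad system would not know the click probabilities and would only observe the clicks.

```python
import random


def run_bandit(click_probs, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent choosing which ad to show at each step.

    click_probs holds the hidden click probability of each ad; the agent
    never reads it directly and only sees the 0/1 reward it produces.
    """
    rng = random.Random(seed)
    counts = [0] * len(click_probs)    # times each ad was shown
    values = [0.0] * len(click_probs)  # estimated click rate per ad
    total_clicks = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: show a random ad to keep learning about all of them.
            ad = rng.randrange(len(click_probs))
        else:
            # Exploit: show the ad with the best estimated click rate.
            ad = max(range(len(click_probs)), key=lambda i: values[i])
        reward = 1 if rng.random() < click_probs[ad] else 0
        counts[ad] += 1
        # Incremental update of the running mean reward for this ad.
        values[ad] += (reward - values[ad]) / counts[ad]
        total_clicks += reward
    return values, counts, total_clicks


values, counts, total_clicks = run_bandit([0.02, 0.05, 0.30])
```

Here the value being optimized is the total number of clicks, and the function the agent discovers is its table of estimated click rates. Over enough steps the agent typically shows the highest-paying ad most often, although any single run depends on the random seed.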

Note that this list of uses for each method is not exhaustive; we are only presenting well-known examples of the type of problem each method tends to work well for.

There are many other examples of questions that we might ask and other machine learning algorithms that we might use to solve them, but understanding the broad similarities and differences between these three major types will be useful for us going forward. 
