
Choosing the right algorithm

In the previous section, we learned the differences between the various types of machine learning algorithms and the basic principles that underlie each technique. Now it's time to ask ourselves the following question: What is the right algorithm for my needs?
Unfortunately, there is no single answer that works for everyone, other than the generic one: It depends. But what does it depend on? It mainly depends on the data available to us: its size, quality, and nature. It depends on what we want to do with the answer. It depends on how the algorithm has been implemented as instructions for the computer. It depends on how much time we have. There is no best method and no one-size-fits-all solution. The only way to be sure that the chosen algorithm is the right one is to try it.

However, before trying anything, we can perform a preliminary analysis to understand what is most suitable for our needs. Starting from what we have (the data), the tools available to us (the algorithms), and the objectives we set for ourselves (the results), we can obtain useful information about the road ahead.

If we start from what we have (the data), the first step is to categorize the problem itself, and two options are available:

  • Classify based on input: We have a supervised learning problem if the input data is labeled. If the input data is not labeled but we want to find the structure of the system, then it is an unsupervised learning problem. Finally, if our goal is to optimize an objective function by interacting with the environment, it is a reinforcement learning problem.
  • Classify based on output: If the output of the model is a number, we have a regression problem. If the output of the model is a class, we have a classification problem. Finally, if the output of the model is a set of groups of the input data, we have a clustering problem.
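
The two options above can be sketched as a pair of small lookup functions. This is only an illustrative sketch in Python, not code from the book: the function names `categorize_by_input` and `categorize_by_output`, their parameters, and the string labels are all hypothetical choices made for this example.

```python
def categorize_by_input(has_labels: bool, interacts_with_environment: bool = False) -> str:
    """Suggest a learning paradigm from the kind of input we have (hypothetical helper)."""
    if interacts_with_environment:
        # The goal is to optimize an objective function through interaction.
        return "reinforcement learning"
    if has_labels:
        # Labeled input data points to a supervised problem.
        return "supervised learning"
    # No labels: we look for structure in the data instead.
    return "unsupervised learning"


def categorize_by_output(output_kind: str) -> str:
    """Suggest a problem type from the kind of output the model must produce."""
    # The keys are labels assumed for this sketch, not a standard vocabulary.
    problem_types = {
        "number": "regression",
        "class": "classification",
        "groups": "clustering",
    }
    if output_kind not in problem_types:
        raise ValueError(f"unknown output kind: {output_kind!r}")
    return problem_types[output_kind]


if __name__ == "__main__":
    print(categorize_by_input(has_labels=True))
    print(categorize_by_output("groups"))
```

In practice the two views are combined: for example, labeled input with a numeric output suggests a supervised regression problem.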

The following figure shows the two options available for categorizing the problem:


Figure 1.9: Preliminary analysis

After categorizing the problem, we can analyze the tools available to solve it. In this way, we can identify the algorithms that are applicable and focus our study on how to apply them to our problem.

Having identified the tools, we need to evaluate their performance. To do this, we simply apply the selected algorithms to the datasets at our disposal. Then, using a set of carefully selected evaluation criteria, we compare the performance of each algorithm.
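
The comparison step can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the book's own procedure: the toy two-dimensional dataset, the two candidate algorithms (a majority-class baseline and a 1-nearest-neighbor classifier), the single hold-out split, and accuracy as the only evaluation criterion.

```python
from typing import Callable, Dict, List, Tuple

Point = Tuple[float, float]
Model = Callable[[Point], int]
Fitter = Callable[[List[Point], List[int]], Model]


def fit_majority(X: List[Point], y: List[int]) -> Model:
    """Baseline: always predict the most frequent training label."""
    majority = max(sorted(set(y)), key=y.count)
    return lambda point: majority


def fit_nearest_neighbor(X: List[Point], y: List[int]) -> Model:
    """1-nearest neighbor: predict the label of the closest training point."""
    def predict(point: Point) -> int:
        distances = [(px - point[0]) ** 2 + (py - point[1]) ** 2 for px, py in X]
        return y[distances.index(min(distances))]
    return predict


def accuracy(model: Model, X: List[Point], y: List[int]) -> float:
    """Evaluation criterion: fraction of correctly predicted labels."""
    correct = sum(model(point) == label for point, label in zip(X, y))
    return correct / len(y)


def compare(candidates: Dict[str, Fitter], X_train, y_train, X_test, y_test) -> Dict[str, float]:
    """Fit every candidate on the same training data and score it on the same test data."""
    return {
        name: accuracy(fit(X_train, y_train), X_test, y_test)
        for name, fit in candidates.items()
    }


# Toy data invented for this sketch: class 0 near (0, 0), class 1 near (5, 5).
X_train = [(0.0, 0.0), (0.5, 1.0), (1.0, 0.5), (5.0, 5.0), (5.5, 4.5), (4.5, 5.5), (6.0, 5.0)]
y_train = [0, 0, 0, 1, 1, 1, 1]
X_test = [(0.2, 0.3), (1.2, 0.8), (4.8, 5.2), (5.6, 5.1)]
y_test = [0, 0, 1, 1]

candidates = {"majority baseline": fit_majority, "1-nearest neighbor": fit_nearest_neighbor}
scores = compare(candidates, X_train, y_train, X_test, y_test)
for name, score in scores.items():
    print(f"{name}: accuracy = {score:.2f}")
```

The key design point is that every candidate is trained and scored on exactly the same data with the same criterion, so that differences in the scores reflect the algorithms rather than the evaluation setup. On real data, a single split and a single metric are rarely enough, and the criteria should be chosen to match the objective.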
