
Building machine learning models step by step

When developing an application that uses machine learning, we follow a procedure made up of the following steps:

  • Collecting the data: Everything starts with the data, but where does so much data come from? In practice, it is gathered through lengthy procedures such as measurement campaigns or face-to-face interviews. In all cases, the data is stored in a database so that it can later be analyzed to derive knowledge.
  • Preparing the data: Once the data has been collected, we must make sure it is in a format usable by the algorithm we want to apply, which may require some formatting. Some algorithms need data in an integer format, others require strings, and others need a special format. We will get to this later, but the formatting is usually simple compared to the data collection.
  • Exploring the data: At this point, we can look at the data to verify that it is usable and that we do not have a bunch of empty values. Plotting the data in one, two, or three dimensions helps us recognize patterns and spot data points that are vastly different from the rest of the set.
  • Training the algorithm: Now, let's get serious. In this step, the machine learning algorithm builds the model from the data; this is the training. The model starts to extract knowledge from the large amounts of data we have collected, which have not been interpreted so far. In unsupervised learning there is no target value, so training consists of finding structure in the data rather than learning to predict known outputs.
  • Testing the algorithm: In this step, we use what was learned in the previous step to see whether the model actually works. Evaluating an algorithm means checking how well the model approximates the real system. In supervised learning, we have known values against which to compare the predictions. In unsupervised learning, we may need other metrics to evaluate success. In both cases, if we are not satisfied, we can return to the previous steps, change some things, and retry the test.

  • Evaluating the algorithm: We have reached the point where we can apply what has been done so far. We assess the approximation ability of the model by applying it to real data. The model, previously trained and tested, is evaluated in this phase.
  • Improving algorithm performance: Finally, we can focus on the finishing steps. We have verified that the model works and evaluated its performance, so now we can analyze the whole process to identify any room for improvement.
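
As an illustration of the preparing and exploring steps, here is a minimal Python sketch. The records, the column names (height_cm, weight_kg), and the helper functions are all hypothetical, and the outlier rule (distance from the median measured in median absolute deviations) is just one simple choice among many.

```python
import statistics

# Hypothetical raw records, as they might come out of a database export.
raw_rows = [
    {"height_cm": "170", "weight_kg": "65"},
    {"height_cm": "182", "weight_kg": ""},      # missing weight
    {"height_cm": "165", "weight_kg": "58"},
    {"height_cm": "175", "weight_kg": "72"},
    {"height_cm": "1780", "weight_kg": "70"},   # likely a data-entry error
]

def to_float(text):
    """Convert a string field to float; return None for empty values."""
    text = text.strip()
    return float(text) if text else None

def prepare(rows):
    """Preparing the data: turn string fields into numeric values."""
    return [{key: to_float(value) for key, value in row.items()} for row in rows]

def count_missing(rows):
    """Exploring the data: count empty values per column."""
    counts = {}
    for row in rows:
        for key, value in row.items():
            counts[key] = counts.get(key, 0) + (value is None)
    return counts

def find_outliers(values, factor=3.0):
    """Flag values far from the median, in median absolute deviations."""
    present = [v for v in values if v is not None]
    median = statistics.median(present)
    mad = statistics.median(abs(v - median) for v in present)
    if mad == 0:
        return []
    return [v for v in present if abs(v - median) > factor * mad]

rows = prepare(raw_rows)
print(count_missing(rows))
print(find_outliers([row["height_cm"] for row in rows]))
```

In a real project, a plot of each column would normally complement these numeric checks.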
Before applying the machine learning algorithm to our data, it is worth devoting some time to setting up the workflow.
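
To make the training and testing steps concrete, the following is a rough sketch in plain Python for the supervised case. It assumes a synthetic dataset, a one-variable least-squares line as the model, and mean absolute error as the test metric; none of these choices come from the text, and the function names are hypothetical.

```python
import random

def split(data, test_fraction=0.25, seed=0):
    """Shuffle the examples and hold out a fraction for testing."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def train(examples):
    """Training: fit y = slope * x + intercept by ordinary least squares."""
    n = len(examples)
    mean_x = sum(x for x, _ in examples) / n
    mean_y = sum(y for _, y in examples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in examples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in examples)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mean_absolute_error(model, examples):
    """Testing: average gap between predictions and known target values."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in examples) / len(examples)

# Synthetic data for illustration only: y is roughly 2x + 1 plus small noise.
rng = random.Random(42)
data = [(x, 2 * x + 1 + rng.uniform(-0.5, 0.5)) for x in range(20)]

train_set, test_set = split(data)
model = train(train_set)
error = mean_absolute_error(model, test_set)
print(f"slope={model[0]:.2f} intercept={model[1]:.2f} test MAE={error:.2f}")
```

If the test error is unsatisfactory, we go back to the earlier steps, change the data preparation or the model, and run the test again. Keeping the test examples out of training is what makes the error a fair estimate of how the model behaves on data it has not seen.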