
Algorithm selection

We need to iterate on the complex problem of creating the algorithm. This entails exploring the data to gain a deep understanding of the underlying variables. Once we have an idea of the kind of algorithm we want to apply, we'll need to prepare the data further, possibly combining it with other data sources (for example, census data). In our example, this could mean creating a song similarity matrix. Once the data is ready, we can train a model to make predictions and test it against holdout data to see how it performs. Several considerations make this process complex:

  • How the data is encoded (for example, how the song matrix is constructed)
  • What algorithm is used (for example, collaborative filtering or content-based filtering)
  • What parameter values your model takes (for example, values for smoothing constants or prior distributions)
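The three choices above can be made concrete with a small sketch. It is illustrative only, not the method used later in the book: the play matrix `train`, the `holdout` mapping, the function names, and the `smoothing` term are all hypothetical. It assumes the data is encoded as a user-by-song play matrix, the algorithm is a simple item-based collaborative filter using cosine similarity, and the only parameter is a smoothing constant added to the denominator.

```python
import numpy as np

def song_similarity(plays, smoothing=0.0):
    """Cosine similarity between the song columns of a user-by-song matrix.

    `smoothing` is a hypothetical constant added to the denominator; larger
    values shrink similarities that rest on very few plays.
    """
    plays = np.asarray(plays, dtype=float)
    norms = np.linalg.norm(plays, axis=0)
    sim = (plays.T @ plays) / (np.outer(norms, norms) + smoothing + 1e-12)
    np.fill_diagonal(sim, 0.0)  # a song should not recommend itself
    return sim

def hit_rate(train, holdout, sim, k=1):
    """Fraction of held-out songs that appear in each user's top-k list."""
    hits = 0
    for user, held_song in holdout.items():
        user_row = train[user]
        scores = sim @ user_row            # sum of similarities to played songs
        scores[user_row > 0] = -np.inf     # skip songs the user already played
        top_k = np.argsort(scores)[::-1][:k]
        hits += int(held_song in top_k)
    return hits / len(holdout)

# Toy training data (made up): rows are users, columns are songs, 1 = played.
train = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
], dtype=float)

# Holdout data (made up): user index -> one song hidden from training.
holdout = {2: 1, 4: 3}

sim = song_similarity(train, smoothing=0.0)
print(hit_rate(train, holdout, sim, k=1))
```

Changing the encoding (for example, play counts instead of 0/1 flags), swapping the similarity measure, or adjusting `smoothing` each gives a different model, and the holdout score is what lets us compare them.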

Our goal in this book is to make this step easier for you by presenting the iterations a data scientist goes through when building a successful model, using real-world applications as examples.
