
Algorithm selection

We need to iterate on the complex problem of creating the algorithm. This entails exploring the data to gain a deep understanding of the underlying variables. Once we have an idea of the kind of algorithm we want to apply, we'll need to prepare the data further, possibly combining it with other data sources (for example, census data). In our example, this could mean creating a song similarity matrix. Once we have the data, we can train a model so that it is capable of making predictions, and then test that model against holdout data to see how it performs. Several considerations make this process complex:

  • How the data is encoded (for example, how the song matrix is constructed)
  • What algorithm is used (for example, collaborative filtering or content-based filtering)
  • What parameter values your model takes (for example, values for smoothing constants or prior distributions)
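
To make these three choices concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the method used later in the book: the play counts are invented toy data, missing entries are treated as zero plays, cosine similarity stands in for whichever similarity measure you choose, and the names `song_similarity_matrix`, `predict`, and `shrinkage` are hypothetical. The `shrinkage` argument plays the role of a smoothing constant.

```python
import math

def cosine_similarity(u, v, shrinkage=0.0):
    """Cosine similarity of two equal-length vectors.

    `shrinkage` (an illustrative smoothing constant) is added to the
    denominator, which damps similarities based on little evidence.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    denom = norm + shrinkage
    return dot / denom if denom else 0.0

def song_similarity_matrix(plays, songs, shrinkage=0.0):
    """Encoding step: turn user -> {song: count} into a song-by-song matrix."""
    users = sorted(plays)
    # One column of play counts per song; a missing entry is assumed to be 0.
    cols = {s: [plays[u].get(s, 0) for u in users] for s in songs}
    return {a: {b: cosine_similarity(cols[a], cols[b], shrinkage) for b in songs}
            for a in songs}

def predict(user_plays, song, sim):
    """Item-based collaborative filtering: similarity-weighted average
    of the play counts the user already has for other songs."""
    pairs = [(sim[song][other], count)
             for other, count in user_plays.items() if other != song]
    weight = sum(abs(s) for s, _ in pairs)
    return sum(s * c for s, c in pairs) / weight if weight else 0.0

def mean_absolute_error(train, holdout, sim):
    """Score predictions against (user, song, actual_count) holdout triples."""
    errors = [abs(predict(train[u], s, sim) - actual) for u, s, actual in holdout]
    return sum(errors) / len(errors)

# Invented toy data: training play counts and a small holdout set.
songs = ["a", "b", "c"]
train = {
    "ann": {"a": 5, "b": 4},
    "bob": {"a": 4, "b": 5, "c": 1},
    "cat": {"a": 1, "c": 5},
}
holdout = [("ann", "c", 1), ("cat", "b", 1)]

sim = song_similarity_matrix(train, songs)
print("holdout MAE:", mean_absolute_error(train, holdout, sim))
```

Changing how `plays` is encoded, swapping `predict` for a content-based model, or tuning `shrinkage` each changes the holdout error, which is why the process tends to be iterative.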

Our goal in this book is to make this step easier for you by walking through the iterations a data scientist goes through when building a successful model, using real-world applications as examples.
