Data gathering

We need to obtain data and organize it appropriately for the problem at hand (in our example, this could mean building a dataset that links users to the songs they've listened to in the past). The size of the data influences which technologies we pick for storing and processing it. For example, it might be fine to train on a local machine using scikit-learn if we're working with a few million records. However, if the data doesn't fit on a single computer, we should consider AWS solutions such as S3 for storage, and Apache Spark or SageMaker's built-in algorithms for model building.
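As a rough illustration of the small-data case, the sketch below groups raw listening events into a per-user history using only the Python standard library. The event tuples, the identifiers, and the `build_listening_history` helper are hypothetical and not taken from the book; a real dataset would be loaded from files or a database rather than defined inline.

```python
from collections import defaultdict

# Hypothetical listening events as (user_id, song_id) pairs.
# In practice these would come from application logs or a database export.
listen_events = [
    ("user_1", "song_a"),
    ("user_2", "song_b"),
    ("user_1", "song_c"),
    ("user_1", "song_a"),  # repeat listen, kept only once below
    ("user_3", "song_b"),
]


def build_listening_history(events):
    """Group events by user, keeping each song once in first-listen order."""
    history = defaultdict(list)
    for user_id, song_id in events:
        if song_id not in history[user_id]:
            history[user_id].append(song_id)
    return dict(history)


history = build_listening_history(listen_events)
for user_id, songs in history.items():
    print(user_id, songs)
```

This in-memory approach only suits data that fits comfortably on one machine. For larger volumes, the same grouping would typically be expressed as a distributed job over data held in storage such as S3.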
