Title: Mastering Machine Learning on AWS
Authors: Dr. Saket S.R. Mengle, Maximo Gurmendez
Chapter word count: 103
Last updated: 2021-06-24 14:23:11

Data gathering

We need to obtain data and organize it appropriately for the problem at hand (in our example, this could mean building a dataset that links users to the songs they have listened to in the past). The size of the data determines which technologies we pick for storing and processing it. For example, if we are working with a few million records, it might be fine to train on a local machine using scikit-learn. However, if the data doesn't fit on a single computer, we must consider AWS solutions such as S3 for storage, and Apache Spark or SageMaker's built-in algorithms for model building.
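
As a rough illustration of the step described above, the following minimal sketch groups raw listening events into a user-to-songs dataset and then picks a processing backend based on record count. Everything named here is an assumption made for illustration: the event tuples, the `build_user_song_counts` and `choose_backend` helpers, and especially the `LOCAL_RECORD_LIMIT` cut-off, which is a placeholder and not a figure from the book. In practice the limit depends on the memory of the machine and the width of each record.

```python
from collections import defaultdict

# Hypothetical raw listening events as (user_id, song_id) pairs.
events = [
    ("u1", "s1"),
    ("u1", "s2"),
    ("u2", "s2"),
    ("u1", "s1"),  # the same user playing the same song again
    ("u3", "s3"),
]


def build_user_song_counts(events):
    """Return {user_id: {song_id: play_count}} from raw events."""
    counts = defaultdict(lambda: defaultdict(int))
    for user_id, song_id in events:
        counts[user_id][song_id] += 1
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {user: dict(songs) for user, songs in counts.items()}


# Assumed cut-off (not from the source): above this many records we
# stop expecting the data to fit comfortably on one machine.
LOCAL_RECORD_LIMIT = 10_000_000


def choose_backend(n_records, limit=LOCAL_RECORD_LIMIT):
    """Pick where to train, based only on the number of records."""
    if n_records <= limit:
        return "local (scikit-learn)"
    return "distributed (S3 + Spark or SageMaker)"


dataset = build_user_song_counts(events)
print(dataset)
print(choose_backend(len(events)))
```

Record count is only a proxy here; a more careful check would estimate the in-memory size of the dataset before deciding.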
