
Preparing to Train a Predictive Model

Here, we will cover the preparation required to train a predictive model. Although not as technically glamorous as training the models themselves, this step should not be taken lightly. It's important to have a good plan in place before getting into the details of building and training a reliable model. Furthermore, once you've decided on a plan, there are technical steps in preparing the data for modeling that should not be overlooked.

We must be careful not to go so deep into the weeds of technical tasks that we lose sight of the goal. Technical tasks are those that require programming skills, such as constructing visualizations, querying databases, and validating predictive models. It's easy to spend hours trying to implement a specific feature or get the plots looking just right. This sort of work certainly benefits our programming skills, but we should not forget to ask ourselves whether it's really worth our time with respect to the current project.

Also, keep in mind that Jupyter Notebooks are particularly well-suited for this step, as we can use them to document our plan, for example, by writing rough notes about the data or a list of models we are interested in training. Before starting to train models, it's good practice to take this a step further and write out a well-structured plan to follow. Not only will this help you stay on track as you build and test the models, but it will also allow others to understand what you're doing when they see your work.

After discussing the planning, we will cover the other step in preparing to train a predictive model: cleaning the dataset. This is another task that Jupyter Notebooks are well-suited for, as they offer an ideal testing ground for performing dataset transformations and keeping track of the exact changes. The data transformations required for cleaning raw data can quickly become intricate and convoluted; therefore, it's important to keep track of your work. As discussed in the first chapter, tools other than Jupyter Notebooks just don't offer very good options for doing this efficiently.
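
To make the idea of keeping track of the exact changes concrete, here is a minimal sketch of what a tracked cleaning step might look like in a notebook cell. It is only an illustration under stated assumptions: the sample records, the `drop_missing` and `drop_duplicates` steps, and the `run_steps` helper are all hypothetical names invented for this example, and plain Python is used to keep it dependency-free (in practice, a library such as pandas would likely be used instead).

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical type aliases for this sketch: a record is one row of data,
# and a step is any function that takes rows and returns cleaned rows.
Record = Dict[str, object]
Step = Callable[[List[Record]], List[Record]]


def drop_missing(rows: List[Record]) -> List[Record]:
    """Remove rows where any field is None."""
    return [r for r in rows if all(v is not None for v in r.values())]


def drop_duplicates(rows: List[Record]) -> List[Record]:
    """Keep only the first occurrence of each identical row."""
    seen = set()
    unique = []
    for r in rows:
        key = tuple(sorted(r.items()))  # hashable fingerprint of the row
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique


def run_steps(
    rows: List[Record], steps: List[Step]
) -> Tuple[List[Record], List[Tuple[str, int, int]]]:
    """Apply each step in order, logging the row count before and after."""
    log = []
    for step in steps:
        before = len(rows)
        rows = step(rows)
        log.append((step.__name__, before, len(rows)))
    return rows, log


# Made-up sample data, for illustration only.
raw = [
    {"age": 34, "city": "Leeds"},
    {"age": None, "city": "York"},
    {"age": 34, "city": "Leeds"},
    {"age": 51, "city": "Bath"},
]

clean, log = run_steps(raw, [drop_missing, drop_duplicates])
for name, before, after in log:
    print(f"{name}: {before} -> {after} rows")
```

Running each transformation through a small helper like this leaves a record of which step changed the data and by how much, which is the kind of bookkeeping that becomes valuable once the cleaning logic grows more intricate.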
