Getting started

Before we discuss the process of tidying data, it is worth pointing out that, whatever you do to tidy your data, you should be sure to:

  1. Create and save your scripts so that you can use them again for new or similar data sources. This is referred to as reusability. Why spend time recreating the same code, rules, or logic if you don't have to? This applies to new data within the same project (that the scripts were developed for) or new projects you may be involved with in the future.
  2. Tidy your data as "far upstream" as possible, perhaps even at the original source. Either way, save and maintain the original data, use programmatic scripts to clean it and fix mistakes, and save the cleaned dataset separately for further analysis.
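
The two points above can be sketched in a few lines of Python. This is only a minimal illustration, not a prescribed method: the function names `tidy_rows` and `tidy_file`, the CSV format, and the specific cleaning rules (trimming whitespace, dropping empty rows) are all assumptions made for the example. The cleaning logic lives in a reusable function, and the cleaned data is written to a new file so that the original is never modified.

```python
import csv
from pathlib import Path


def tidy_rows(rows):
    """Reusable cleaning rules (hypothetical): trim whitespace in every
    cell and skip rows whose cells are all empty."""
    for row in rows:
        cleaned = [cell.strip() for cell in row]
        if any(cleaned):
            yield cleaned


def tidy_file(raw_path, clean_path):
    """Read the raw CSV, apply tidy_rows, and write the result to a
    separate file. The raw file is only ever opened for reading."""
    raw_path, clean_path = Path(raw_path), Path(clean_path)
    if raw_path.resolve() == clean_path.resolve():
        raise ValueError("refusing to overwrite the original data")
    with raw_path.open(newline="", encoding="utf-8") as src, \
            clean_path.open("w", newline="", encoding="utf-8") as dst:
        csv.writer(dst).writerows(tidy_rows(csv.reader(src)))
    return clean_path
```

Because the rules sit in `tidy_rows`, the same script can be rerun unchanged when new data arrives for the same project, or imported into a later project, for example `tidy_file("survey_raw.csv", "survey_clean.csv")` (both file names are placeholders).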