Chapter 1. Preparing the Data

In this chapter, we will cover the basic tasks of reading, storing, and cleaning data using Python and OpenRefine. You will learn the following recipes:

  • Reading and writing CSV/TSV files with Python
  • Reading and writing JSON files with Python
  • Reading and writing Excel files with Python
  • Reading and writing XML files with Python
  • Retrieving HTML pages with pandas
  • Storing and retrieving from a relational database
  • Storing and retrieving from MongoDB
  • Opening and transforming data with OpenRefine
  • Exploring the data with OpenRefine
  • Removing duplicates
  • Using regular expressions and GREL to clean up the data
  • Imputing missing observations
  • Normalizing and standardizing features
  • Binning the observations
  • Encoding categorical variables