Data Munging

We are just getting into the action with data! In this chapter, you'll learn how to munge data. But what does data munging mean?

Mung is a technical term coined about half a century ago by students at the Massachusetts Institute of Technology (MIT). Munging means changing a piece of original data, through a series of well-specified and reversible steps, into a completely different (and hopefully more useful) one. Deep-rooted in hacker culture, munging often goes by other, almost synonymous, names in the data science pipeline, such as data wrangling or data preparation.

With that in mind, this chapter covers the following topics:

  • The data science process (so that you'll know what is going on and what's next)
  • Loading data from a file
  • Selecting the data you need
  • Cleaning up any missing or wrong data
  • Adding, inserting, and deleting data
  • Grouping and transforming data to obtain new and meaningful information
  • Managing to obtain a dataset matrix or an array to feed into the data science pipeline