Summary

In this chapter, we explored common data sources and implemented a web scraping example. Next, we introduced the basic concepts of data scrubbing, such as statistical methods and text parsing. Then we learned how to parse the most widely used text formats with Python. Finally, we presented an introduction to OpenRefine, which is an excellent tool for data cleansing and data formatting. Working with data is not just a matter of code or clicks; we also need to play with the data and follow our intuition to get it into good shape. We need to get involved in the knowledge domain of our data to find inconsistencies. A global view of the data helps us discover what we need to know about it.

In the next chapter, we will explore our data through some visualization techniques and give a quick introduction to D3.js.
