Summary

In this chapter, we explored common data sources and implemented a web scraping example. Next, we introduced the basic concepts of data scrubbing, such as statistical methods and text parsing. Then we learned how to parse the most widely used text formats with Python. Finally, we presented an introduction to OpenRefine, which is an excellent tool for data cleansing and data formatting. Working with data is not just a matter of code or clicks; we also need to play with the data and follow our intuition to get it into great shape. We need to get involved in the knowledge domain of our data to find inconsistencies. A global view of the data helps us discover what we need to know about it.

In the next chapter, we will explore our data through some visualization techniques, and we will present a brief introduction to D3.js.