
The basics of web requests

The amount of data generated worldwide is estimated to double roughly every two years. Even though there is an interdisciplinary field known as data science that is entirely dedicated to the study of data, almost every programming task also involves collecting and analyzing data in some way, and a significant part of that is data collection. However, the data our applications need is not always stored neatly and cleanly in a database; sometimes, we have to collect it from web pages.

For example, web scraping is a data extraction method that automatically makes requests to web pages and downloads specific information. Web scraping allows us to comb through numerous websites and collect any data we need in a systematic and consistent manner—the collected data can be analyzed later on by our applications or simply saved on our computers in various formats. An example of this would be Google, which programs and runs numerous web scrapers of its own to find and index web pages for the search engine.

The Python language itself provides a number of good options for applications of this kind. In this chapter, we will mainly work with the requests module to make client-side web requests from our Python programs. However, before we look into this module in more detail, we need to understand some web terminology so that we can design our applications effectively.
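As a quick preview of what is to come, the following is a minimal sketch of a client-side request made with the requests module; the URL http://www.example.com is simply a placeholder, and we will unpack what actually happens behind such a call throughout this chapter:

import requests

# Send a GET request to a web page (example.com is a placeholder URL)
response = requests.get('http://www.example.com')

# Inspect the result of the request
print(response.status_code)              # HTTP status code, e.g. 200 for success
print(response.headers['Content-Type'])  # one of the headers returned by the server
print(response.text[:200])               # the first 200 characters of the HTML body

Running a snippet like this downloads the raw HTML of the page, which is exactly the kind of response a web scraper would then parse for the specific information it is after.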
