
Avoid making a large number of requests

Each time one of the programs that we have been discussing runs, it makes HTTP requests to the server that hosts the site you'd like to extract data from. In a concurrent program, where multiple requests are submitted to that server at once, this happens far more frequently and over a shorter period of time.

As mentioned before, servers nowadays can handle multiple requests simultaneously with ease. However, to avoid being overworked and exhausting their resources, servers are also designed to stop answering requests that come in too frequently. Websites of big tech companies, such as Amazon or Twitter, look for large numbers of automated requests made from the same IP address and respond in different ways: some requests might be delayed, some might be refused a response, or the IP address might even be banned from making further requests for a specific amount of time.

Interestingly, making repeated, heavy-duty requests to a server is actually a form of attack on a website. In Denial of Service (DoS) and Distributed Denial of Service (DDoS) attacks, a very large number of requests are made to the server at the same time, flooding its bandwidth with traffic. As a result, normal, nonmalicious requests from other clients are denied because the server is busy processing the flood of concurrent requests, as illustrated in the following diagram:

A diagram of a DDoS attack

It is therefore important to space out the concurrent requests that your application makes to a server, so that the application is not mistaken for an attacker and banned or treated as a malicious client. This could be as simple as limiting the maximum number of threads/requests that can be active at a time in your program, or pausing each thread for a specific amount of time (for example, using the time.sleep() function) before it makes a request to the server.
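
The two ideas above (capping the number of simultaneous requests and pausing before each one) can be sketched as follows. This is a minimal illustration rather than a definitive implementation: the helper name throttled_map, its parameters max_workers and delay, and the stand-in fake_fetch function are all assumptions introduced for this example. In a real scraper, the fetch argument would be whatever function actually performs the HTTP request.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def throttled_map(fetch, urls, max_workers=2, delay=0.5):
    """Call fetch(url) for every URL, politely.

    Hypothetical helper for illustration:
    - max_workers caps how many requests can be in flight at once.
    - delay is the pause (in seconds) taken before each request.
    Results are returned in the same order as urls.
    """

    def polite_fetch(url):
        # Pause before each request so calls are spaced out.
        time.sleep(delay)
        return fetch(url)

    # The pool never runs more than max_workers calls concurrently.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(polite_fetch, urls))


if __name__ == "__main__":
    # Stand-in for a real HTTP call; no network traffic is generated.
    def fake_fetch(url):
        return "fetched " + url

    pages = ["page-1", "page-2", "page-3", "page-4"]
    for line in throttled_map(fake_fetch, pages, max_workers=2, delay=0.1):
        print(line)
```

Suitable values for max_workers and delay depend on the site being scraped; a smaller pool and a longer pause trade speed for a lower risk of being throttled or banned.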
