
HTTP requests

In a typical communication process on the web, HTML text is the data to be saved or processed further. This data first has to be collected from web pages, so how do we go about that? Most of this communication takes place over the internet, more specifically the World Wide Web, which uses the Hypertext Transfer Protocol (HTTP). In HTTP, request methods convey what data is being requested and should be sent back by the server.

For example, when you type packtpub.com into your browser, the browser sends an HTTP request to the Packt website's main server, asking for data from the website. If both your internet connection and Packt's server are working, your browser receives a response from the server, as shown in the following diagram. This response takes the form of an HTML document, which your browser interprets and renders on the screen.

Diagram of HTTP communication
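
To make the exchange above more concrete, here is a minimal sketch of what the two messages look like as text. It does not contact any server: the helper names `build_get_request` and `split_response` are hypothetical, and `sample_response` is a made-up reply standing in for whatever a real server might send back.

```python
def build_get_request(host: str, path: str = "/") -> bytes:
    """Compose the bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",   # request line: method, target, version
        f"Host: {host}",          # required header in HTTP/1.1
        "Connection: close",
        "",                       # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def split_response(raw: bytes):
    """Split a raw HTTP response into (status_code, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    status_code = int(status_line.split(" ", 2)[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers, body


# Hypothetical response, invented for illustration only.
sample_response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"\r\n"
    b"<html><body><h1>Hello</h1></body></html>"
)

request_bytes = build_get_request("packtpub.com")
status, headers, body = split_response(sample_response)

print(request_bytes.decode("ascii"))
print(status, headers["content-type"])
print(body.decode("utf-8"))
```

In practice a browser or an HTTP library builds and parses these messages for you; the sketch only shows that a request names a method and a target, and that the response carries a status, headers, and the HTML body.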

Request methods are verbs that indicate the action to be performed when an HTTP client (such as a web browser) and a server communicate: GET, HEAD, POST, PUT, DELETE, and so on. Of these, GET and POST are the two most commonly used in web-scraping applications; their functions are described in the following list:

  • The GET method requests specific data from the server. It is intended only to retrieve data and should have no other effect on the server or its databases.
  • The POST method sends data to the server in a form that the server accepts. This data could be, for example, a message to a bulletin board, mailing list, or newsgroup; information to be submitted to a web form; or an item to be added to a database.
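
The difference between the two methods can be sketched with Python's standard library alone. The example below starts a small local server as a stand-in for a bulletin board and talks to it with `http.client`. The handler class `BoardHandler`, the `messages` list, the `message` form field, and the `send` helper are all hypothetical names chosen for this illustration, not part of any real site.

```python
import http.client
import threading
import urllib.parse
from http.server import BaseHTTPRequestHandler, HTTPServer

messages = []  # in-memory stand-in for the server's database


class BoardHandler(BaseHTTPRequestHandler):
    def _send_text(self, status, text):
        payload = text.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def do_GET(self):
        # GET only reads: it returns the stored messages and changes nothing.
        self._send_text(200, "\n".join(messages))

    def do_POST(self):
        # POST carries data in the request body; here it is a form field.
        length = int(self.headers.get("Content-Length", 0))
        form = urllib.parse.parse_qs(self.rfile.read(length).decode("utf-8"))
        messages.append(form["message"][0])
        self._send_text(201, "stored")

    def log_message(self, format, *args):
        pass  # keep the example's output quiet


server = HTTPServer(("127.0.0.1", 0), BoardHandler)  # port 0: any free port
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()


def send(method, body=None, headers=None):
    """Send one request to the local server; return (status, text)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request(method, "/", body=body, headers=headers or {})
        response = conn.getresponse()
        return response.status, response.read().decode("utf-8")
    finally:
        conn.close()


form_body = urllib.parse.urlencode({"message": "hello board"}).encode("ascii")
post_status, _ = send(
    "POST", form_body, {"Content-Type": "application/x-www-form-urlencoded"}
)
get_status, get_body = send("GET")

server.shutdown()
server.server_close()
print(post_status, get_status, get_body)
```

Here the POST request changes the server's state by adding a message, while the GET request only reads back what is already stored.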

All general-purpose HTTP servers are required to implement at least the GET and HEAD methods, while POST and all other methods are optional.
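
As a rough illustration of this rule, the sketch below runs a local server that implements only GET and HEAD, then sends it three requests. It relies on the behavior of Python's `http.server`, which answers any method without a matching `do_*` handler with status 501 (Not Implemented). The names `ReadOnlyHandler`, `PAGE`, and `send` are hypothetical.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body>Hello</body></html>"


class ReadOnlyHandler(BaseHTTPRequestHandler):
    """Implements GET and HEAD only; there is deliberately no do_POST."""

    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()

    def do_GET(self):
        self._send_headers()
        self.wfile.write(PAGE)

    def do_HEAD(self):
        # Same headers as GET, but no body is written.
        self._send_headers()

    def log_message(self, format, *args):
        pass  # keep the example's output quiet


server = HTTPServer(("127.0.0.1", 0), ReadOnlyHandler)  # port 0: any free port
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()


def send(method):
    """Send one body-less request; return (status, body bytes)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request(method, "/")
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()


results = {}
for method in ("GET", "HEAD", "POST"):
    results[method] = send(method)
    print(method, results[method][0], len(results[method][1]), "body bytes")

server.shutdown()
server.server_close()
```

GET returns the page, HEAD returns the same headers with an empty body, and POST is rejected because this particular server never implemented it.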
