- Python Web Scraping Cookbook
- Michael Heydt
- 2021-06-30 18:43:57
How it works...
We will dive into the details of both Requests and Beautiful Soup in the next chapter, but for now let's summarize a few key points about how this works. The following are the important points about Requests:
- Requests is used to execute HTTP requests. We used it to make a GET request to the URL of the events page.
- The Response object that Requests returns holds the results of the request. This includes not only the page content but also other details about the result, such as the HTTP status code and the headers.
- Requests is used only to get the page; it does not do any parsing.
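The points above can be sketched in a few lines. This is a minimal illustration rather than the recipe's own code: the helper names `fetch` and `summarize_response` and the `timeout` value are assumptions made for this example, and the URL is left as a parameter.

```python
import requests


def fetch(url, timeout=10):
    """Issue a GET request and return the Response object (hypothetical helper)."""
    return requests.get(url, timeout=timeout)


def summarize_response(response):
    """Collect a few of the items a Response carries besides the page content."""
    return {
        "status_code": response.status_code,                       # e.g. 200 on success
        "content_type": response.headers.get("Content-Type"),      # one of the HTTP headers
        "body_length": len(response.text),                         # the raw, unparsed page text
    }
```

Calling `summarize_response(fetch(url))` with the events page URL would return the status code, one header, and the size of the page text. Note that nothing here interprets the HTML; `response.text` is just a string.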
We use Beautiful Soup to parse the HTML and to find content within it.
To understand how this worked, note that the page has the following HTML at the start of the Upcoming Events section:

We used the power of Beautiful Soup to:
- Find the <ul> element representing the section, which is found by looking for a <ul> with a class attribute that has a value of list-recent-events.
- From that object, find all the <li> elements.
Each of these <li> tags represents a different event. We iterate over them, making a dictionary from the event data found in child HTML tags:
- The name is extracted from the <a> tag that is a child of the <h3> tag.
- The location is the text content of the <span> with a class of event-location.
- The time is extracted from the datetime attribute of the <time> tag.
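The steps above can be sketched as follows. This is an illustration under stated assumptions, not the recipe's exact code: `SAMPLE_HTML` is a made-up fragment that only mimics the structure described here (the event names, locations, and dates are invented), and `extract_events` is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment shaped like the structure described above;
# the event data is invented for illustration.
SAMPLE_HTML = """
<ul class="list-recent-events menu">
  <li>
    <h3><a href="/events/1/">Example Conf</a></h3>
    <p>
      <time datetime="2030-01-15T00:00:00+00:00">15 Jan.</time>
      <span class="event-location">Example City</span>
    </p>
  </li>
  <li>
    <h3><a href="/events/2/">Sample Meetup</a></h3>
    <p>
      <time datetime="2030-02-20T00:00:00+00:00">20 Feb.</time>
      <span class="event-location">Sample Town</span>
    </p>
  </li>
</ul>
"""


def extract_events(html):
    """Return one dictionary per <li> inside the list-recent-events <ul>."""
    soup = BeautifulSoup(html, "html.parser")
    # Find the <ul> whose class attribute includes list-recent-events.
    section = soup.find("ul", class_="list-recent-events")
    if section is None:
        return []
    events = []
    # Each <li> is one event; pull the fields from its child tags.
    for item in section.find_all("li"):
        events.append({
            "name": item.find("h3").find("a").get_text(strip=True),
            "location": item.find("span", class_="event-location").get_text(strip=True),
            "time": item.find("time")["datetime"],
        })
    return events
```

With the sample fragment, `extract_events(SAMPLE_HTML)` yields two dictionaries, each with `name`, `location`, and `time` keys. Returning an empty list when the `<ul>` is missing is a design choice made for this sketch, so that a changed page layout does not raise an exception.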