- Python Web Scraping (Second Edition)
- Katharine Jarmul, Richard Lawson
- 2021-07-09 19:42:47
Scraping the Data
In the previous chapter, we built a crawler that follows links to download the web pages we want. This is interesting but not useful: the crawler downloads a web page and then discards the result. Now we need to make this crawler achieve something by extracting data from each web page, a process known as scraping.
We will first cover browser tools for examining a web page, which you may already be familiar with if you have a web development background. Then we will walk through three approaches to extracting data from a web page: regular expressions, Beautiful Soup, and lxml. Finally, the chapter will conclude with a comparison of these three scraping alternatives.
In this chapter, we will cover the following topics:
- Analyzing a web page
- Approaches to scrape a web page
- Using the console
- XPath selectors
- Scraping results
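
As a small preview of the first of the three approaches named above, here is a minimal sketch of regular-expression scraping using only Python's standard `re` module. The HTML fragment, the `row_` id prefix, the `value` class, and the `scrape_values` helper are all assumptions invented for illustration; they are not taken from the book or from any real site.

```python
import re

# Hypothetical HTML fragment, invented for this sketch.
SAMPLE_HTML = """
<table>
<tr id="row_name"><td class="label">Name: </td><td class="value">Example Land</td></tr>
<tr id="row_area"><td class="label">Area: </td><td class="value">1,234 square kilometres</td></tr>
</table>
"""


def scrape_values(html):
    """Return a dict mapping each row's id suffix to the text of its "value" cell."""
    # Non-greedy groups keep each match inside one row; DOTALL lets '.'
    # span line breaks in case a row is split across lines.
    pattern = re.compile(
        r'<tr id="row_(\w+)">.*?<td class="value">(.*?)</td>',
        re.DOTALL,
    )
    return {name: value.strip() for name, value in pattern.findall(html)}


if __name__ == "__main__":
    print(scrape_values(SAMPLE_HTML))
```

A pattern like this depends on the exact attribute order and quoting in the markup, so a small change to the page layout can break it. That fragility is one reason to weigh it against parser-based tools such as Beautiful Soup and lxml.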