- Python Web Scraping (Second Edition)
- Katharine Jarmul, Richard Lawson
- 2021-07-09 19:42:47
Scraping the Data
In the previous chapter, we built a crawler that follows links to download the web pages we want. This is interesting but not useful: the crawler downloads a web page and then discards the result. Now we need to make the crawler achieve something by extracting data from each web page, a process known as scraping.
We will first cover browser tools for examining a web page, which you may already be familiar with if you have a web development background. Then we will walk through three approaches to extracting data from a web page: regular expressions, Beautiful Soup, and lxml. Finally, the chapter concludes with a comparison of these three scraping alternatives.
In this chapter, we will cover the following topics:
- Analyzing a web page
- Approaches to scrape a web page
- Using the console
- XPath selectors
- Scraping results
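As a rough preview of the three approaches named above, the sketch below pulls one table cell out of a small HTML fragment using a regular expression, Beautiful Soup, and lxml in turn. The fragment, the `value` class name, and the function names are all hypothetical and invented for illustration. The Beautiful Soup and lxml functions assume the third-party `beautifulsoup4` and `lxml` packages are installed, and are skipped if they are not.

```python
import re

# Hypothetical HTML fragment, invented for this illustration only.
HTML = """
<table>
  <tr>
    <td class="label">Area: </td>
    <td class="value">244,820 square kilometres</td>
  </tr>
</table>
"""


def scrape_with_regex(html):
    """Match the cell text directly; brittle if the markup changes."""
    match = re.search(r'<td class="value">(.*?)</td>', html)
    return match.group(1) if match else None


def scrape_with_bs4(html):
    """Parse the fragment with Beautiful Soup and look up the cell by class."""
    from bs4 import BeautifulSoup  # third-party: beautifulsoup4
    soup = BeautifulSoup(html, "html.parser")
    cell = soup.find("td", class_="value")
    return cell.get_text() if cell else None


def scrape_with_lxml(html):
    """Parse the fragment with lxml and select the cell text with XPath."""
    from lxml.html import fromstring  # third-party: lxml
    tree = fromstring(html)
    cells = tree.xpath('//td[@class="value"]/text()')
    return cells[0] if cells else None


if __name__ == "__main__":
    for scraper in (scrape_with_regex, scrape_with_bs4, scrape_with_lxml):
        try:
            print(scraper.__name__, "->", scraper(HTML))
        except ImportError:
            # The optional parser library is not installed; skip it.
            print(scraper.__name__, "-> skipped (library not installed)")
```

All three functions target the same cell, so they can be compared directly: the regular expression depends on the exact attribute text, while the two parser-based versions rely on the document structure instead.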