What this book covers

Chapter 1, Introduction to Web Scraping, introduces what web scraping is and how to crawl a website.

Chapter 2, Scraping the Data, shows you how to extract data from webpages using several libraries.

Chapter 3, Caching Downloads, teaches how to avoid re-downloading by caching results.

Chapter 4, Concurrent Downloading, shows you how to scrape data faster by downloading websites in parallel.

Chapter 5, Dynamic Content, explains how to extract data from dynamic websites through several means.

Chapter 6, Interacting with Forms, shows how to work with forms, including inputs and navigation, for search and login.

Chapter 7, Solving CAPTCHA, explains how to access data protected by CAPTCHA images.

Chapter 8, Scrapy, shows how to use Scrapy crawling spiders for fast, parallelized scraping and how to build a web scraper with the Portia web interface.

Chapter 9, Putting It All Together, gives an overview of the web scraping techniques you have learned in this book.
