
What you need for this book

All the code used in this book has been tested with Python 2.7 and is available for download at http://bitbucket.org/wswp/code. Ideally, in a future version of this book, the examples will be ported to Python 3. However, for now, many of the libraries required (such as Scrapy/Twisted, Mechanize, and Ghost) are only available for Python 2. To help illustrate the crawling examples, we created a sample website at http://example.webscraping.com. This website limits how fast you can download content, so if you prefer to host it yourself, the source code and installation instructions are available at http://bitbucket.org/wswp/places.

We decided to build a custom website for many of the examples used in this book instead of scraping live websites, so that we have full control over the environment. This gives us stability: live websites are updated more often than books, so by the time you try a scraping example against one, it may no longer work. Also, a custom website allows us to craft examples that illustrate specific skills and avoid distractions. Finally, a live website might not appreciate our using it to learn about web scraping and might try to block our scrapers. Using our own custom website avoids these risks; however, the skills learnt in these examples can certainly still be applied to live websites.
