What you need for this book

All the code used in this book has been tested with Python 2.7 and is available for download at http://bitbucket.org/wswp/code. Ideally, a future edition of this book will port the examples to Python 3. For now, however, many of the required libraries (such as Scrapy/Twisted, Mechanize, and Ghost) are only available for Python 2. To help illustrate the crawling examples, we created a sample website at http://example.webscraping.com. This website limits how fast you can download content, so if you prefer to host it yourself, the source code and installation instructions are available at http://bitbucket.org/wswp/places.

We decided to build a custom website for many of the examples used in this book instead of scraping live websites, so that we have full control over the environment. This provides us with stability: live websites are updated more often than books, and by the time you try a scraping example, it may no longer work. Also, a custom website allows us to craft examples that illustrate specific skills and avoid distractions. Finally, a live website might not appreciate us using it to learn about web scraping and might try to block our scrapers. Using our own custom website avoids these risks; however, the skills learnt in these examples can certainly still be applied to live websites.
