- Web Scraping with Python
- Richard Lawson
- 2021-07-09 21:28:50
Is web scraping legal?
Web scraping is in an early Wild West stage, where what is permissible is still being established. If the scraped data is for personal use, then in practice there is generally no problem. However, if the data is going to be republished, then the type of data scraped matters.
Several court cases around the world have helped establish what is permissible when scraping a website. In Feist Publications, Inc. v. Rural Telephone Service Co., the United States Supreme Court decided that scraping and republishing facts, such as telephone listings, is allowed. A similar case in Australia, Telstra Corporation Limited v. Phone Directories Company Pty Ltd, demonstrated that only data with an identifiable author can be copyrighted. In Europe, the Danish case ofir.dk vs home.dk, decided by the Maritime and Commercial Court, concluded that regular crawling and deep linking are permissible.
These cases suggest that when the scraped data constitutes facts (such as business locations and telephone listings), it can be republished. However, if the data is original (such as opinions and reviews), it most likely cannot be republished for copyright reasons.
In any case, when you are scraping data from a website, remember that you are a guest of the site and need to behave politely, or its owners may ban your IP address or take legal action. This means that you should make download requests at a reasonable rate and define a user agent that identifies you. The next section on crawling will cover these practices in detail.
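As a rough illustration of these two courtesies, the following is a minimal sketch using only the Python standard library. It is not the implementation developed later in the book: the names `Throttle`, `build_request`, and `download`, the example user agent string, and the delay value are all assumptions chosen for illustration.

```python
import time
import urllib.request
from urllib.parse import urlparse


class Throttle:
    """Pause between successive requests to the same domain (hypothetical helper)."""

    def __init__(self, delay):
        self.delay = delay      # minimum number of seconds between requests to one domain
        self.last_seen = {}     # domain -> monotonic time of the most recent request

    def wait(self, url):
        domain = urlparse(url).netloc
        last = self.last_seen.get(domain)
        if self.delay > 0 and last is not None:
            remaining = self.delay - (time.monotonic() - last)
            if remaining > 0:
                # This domain was contacted recently, so sleep off the remainder
                time.sleep(remaining)
        self.last_seen[domain] = time.monotonic()


def build_request(url, user_agent="example-crawler (admin@example.com)"):
    """Return a Request whose User-Agent header identifies the crawler.

    The default user agent string is a placeholder; use one that names
    your crawler and gives site owners a way to contact you.
    """
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def download(url, throttle, user_agent="example-crawler (admin@example.com)"):
    """Download a URL after waiting out the per-domain delay."""
    throttle.wait(url)
    with urllib.request.urlopen(build_request(url, user_agent)) as response:
        return response.read()
```

A crawler would create one `Throttle` (for example, `Throttle(delay=5)`) and pass it to every `download` call, so that requests to the same domain are spaced out while requests to different domains are not delayed. What counts as a reasonable delay depends on the site; five seconds here is only an assumed starting point.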
Note
You can read more about these legal cases at http://caselaw.lp.findlaw.com/scripts/getcase.pl?court=US&vol=499&invol=340, http://www.austlii.edu.au/au/cases/cth/FCA/2010/44.html, and http://www.bvhd.dk/uploads/tx_mocarticles/S_-_og_Handelsrettens_afg_relse_i_Ofir-sagen.pdf.