官术网_书友最值得收藏!

Is web scraping legal?

Web scraping, and what is legally permissible when web scraping, are still being established despite numerous rulings over the past two decades. If the scraped data is being used for personal and private use, and within fair use of copyright laws, there is usually no problem. However, if the data is going to be republished, if the scraping is aggressive enough to take down the site, or if the content is copyrighted and the scraper violates the terms of service, then there are several legal precedents to note.

In Feist Publications, Inc. v. Rural Telephone Service Co., the United States Supreme Court decided scraping and republishing facts, such as telephone listings, are allowed. A similar case in Australia, Telstra Corporation Limited v. Phone Directories Company Pty Ltd, demonstrated that only data with an identifiable author can be copyrighted. Another scraped content case in the United States, evaluating the reuse of Associated Press stories for an aggregated news product, was ruled a violation of copyright in Associated Press v. Meltwater.  A European Union case in Denmark, ofir.dk vs home.dk, concluded that regular crawling and deep linking is permissible.

There have also been several cases in which companies have charged the plaintiff with aggressive scraping and attempted to stop the scraping via a legal order. The most recent case, QVC v. Resultly, ruled that, unless the scraping resulted in private property damage, it could not be considered intentional harm, despite the crawler activity leading to some site stability issues.

These cases suggest that, when the scraped data constitutes public facts (such as business locations and telephone listings), it can be republished following fair use rules. However, if the data is original (such as opinions and reviews or private user data), it most likely cannot be republished for copyright reasons. In any case, when you are scraping data from a website, remember you are their guest and need to behave politely; otherwise, they may ban your IP address or proceed with legal action. This means you should make download requests at a reasonable rate and define a user agent to identify your crawler. You should also take measures to review the Terms of Service of the site and ensure the data you are taking is not considered private or copyrighted.

If you have doubts or questions, it may be worthwhile to consult a media lawyer regarding the precedents in your area of residence. 

You can read more about these legal cases at the following sites:

主站蜘蛛池模板: 敖汉旗| 中宁县| 腾冲县| 沂源县| 当阳市| 那曲县| 嵊州市| 永顺县| 阳西县| 醴陵市| 建始县| 蕉岭县| 青神县| 蕉岭县| 耒阳市| 鹤山市| 邵东县| 沁阳市| 青州市| 神木县| 富平县| 江北区| 宣城市| 福州市| 中西区| 松滋市| 和平区| 六枝特区| 郁南县| 金门县| 东城区| 浪卡子县| 武隆县| 滨海县| 寿阳县| 吉林省| 奎屯市| 惠安县| 云南省| 台安县| 资阳市|