- Python Social Media Analytics
- Siddhartha Chatterjee Michal Krystyanczuk
- 2021-07-15 17:24:52
Scraping and crawling
Scraping (or web scraping) is a technique for extracting information from websites. When a site offers no API, the only data we can retrieve is what is visible in the HTML of its pages. To do this, we need a scraper that extracts the information we need and structures it in a predefined format. The next step is to build a crawler, a tool that follows links on a website and extracts information from all of its subpages. Before settling on a scraping strategy, we have to check the site's terms and conditions, as some websites do not allow scraping.
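The scraper and crawler roles described above can be sketched with the Python standard library alone. This is a minimal illustration, not a production design: the `PAGES` dictionary is a hypothetical in-memory site standing in for real HTTP responses, and the names `PageScraper`, `scrape`, and `crawl` are invented for this example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical in-memory "website" (URL -> HTML) used instead of real HTTP requests.
PAGES = {
    "http://example.test/": '<html><head><title>Home</title></head>'
                            '<body><a href="/a">A</a> <a href="/b">B</a></body></html>',
    "http://example.test/a": '<html><head><title>Page A</title></head>'
                             '<body><a href="/b">B</a></body></html>',
    "http://example.test/b": '<html><head><title>Page B</title></head>'
                             '<body><a href="/">Home</a></body></html>',
}


class PageScraper(HTMLParser):
    """Collects the page title and every link target from one HTML document."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def scrape(url, html):
    """Scraper: turn one page into a record with a predefined structure."""
    parser = PageScraper()
    parser.feed(html)
    return {
        "url": url,
        "title": parser.title.strip(),
        # Resolve relative links against the page URL.
        "links": [urljoin(url, href) for href in parser.links],
    }


def crawl(start_url, fetch, max_pages=10):
    """Crawler: follow links breadth-first, scraping each page only once."""
    seen = {start_url}
    queue = deque([start_url])
    records = []
    while queue and len(records) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # page not available
            continue
        record = scrape(url, html)
        records.append(record)
        for link in record["links"]:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return records


results = crawl("http://example.test/", PAGES.get)
```

Passing `fetch` as a parameter keeps the crawler independent of how pages are downloaded; for a real site it could be replaced by a function that performs an HTTP request, subject to that site's terms and conditions.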
Python offers very useful tools for building scrapers and crawlers, such as Beautiful Soup (imported as bs4) and Scrapy.
pip3 install bs4 scrapy
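Once the packages are installed, a basic scraper with Beautiful Soup might look like the sketch below. The `HTML` string and the `scrape_links` helper are hypothetical and exist only for this illustration; a real scraper would parse HTML downloaded from the target site.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page; in practice this would be downloaded HTML.
HTML = """
<html><head><title>Sample page</title></head>
<body>
  <h1>Posts</h1>
  <a href="/post/1">First post</a>
  <a href="/post/2">Second post</a>
</body></html>
"""


def scrape_links(html):
    """Extract the page title and all links into a predefined structure."""
    # "html.parser" is Python's built-in parser, so no extra dependency is needed.
    soup = BeautifulSoup(html, "html.parser")
    title = soup.title.get_text(strip=True) if soup.title else ""
    links = [
        {"text": a.get_text(strip=True), "href": a["href"]}
        for a in soup.find_all("a", href=True)  # only <a> tags that have an href
    ]
    return {"title": title, "links": links}


page = scrape_links(HTML)
```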