Book title: Python Data Science Essentials
Authors: Alberto Boschetti, Luca Massaron
Chapter word count: 114
Last updated: 2021-08-13 15:19:34

Beautiful Soup

Beautiful Soup, created by Leonard Richardson, is a great tool for scraping data out of HTML and XML files retrieved from the internet. It works well even on tag soups (hence the name), which are collections of malformed, contradictory, and incorrect tags. After you choose a parser (the HTML parser included in Python's standard library works fine), Beautiful Soup lets you navigate the objects in the page and extract text, tables, and any other information that you may find useful:

Website: http://www.crummy.com/software/BeautifulSoup
Version at the time of print: 4.6.0
Suggested install command: pip install beautifulsoup4

Note that the imported module is named bs4.
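
As a minimal sketch of the workflow described above, the following snippet parses a small, hand-written HTML string with the standard library parser and pulls out the title, a paragraph, and a table. The markup, the class name intro, and the variable names are assumptions made up for illustration; the unclosed <b> tag is there only to mimic a bit of tag soup.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page (not from the book); note the unclosed <b> tag
html = """
<html><head><title>Sample page</title></head>
<body>
<p class="intro">Hello, <b>world</p>
<table>
<tr><td>alpha</td><td>1</td></tr>
<tr><td>beta</td><td>2</td></tr>
</table>
</body></html>
"""

# "html.parser" selects the HTML parser shipped with Python's standard library
soup = BeautifulSoup(html, "html.parser")

# Navigate to a tag by name and read its text
title = soup.title.string

# Find a tag by name and CSS class, then collect all the text inside it
intro = soup.find("p", class_="intro").get_text()

# Extract the table as a list of rows, each row a list of cell strings
rows = [
    [cell.get_text() for cell in row.find_all("td")]
    for row in soup.find_all("tr")
]

print(title)
print(intro)
print(rows)
```

In real use, the HTML string would typically come from a file or from an HTTP response rather than from a literal in the script.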
