Beautiful Soup

Beautiful Soup, created by Leonard Richardson, is a great tool for scraping data out of HTML and XML files retrieved from the internet. It works remarkably well even on tag soup (hence the name), that is, markup made of malformed, contradictory, and incorrect tags. After you choose a parser (the HTML parser included in Python's standard library works fine), Beautiful Soup lets you navigate the objects in the page and extract text, tables, and any other information you find useful.
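As a minimal sketch of that workflow, the snippet below parses a small HTML string with the standard library's parser and pulls out the title, the link targets, and the table cells. The HTML content, the example.com URL, and the variable names are made up for illustration; a real page would normally be downloaded first.

```python
from bs4 import BeautifulSoup

# Hypothetical page content, used here instead of a downloaded file
html = """
<html><head><title>Sample page</title></head>
<body>
<p class="intro">First paragraph with a <a href="https://example.com/a">link</a>.</p>
<table>
<tr><td>1</td><td>alpha</td></tr>
<tr><td>2</td><td>beta</td></tr>
</table>
</body></html>
"""

# "html.parser" selects the parser shipped with Python's standard library
soup = BeautifulSoup(html, "html.parser")

# Navigate to a single element by tag name
title = soup.title.string

# Collect the target of every hyperlink in the page
links = [a["href"] for a in soup.find_all("a")]

# Turn each table row into a list of cell texts
rows = [
    [td.get_text(strip=True) for td in tr.find_all("td")]
    for tr in soup.find_all("tr")
]

print(title)
print(links)
print(rows)
```

Other parsers, such as lxml, can be selected by changing the second argument, provided they are installed.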

Note that the imported module is named bs4.