Getting ready

We will use the planets data page and convert its data into CSV and JSON files. Let's start by loading the planets data from the page into a list of Python dictionary objects. The following code (found in 03/get_planet_data.py) provides a function that performs this task and is reused throughout the chapter:

import requests
from bs4 import BeautifulSoup

def get_planet_data():
    # Download the page and parse it with the lxml parser
    html = requests.get("http://localhost:8080/planets.html").text
    soup = BeautifulSoup(html, "lxml")

    # Each planet is a table row with the CSS class "planet"
    planet_trs = soup.html.body.div.table.find_all("tr", {"class": "planet"})

    def to_dict(tr):
        # Column 0 is skipped; columns 1-5 hold the fields we keep
        tds = tr.find_all("td")
        planet_data = dict()
        planet_data['Name'] = tds[1].text.strip()
        planet_data['Mass'] = tds[2].text.strip()
        planet_data['Radius'] = tds[3].text.strip()
        planet_data['Description'] = tds[4].text.strip()
        planet_data['MoreInfo'] = tds[5].find_all("a")[0]["href"].strip()
        return planet_data

    planets = [to_dict(tr) for tr in planet_trs]

    return planets

if __name__ == "__main__":
    print(get_planet_data())

Running the script gives the following output (briefly truncated):

03 $ python get_planet_data.py
[{'Name': 'Mercury', 'Mass': '0.330', 'Radius': '4879', 'Description': 'Named Mercurius by the Romans because it appears to move so swiftly.', 'MoreInfo': 'https://en.wikipedia.org/wiki/Mercury_(planet)'}, {'Name': 'Venus', 'Mass': '4.87', 'Radius': '12104', 'Description': 'Roman name for the goddess of love. This planet was considered to be the brightest and most beautiful planet or star in the\r\n heavens. Other civilizations have named it for their god or goddess of love/war.', 'MoreInfo': 'https://en.wikipedia.org/wiki/Venus'}, {'Name': 'Earth', 'Mass': '5.97', 'Radius': '12756', 'Description': "The name Earth comes from the Indo-European base 'er,'which produced the Germanic noun 'ertho,' and ultimately German 'erde,'\r\n Dutch 'aarde,' Scandinavian 'jord,' and English 'earth.' Related forms include Greek 'eraze,' meaning\r\n 'on the ground,' and Welsh 'erw,' meaning 'a piece of land.'", 'MoreInfo': 'https://en.wikipedia.org/wiki/Earth'}, {'Name': 'Mars', 'Mass': '0.642', 'Radius': '6792', 'Description': 'Named by the Romans for their god of war because of its red, bloodlike color. Other civilizations also named this planet\r\n from this attribute; for example, the Egyptians named it "Her Desher," meaning "the red one."', 'MoreInfo':
...
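
Because every planet is a plain dictionary with the same keys, a list like this can be handed to the standard csv and json modules almost as-is. The following is only a minimal sketch of that idea, not the chapter's own code: the two records are copied from the output above (with Description left out for brevity), and the helper names planets_to_csv and planets_to_json are hypothetical.

```python
import csv
import io
import json

# Two records copied from the sample output above; the Description
# field is omitted here only to keep the example short.
SAMPLE_PLANETS = [
    {"Name": "Mercury", "Mass": "0.330", "Radius": "4879",
     "MoreInfo": "https://en.wikipedia.org/wiki/Mercury_(planet)"},
    {"Name": "Venus", "Mass": "4.87", "Radius": "12104",
     "MoreInfo": "https://en.wikipedia.org/wiki/Venus"},
]


def planets_to_csv(planets):
    """Hypothetical helper: render a list of dicts as CSV text."""
    buffer = io.StringIO()
    # The keys of the first record become the CSV header row.
    writer = csv.DictWriter(buffer, fieldnames=list(planets[0].keys()))
    writer.writeheader()
    writer.writerows(planets)
    return buffer.getvalue()


def planets_to_json(planets):
    """Hypothetical helper: render a list of dicts as indented JSON text."""
    return json.dumps(planets, indent=2)


if __name__ == "__main__":
    print(planets_to_csv(SAMPLE_PLANETS))
    print(planets_to_json(SAMPLE_PLANETS))
```

In practice, the list returned by get_planet_data() could be passed in place of SAMPLE_PLANETS, and the text written to a file instead of printed.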

The csv and json modules are part of the Python standard library, so they need no installation. pandas may need to be installed, which you can do with the following command:

pip install pandas
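
To see which of these modules are already importable before installing anything, a quick check along the following lines may help. This is a small sketch using importlib from the standard library; the helper name module_available is hypothetical. Since csv and json ship with Python, they should report as available, while pandas depends on your environment.

```python
import importlib.util


def module_available(name):
    """Return True if the top-level module `name` can be found for import."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    for name in ("csv", "json", "pandas"):
        status = "available" if module_available(name) else "not installed"
        print(name + ": " + status)
```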