Data cleansing

Let's create a CSV file that contains only the required fields, using the following steps:

1. Import the csv package:

import csv

2. Create a CSV file with only the required attributes:

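# newline='' stops the csv module from writing extra blank lines on some platforms
with open('mailbox.csv', 'w', newline='', encoding='utf-8') as outputfile:
    writer = csv.writer(outputfile)
    # Header row: column names for the fields we keep
    writer.writerow(['subject', 'from', 'date', 'to', 'label', 'thread'])

    # mbox is the mailbox object loaded earlier; write one row per message
    for message in mbox:
        writer.writerow([
            message['subject'],
            message['from'],
            message['date'],
            message['to'],
            message['X-Gmail-Labels'],
            message['X-GM-THRID']
        ])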

The preceding code writes a CSV file named mailbox.csv. From here on, we can load this CSV file instead of the mbox file; it is smaller than the original dataset because it keeps only the selected fields.
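To check the export end to end, the steps above can be wrapped in a function and run against a tiny throwaway mailbox. The following is only a minimal sketch: the function names export_mbox_to_csv and load_mailbox_csv, the sample message, and the temporary file paths are illustrative assumptions, not part of the original dataset. It relies only on the standard csv and mailbox modules.

```python
import csv
import mailbox
import os
import tempfile

# Headers read from each message, and the column names used in the CSV.
FIELDS = ['subject', 'from', 'date', 'to', 'X-Gmail-Labels', 'X-GM-THRID']
COLUMNS = ['subject', 'from', 'date', 'to', 'label', 'thread']


def export_mbox_to_csv(mbox, csv_path):
    """Write one CSV row per message, keeping only the selected headers."""
    with open(csv_path, 'w', newline='', encoding='utf-8') as outputfile:
        writer = csv.writer(outputfile)
        writer.writerow(COLUMNS)
        for message in mbox:
            # A missing header is returned as None, which csv writes as ''.
            writer.writerow([message[name] for name in FIELDS])


def load_mailbox_csv(csv_path):
    """Read the CSV back as a list of dicts keyed by column name."""
    with open(csv_path, newline='', encoding='utf-8') as csvfile:
        return list(csv.DictReader(csvfile))


# Build a one-message mbox with made-up data (hypothetical sample).
workdir = tempfile.mkdtemp()
mbox_path = os.path.join(workdir, 'sample.mbox')
csv_path = os.path.join(workdir, 'mailbox.csv')

raw_message = (
    "Subject: Hello\n"
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Date: Mon, 01 Jan 2018 10:00:00 +0000\n"
    "X-Gmail-Labels: Inbox\n"
    "\n"
    "Body text\n"
)

sample_box = mailbox.mbox(mbox_path)
sample_box.add(raw_message)
sample_box.flush()

export_mbox_to_csv(sample_box, csv_path)
sample_box.close()

records = load_mailbox_csv(csv_path)
print(records[0]['subject'], records[0]['label'])
```

The sample message has no X-GM-THRID header, so its thread column comes back as an empty string. Real exports may contain similar gaps, which are worth checking for before any analysis.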
