
Data cleansing

Let's create a CSV file that contains only the required fields, using the following steps:

1. Import the csv package:

import csv

2. Create a CSV file with only the required attributes:

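# newline='' stops the csv module from writing blank lines on Windows
with open('mailbox.csv', 'w', newline='', encoding='utf-8') as outputfile:
    writer = csv.writer(outputfile)
    # Header row: names of the columns we keep
    writer.writerow(['subject', 'from', 'date', 'to', 'label', 'thread'])

    # One row per message; mbox is the mailbox object loaded earlier
    for message in mbox:
        writer.writerow([
            message['subject'],
            message['from'],
            message['date'],
            message['to'],
            message['X-Gmail-Labels'],
            message['X-GM-THRID']
        ])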

The preceding code writes a CSV file named mailbox.csv. From here on, we can load this CSV file instead of the mbox file; because it keeps only six fields per message, it will be smaller than the original dataset.
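
To check the export end to end without a real Gmail archive, the steps above can be sketched as a small self-contained script. This is only an illustration under stated assumptions: the helper names mbox_to_csv, load_rows, and demo are hypothetical, the sample message and its addresses are made up, and the CSV is read back with the standard library's csv.DictReader rather than whatever loader the later steps use.

```python
import csv
import mailbox
import os
import tempfile

# Message headers to keep, and the CSV column names they map to.
FIELDS = ['subject', 'from', 'date', 'to', 'X-Gmail-Labels', 'X-GM-THRID']
COLUMNS = ['subject', 'from', 'date', 'to', 'label', 'thread']


def mbox_to_csv(mbox_path, csv_path):
    """Write one CSV row per message, keeping only the headers in FIELDS."""
    mbox = mailbox.mbox(mbox_path)
    try:
        with open(csv_path, 'w', newline='', encoding='utf-8') as outputfile:
            writer = csv.writer(outputfile)
            writer.writerow(COLUMNS)
            for message in mbox:
                # A missing header is returned as None,
                # which csv.writer emits as an empty field.
                writer.writerow([message[name] for name in FIELDS])
    finally:
        mbox.close()


def load_rows(csv_path):
    """Read the CSV back as a list of dicts keyed by column name."""
    with open(csv_path, newline='', encoding='utf-8') as inputfile:
        return list(csv.DictReader(inputfile))


def demo():
    """Build a one-message mbox (made-up data), export it, and reload it."""
    with tempfile.TemporaryDirectory() as tmp:
        mbox_path = os.path.join(tmp, 'sample.mbox')
        csv_path = os.path.join(tmp, 'mailbox.csv')

        box = mailbox.mbox(mbox_path)
        msg = mailbox.mboxMessage()
        msg['Subject'] = 'Hello'
        msg['From'] = 'alice@example.com'
        msg['To'] = 'bob@example.com'
        msg['Date'] = 'Mon, 01 Jan 2024 10:00:00 +0000'
        msg['X-Gmail-Labels'] = 'Inbox'
        # X-GM-THRID is left out on purpose to show the empty-field case.
        msg.set_payload('Body text')
        box.add(msg)
        box.close()

        mbox_to_csv(mbox_path, csv_path)
        return load_rows(csv_path)


if __name__ == '__main__':
    for row in demo():
        print(row)
```

Messages that lack one of the headers (for example, mail without Gmail labels) still produce a row; the corresponding column is simply empty, which is worth keeping in mind when the CSV is loaded for analysis.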
