Loading the dataset

First, download the dataset by following the steps in the Technical requirements section. Gmail (https://takeout.google.com/settings/takeout) exports data in mbox format. For this chapter, I loaded my own personal email from Google Mail. For privacy reasons, I cannot share the dataset. However, I will show you different EDA operations that you can perform to analyze several aspects of your own email behavior.

1. Let's load the required libraries:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
Note that this analysis also uses the mailbox module. It is part of the Python standard library, so it does not need to be installed separately with pip.

2. When you have loaded the libraries, load the dataset:

import mailbox

mboxfile = "PATH TO DOWNLOADED MBOX FILE"
mbox = mailbox.mbox(mboxfile)
mbox

Note that it is essential that you replace the mbox file path with your own path.

The output of the preceding code is as follows:

<mailbox.mbox at 0x7f124763f5c0>

The output indicates that the mailbox has been successfully created.
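
One caveat worth sketching: by default, mailbox.mbox creates a new empty file when the given path does not exist, so a mistyped path still produces a mailbox object. The following minimal sketch, which is not part of the original walkthrough, passes create=False so that a wrong path raises an error instead. The open_mbox helper, the temporary sample file, and the example.com addresses are all hypothetical and only there to make the snippet runnable without real email data.

```python
import mailbox
import os
import tempfile


def open_mbox(path):
    """Open an existing mbox file (hypothetical helper for this sketch)."""
    # create=False stops mailbox.mbox from silently creating an empty file
    # when the path is wrong; it raises NoSuchMailboxError instead.
    return mailbox.mbox(path, create=False)


# Build a tiny throwaway mbox file so the sketch runs without real data.
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "sample.mbox")
with open(sample_path, "w") as f:
    f.write(
        "From alice@example.com Mon Jan  1 00:00:00 2024\n"
        "From: alice@example.com\n"
        "To: bob@example.com\n"
        "Subject: Hello\n"
        "\n"
        "Body text\n"
    )

sample_box = open_mbox(sample_path)
message_count = len(sample_box)  # number of messages found in the file
sample_box.close()

# A path that does not exist is now reported rather than created.
try:
    open_mbox(os.path.join(tmp_dir, "missing.mbox"))
    missing_detected = False
except mailbox.NoSuchMailboxError:
    missing_detected = True

print(message_count, missing_detected)
```

Checking len(mbox) right after loading your own file is a quick way to confirm that the messages were actually read.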

3. Next, let's see the list of available keys:

for key in mbox[0].keys():
    print(key)

The output of the preceding code is as follows:

X-GM-THRID
X-Gmail-Labels
Delivered-To
Received
X-Google-Smtp-Source
X-Received
ARC-Seal
ARC-Message-Signature
ARC-Authentication-Results
Return-Path
Received
Received-SPF
Authentication-Results
DKIM-Signature
DKIM-Signature
Subject
From
To
Reply-To
Date
MIME-Version
Content-Type
X-Mailer
X-Complaints-To
X-Feedback-ID
List-Unsubscribe
Message-ID

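The preceding output shows the header keys present in the first message of the extracted dataset. Some keys, such as Received and DKIM-Signature, appear more than once because a message can carry several headers with the same name.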
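
Once the keys are known, individual header values can be read from a message by name. The sketch below is an illustration under stated assumptions and not the author's code: it builds a small hypothetical mbox file with made-up example.com addresses, then reads a single header by key and collects a repeated header with get_all.

```python
import mailbox
import os
import tempfile

# Hypothetical sample file with a repeated Received header.
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "headers.mbox")
with open(sample_path, "w") as f:
    f.write(
        "From alice@example.com Mon Jan  1 00:00:00 2024\n"
        "Received: by a.example.com\n"
        "Received: by b.example.com\n"
        "From: alice@example.com\n"
        "To: bob@example.com\n"
        "Subject: Hello\n"
        "\n"
        "Body text\n"
    )

box = mailbox.mbox(sample_path, create=False)
first = box[0]  # mbox messages are keyed by integers starting at 0

subject = first["Subject"]                # value of a single header
received_all = first.get_all("Received")  # list with every Received value
key_list = first.keys()                   # includes duplicates, in file order
box.close()

print(subject)
print(received_all)
print(key_list)
```

Indexing by name returns only one value for a repeated header, so get_all is the safer choice for keys such as Received that occur several times.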
