
Loading the dataset

First, download the dataset by following the steps in the Technical requirements section. Gmail provides data in mbox format through Google Takeout (https://takeout.google.com/settings/takeout). For this chapter, I loaded my own personal email from Google Mail. For privacy reasons, I cannot share the dataset. However, I will show you different EDA operations that you can perform to analyze several aspects of your own email behavior:

1. Let's load the required libraries:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
Note that this analysis uses the mailbox module. It is part of the Python standard library, so it is available without a separate pip install step.

2. When you have loaded the libraries, load the dataset:

import mailbox

mboxfile = "PATH TO DOWNLOADED MBOX FILE"
mbox = mailbox.mbox(mboxfile)
mbox

Note that it is essential that you replace the mbox file path with your own path.

The output of the preceding code is as follows:

<mailbox.mbox at 0x7f124763f5c0>

The output indicates that the mailbox has been successfully created.
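One caveat worth knowing: by default, mailbox.mbox is called with create=True, so a mistyped path silently produces a new, empty mailbox rather than an error. The following is a minimal sketch of a stricter loader. The open_mbox helper and the throwaway demo mailbox are illustrative assumptions, not part of the original analysis:

```python
import mailbox
import os
import tempfile


def open_mbox(path):
    """Open an existing mbox file (hypothetical helper).

    create=False makes mailbox.mbox raise NoSuchMailboxError for a
    missing file instead of silently creating an empty mailbox.
    """
    return mailbox.mbox(path, create=False)


# Build a tiny throwaway mailbox so the sketch runs without real email.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "demo.mbox")
demo = mailbox.mbox(demo_path)  # create=True by default, so the file is created
msg = mailbox.mboxMessage()
msg["From"] = "sender@example.com"
msg["Subject"] = "Hello"
msg.set_payload("Test body\n")
demo.add(msg)
demo.flush()
demo.close()

# Opening the existing file works, and len() reports the message count.
mbox_demo = open_mbox(demo_path)
n_messages = len(mbox_demo)
print(n_messages)

# A wrong path now fails loudly.
try:
    open_mbox(os.path.join(demo_dir, "missing.mbox"))
    missing_raises = False
except mailbox.NoSuchMailboxError:
    missing_raises = True
print(missing_raises)
```

Checking len(mbox) right after loading your own file is a quick way to confirm that the path pointed at real data.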

3. Next, let's see the list of available keys:

for key in mbox[0].keys():
    print(key)

The output of the preceding code is as follows:

X-GM-THRID
X-Gmail-Labels
Delivered-To
Received
X-Google-Smtp-Source
X-Received
ARC-Seal
ARC-Message-Signature
ARC-Authentication-Results
Return-Path
Received
Received-SPF
Authentication-Results
DKIM-Signature
DKIM-Signature
Subject
From
To
Reply-To
Date
MIME-Version
Content-Type
X-Mailer
X-Complaints-To
X-Feedback-ID
List-Unsubscribe
Message-ID

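The preceding output shows the header keys that are present in the first message of the extracted dataset. Note that some keys, such as Received and DKIM-Signature, appear more than once.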
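Each key can be used to look up the corresponding header value. The following sketch uses a small hand-written message parsed with the standard email package rather than real mailbox data. The header values are made up for illustration. Messages returned by mailbox.mbox subclass email.message.Message, so the same lookups should apply to mbox[0]:

```python
import email

# A hand-written message with a repeated header (illustrative values only).
raw = (
    "Received: from a.example by b.example\n"
    "Received: from c.example by a.example\n"
    "Subject: Hello\n"
    "From: sender@example.com\n"
    "\n"
    "Body\n"
)
message = email.message_from_string(raw)

# Dictionary-style access returns one value for the given key.
subject = message["Subject"]

# get_all returns every value when a header is repeated.
received = message.get_all("Received")

# get accepts a default for headers that are absent.
mailer = message.get("X-Mailer", "unknown")

print(subject)
print(len(received))
print(mailer)
```

For repeated headers, dictionary-style access gives only one of the values, so get_all is the safer choice when every occurrence matters.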
