Loading the dataset

First, download the dataset by following the steps in the Technical requirements section. Gmail (https://takeout.google.com/settings/takeout) exports data in mbox format. For this chapter, I loaded my own personal email from Google Mail. For privacy reasons, I cannot share the dataset. However, I will show you different EDA operations that you can perform to analyze several aspects of your own email behavior.

1. Let's load the required libraries:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
Note that this analysis also uses the mailbox module. It is part of the Python standard library, so it does not need to be installed separately with pip.

2. When you have loaded the libraries, load the dataset:

import mailbox

mboxfile = "PATH TO DOWNLOADED MBOX FILE"
mbox = mailbox.mbox(mboxfile)
mbox

Note that it is essential that you replace the mbox file path with your own path.

The output of the preceding code is as follows:

<mailbox.mbox at 0x7f124763f5c0>

The output indicates that the mailbox has been successfully created.
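Note that, by default, mailbox.mbox creates a new empty file when the given path does not exist, so the object is returned even if the path is mistyped. The following is a minimal sketch of a stricter way to open the file, assuming you would rather get an error for a wrong path. The helper name open_mbox and the throwaway sample file are hypothetical and only serve the illustration:

```python
import mailbox
import os
import tempfile


def open_mbox(path):
    """Open an existing mbox file (hypothetical helper, not from the chapter)."""
    # create=False makes mailbox.mbox raise NoSuchMailboxError for a
    # missing path instead of silently creating an empty mailbox.
    return mailbox.mbox(path, create=False)


# Build a throwaway mbox file with one made-up message for the demonstration.
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "sample.mbox")
sample = mailbox.mbox(sample_path)  # create=True is the default
msg = mailbox.mboxMessage()
msg["From"] = "alice@example.com"
msg["Subject"] = "Hello"
msg.set_payload("Test body\n")
sample.add(msg)
sample.flush()
sample.close()

# Open the file strictly and count the messages as a sanity check.
mbox_check = open_mbox(sample_path)
message_count = len(mbox_check)
print(message_count)
```

With your own export, a message count of zero is a hint that the path points to the wrong file.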

3. Next, let's see the list of available keys:

for key in mbox[0].keys():
    print(key)

The output of the preceding code is as follows:

X-GM-THRID
X-Gmail-Labels
Delivered-To
Received
X-Google-Smtp-Source
X-Received
ARC-Seal
ARC-Message-Signature
ARC-Authentication-Results
Return-Path
Received
Received-SPF
Authentication-Results
DKIM-Signature
DKIM-Signature
Subject
From
To
Reply-To
Date
MIME-Version
Content-Type
X-Mailer
X-Complaints-To
X-Feedback-ID
List-Unsubscribe
Message-ID

The preceding output lists the header keys present in the first message of the extracted dataset. Some keys, such as Received and DKIM-Signature, appear more than once.
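Once the keys are known, individual header values can be read from a message. The following is a small sketch of how that might look. It uses a made-up message parsed with the standard email module rather than a real mailbox. Messages returned by mailbox.mbox are subclasses of email.message.Message, so the same calls should apply to mbox[0]. The sample headers and the variable names are assumptions for illustration only:

```python
import email

# A made-up message with a repeated Received header (illustrative data only).
raw = (
    "Received: by mail.example.com; Mon, 01 Jan 2024 10:00:00 +0000\n"
    "Received: from sender.example.org; Mon, 01 Jan 2024 09:59:58 +0000\n"
    "From: Alice <alice@example.com>\n"
    "To: bob@example.com\n"
    "Subject: Hello\n"
    "Date: Mon, 01 Jan 2024 09:59:00 +0000\n"
    "\n"
    "Body text\n"
)
message = email.message_from_string(raw)

# Indexing by key returns the first matching header, or None if it is absent.
subject = message["Subject"]
missing = message["X-Mailer"]

# get_all returns every value of a repeated header as a list.
received_hops = message.get_all("Received", [])

# Collect a few headers of interest into a plain dictionary.
wanted = ["Subject", "From", "To", "Date"]
record = {name: message[name] for name in wanted}
print(record)
```

A list of such dictionaries, one per message, can be passed to pd.DataFrame to obtain a table for further analysis.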