- Hands-On Exploratory Data Analysis with Python
- Suresh Kumar Mukhiya, Usman Ahmed
- 2021-06-24 16:44:58
Most frequently used words
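One of the easiest things to analyze about your emails is which words you use most often. A word cloud shows this at a glance, because it draws each word at a size proportional to how often it occurs. Let's first filter out the automated arXiv emails and join the subject lines into a single string: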
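import matplotlib.pyplot as plt
from wordcloud import WordCloud
# Drop the automated arXiv emails (df_no_arxiv is not used again in this section)
df_no_arxiv = dfs[dfs['from'] != 'no-reply@arXiv.org']
# Join every subject in `sent` into one string; str() guards against non-string values
text = ' '.join(map(str, sent['subject'].values))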
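The map(str, ...) call in the join matters because a subject column can contain missing values, and str.join() raises a TypeError on anything that is not a string. The following minimal sketch illustrates this with a small hypothetical list standing in for sent['subject'].values (the sample subjects and the names text_all and text_clean are assumptions, not part of the chapter). Note that missing values survive as the literal words None or nan unless they are filtered out first:

```python
# Hypothetical subjects, including two missing values of the kind pandas can produce.
subjects = ["Re: Hello", None, "Project update", float("nan")]

# ' '.join(subjects) would raise TypeError here, because of the non-string items.
# Converting with str() avoids the error but keeps 'None' and 'nan' as words.
text_all = ' '.join(map(str, subjects))
print(text_all)

# Keeping only real strings removes those placeholder words from the text.
text_clean = ' '.join(s for s in subjects if isinstance(s, str))
print(text_clean)
```

If words such as nan show up in your own word cloud, dropping the missing subjects before the join is one way to get rid of them.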
Next, let's plot the word cloud:
stopwords = ['Re', 'Fwd', '3A_']
wrd = WordCloud(width=700, height=480, margin=0, collocations=False)
for sw in stopwords:
    wrd.stopwords.add(sw)
wordcloud = wrd.generate(text)
plt.figure(figsize=(25,15))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.margins(x=0, y=0)
I added a few extra stop words (Re, Fwd, and 3A_) so that reply and forward prefixes are filtered out of the graph. The output for me is as follows:
This tells me what I mostly communicate about. In my emails from 2011 to 2019, the most frequently used words are new, site, project, Data, WordPress, and website. This is really good, right? What is presented in this chapter is just a starting point. You can take the analysis further in several other directions.
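One such direction is to back the picture with numbers. The following is a minimal sketch, not the chapter's method, that counts words with collections.Counter from the standard library. The sample subjects, the EXTRA_STOPWORDS set, and the top_words helper are all hypothetical, and the tokenization is a simple regular expression, so the counts will not match WordCloud's own processing exactly:

```python
import re
from collections import Counter

# Hypothetical sample subjects; in the chapter these would come from sent['subject'].
subjects = [
    "Re: New project site",
    "Fwd: WordPress website update",
    "New website for the Data project",
]

# Stop words are compared in lowercase (an assumption of this sketch).
EXTRA_STOPWORDS = {"re", "fwd", "3a_", "the", "for"}

def top_words(texts, stopwords, n=5):
    """Return the n most common lowercase words in texts, skipping stopwords."""
    counts = Counter()
    for t in texts:
        # \w+ matches runs of letters, digits, and underscores.
        for word in re.findall(r"\w+", str(t)):
            word = word.lower()
            if word not in stopwords:
                counts[word] += 1
    return counts.most_common(n)

result = top_words(subjects, EXTRA_STOPWORDS, n=3)
print(result)
```

Unlike the word cloud, this version folds Data and data into one entry because it lowercases every word. Whether that is desirable depends on what you want to learn from your emails.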