- Go Machine Learning Projects
- Xuanyi Chew
- 186 words
- 2021-06-10 18:46:38
The project
What we want to do is simple: given an email, decide whether it is kosher (which we call ham) or spam. We will be using the LingSpam corpus. The emails in that corpus are a little dated, as spammers update their techniques and vocabulary all the time. However, I chose LingSpam for a good reason: it is already nicely preprocessed. The original scope of this chapter included an introduction to preprocessing emails; however, preprocessing natural language is a subject large enough to fill an entire book, so we will use a dataset that has already been preprocessed. This allows us to focus on the mechanics of a very elegant algorithm.
Fear not, though: I will walk through the basics of preprocessing. Be warned, however, that the complexity climbs very steeply, so be prepared to be sucked into a black hole of many hours spent preprocessing natural language. At the end of this chapter, I will also recommend some libraries that are useful for preprocessing.
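To give a rough feel for what the most basic preprocessing step looks like, here is a minimal sketch that lowercases a piece of text and splits it into word tokens. It is an illustration only, not the code used in this chapter: the function name `tokenize` and the sample subject line are assumptions made up for the example, and real preprocessing (stemming, stop-word removal, handling of headers and markup) goes well beyond this.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// tokenize is a hypothetical helper for illustration. It lowercases the
// input and splits it on every rune that is neither a letter nor a digit,
// so punctuation and whitespace are discarded.
func tokenize(text string) []string {
	isSeparator := func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	}
	return strings.FieldsFunc(strings.ToLower(text), isSeparator)
}

func main() {
	// A made-up subject line, not taken from the LingSpam corpus.
	fmt.Println(tokenize("Subject: FREE money!!! Click here"))
	// prints: [subject free money click here]
}
```

Even this tiny step involves decisions (should digits count as words? should "FREE" and "free" be the same token?), which hints at why preprocessing grows complicated so quickly.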