
What you need for this book

To complete the projects in this book, you will need Python 3.5 or higher. I recommend Anaconda Python, but any up-to-date Python distribution will do as long as it includes the following packages: NumPy, Matplotlib, NetworkX, PyMySQL, Gensim, and NLTK. In Chapter 1, Expanding Your Data Mining Toolbox, we will walk through an easy installation of Python and all of these libraries, and each time a library is used later in the book, we will install or upgrade it together.
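If you would like to check your setup before starting, the requirements above can be verified with a short script. This is only a minimal sketch and is not taken from the book: the helper names `missing_packages` and `python_is_recent_enough` are made up for illustration, and the list assumes the usual lowercase import names for the six packages (for example, PyMySQL is imported as `pymysql`).

```python
import importlib.util
import sys

# Assumed import names for the packages listed above. The import name can
# differ from the project name (PyMySQL is imported as "pymysql").
REQUIRED = ["numpy", "matplotlib", "networkx", "pymysql", "gensim", "nltk"]


def python_is_recent_enough(minimum=(3, 5)):
    """Return True if the running interpreter is at least `minimum`."""
    return sys.version_info >= minimum


def missing_packages(names):
    """Return the names that cannot be located by this interpreter."""
    # find_spec looks a top-level module up without importing it and
    # returns None when the module cannot be found.
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Python 3.5 or higher:", python_is_recent_enough())
print("Missing packages:", missing_packages(REQUIRED) or "none")
```

Anything reported as missing can be installed when we reach the chapter that first uses it.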

Because data mining is data-centric, and because the data sets we work with are sometimes large or require some type of persistent storage, I chose to implement some of the data mining algorithms alongside a relational database system. I chose MySQL because it is an established piece of infrastructure that is easy to download and install. MySQL comes into play with the memory-intensive algorithms in Chapter 2, Association Rule Mining, and Chapter 3, Entity Matching. I also use MySQL for some of the examples in Chapter 9, Mining for Data Anomalies, but it is possible to work through that chapter without it.
