- Python Machine Learning By Example
- Yuxi (Hayden) Liu
- 2021-07-02 22:57:20
Installing software and setting up
For most projects in this book we need scikit-learn (see http://scikit-learn.org/stable/install.html) and matplotlib (see http://matplotlib.org/users/installing.html). Both packages require NumPy, and we also need SciPy for sparse matrices, as mentioned before. The scikit-learn library is a machine learning package optimized for performance: much of its code runs almost as fast as equivalent C code. The same is true for NumPy and SciPy. There are various ways to speed up the code further, but they are out of scope for this book; if you want to know more, please consult the documentation.
matplotlib is a plotting and visualization package. We can also use the seaborn package for visualization; seaborn uses matplotlib under the hood. There are several other Python visualization packages that cover different usage scenarios. matplotlib and seaborn are mostly useful for visualizing small to medium datasets.
The NumPy package offers the ndarray class and various useful array functions. An ndarray is an array that can have one or more dimensions. This class also has several subclasses representing matrices, masked arrays, and heterogeneous record arrays. In machine learning we mainly use NumPy arrays to store feature vectors or matrices composed of feature vectors. SciPy builds on NumPy arrays and offers a variety of scientific and mathematical functions. We also require the pandas library for data wrangling.
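As a minimal sketch of how feature vectors are stored in NumPy arrays, the following example builds a small matrix in which each row is one sample. The values and variable names are made up for illustration and do not come from any dataset in this book.

```python
import numpy as np

# Three hypothetical samples, each described by two features.
# Each row is one feature vector; the whole array is a feature matrix.
features = np.array([[5.1, 3.5],
                     [4.9, 3.0],
                     [6.2, 3.4]])

# Indexing a row gives a one-dimensional array (a single feature vector).
first_sample = features[0]

# shape is (number of samples, number of features).
print(features.shape)
print(first_sample.ndim, features.ndim)

# Array functions operate on whole axes at once: axis=0 averages each column.
column_means = features.mean(axis=0)
print(column_means)
```

Storing samples as rows and features as columns is the layout scikit-learn estimators expect for their input data.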
In this book, we will use Python 3. As you may know, Python 2 is no longer supported as of 2020, so I strongly recommend switching to Python 3. If you are stuck with Python 2, you should still be able to modify the example code to work for you. In my opinion, the Anaconda Python 3 distribution is the best option. Anaconda is a free Python distribution for data analysis and scientific computing. It has its own package manager, conda. The distribution includes more than 200 Python packages, which makes it very convenient. For casual users, the Miniconda distribution may be the better choice. Miniconda contains only the conda package manager and Python.
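If you are unsure which Python version runs your scripts, a small check along these lines can help. This is only a sketch; the helper name is_python3 is hypothetical and not part of the book's code.

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True when the given version tuple belongs to Python 3 or later."""
    return version_info[0] >= 3


# sys.executable is the path of the interpreter that is running this script,
# which shows whether it is the system Python or one from a distribution.
print("Interpreter:", sys.executable)
print("Version:", ".".join(str(part) for part in sys.version_info[:3]))
print("Python 3 or later:", is_python3())
```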
The procedures to install Anaconda and Miniconda are similar, although Anaconda takes more disk space. Follow the instructions from the Anaconda website at http://conda.pydata.org/docs/install/quick.html. First, download the appropriate installer for your operating system and Python version. Sometimes you can choose between a GUI and a command-line installer. I used the Python 3 installer, although my system Python version is 2.7. This is possible because Anaconda comes with its own Python. On my machine, the Anaconda installer created an anaconda directory in my home directory and required about 900 MB. The Miniconda installer creates a miniconda directory in your home directory. Installation instructions for NumPy are at http://docs.scipy.org/doc/numpy/user/install.html.
Alternatively, install NumPy with pip as follows:
$ [sudo] pip install numpy
The command for Anaconda users is:
$ conda install numpy
To install the other dependencies, replace numpy with the name of the appropriate package. Please read the documentation carefully, since not all options work equally well on every operating system. The pandas installation documentation is at http://pandas.pydata.org/pandas-docs/dev/install.html.
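Once everything is installed, one way to confirm that the packages are visible to your interpreter is a short check like the one below. It is a sketch that uses only the standard library; the helper name missing_packages is hypothetical. Note that scikit-learn is imported under the name sklearn.

```python
import importlib.util

# Import names of the packages mentioned in this section.
# scikit-learn is installed as "scikit-learn" but imported as "sklearn".
REQUIRED = ["numpy", "scipy", "sklearn", "matplotlib", "pandas"]


def missing_packages(names):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All required packages were found.")
```

Any package reported as not installed can then be added with pip or conda as described above.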