
Installing the Data Science Toolbox

The Data Science Toolbox (DST) is an Ubuntu-based virtual environment for data analysis with Python and R. Because DST runs as a virtual machine, we can install it on various operating systems. We will install DST locally, which requires VirtualBox and Vagrant. VirtualBox is a virtual machine application originally created by Innotek GmbH in 2007. Vagrant, created by Mitchell Hashimoto, is a wrapper around virtual machine applications such as VirtualBox.

Getting ready

You need on the order of 2 to 3 GB of free disk space for VirtualBox, Vagrant, and DST itself. The exact amount may vary by operating system.
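A quick way to check the available space before starting is sketched below. The 3 GB threshold is the upper end of the estimate above, and the helper name `free_gb` is hypothetical; the check looks at the home folder because that is where VirtualBox keeps its VMs by default.

```shell
#!/bin/sh
# free_gb DIR: print the whole gigabytes free on the filesystem holding DIR.
# Hypothetical helper; relies on the POSIX `df -Pk` output format, where the
# fourth column of the second line is the available space in 1024-byte blocks.
free_gb() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

required=3                          # upper end of the 2 to 3 GB estimate
available=$(free_gb "${HOME:-.}")   # fall back to the current directory

if [ "$available" -ge "$required" ]; then
    echo "OK: ${available} GB free"
else
    echo "Only ${available} GB free; about ${required} GB is needed"
fi
```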

How to do it...

Installing DST requires the following steps:

  1. Install VirtualBox by downloading an installer for your operating system and architecture from https://www.virtualbox.org/wiki/Downloads (retrieved July 2015) and running it. I installed VirtualBox 4.3.28-100309 myself, but you can install whichever version is the most recent at the time.
  2. Install Vagrant by downloading an installer for your operating system and architecture from https://www.vagrantup.com/downloads.html (retrieved July 2015) and running it. I installed Vagrant 1.7.2; again, you can install a more recent version if one is available.
  3. Create a directory to hold the DST and navigate to it with a terminal. Run the following commands:
    $ vagrant init data-science-toolbox/dst
    $ vagrant up
    

    The first command creates a Vagrantfile configuration file. Most of its content is commented out, but the file does contain links to documentation that might be useful. The second command creates the DST and initiates a download that could take a couple of minutes.

  4. Connect to the virtual environment as follows (on Windows, use an SSH client such as PuTTY):
    $ vagrant ssh
    
  5. View the preinstalled Python packages with the following command:
    vagrant@data-science-toolbox:~$ pip freeze
    

    The list is quite long; in my case it contained 32 packages. The DST Python version as of July 2015 was 2.7.6.

  6. When you are done with the DST, log out and suspend the VM (you can also halt it completely with vagrant halt):
    vagrant@data-science-toolbox:~$ logout
    Connection to 127.0.0.1 closed.
    $ vagrant suspend
    ==> default: Saving VM state and suspending execution...
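Suspending is only one of the states a Vagrant-managed VM can be put into. The sketch below collects the related lifecycle subcommands of the Vagrant CLI behind a single function. The wrapper name `dst_lifecycle` is hypothetical and exists only for illustration; the commands must be run from the directory that holds the Vagrantfile.

```shell
#!/bin/sh
# dst_lifecycle ACTION: run one of the Vagrant lifecycle commands.
# Hypothetical wrapper; each branch calls a standard Vagrant subcommand.
dst_lifecycle() {
    case "$1" in
        status)  vagrant status ;;    # show the current state of the VM
        resume)  vagrant resume ;;    # wake up a suspended VM
        halt)    vagrant halt ;;      # shut the VM down cleanly
        destroy) vagrant destroy ;;   # delete the VM (asks for confirmation)
        *)
            echo "usage: dst_lifecycle status|resume|halt|destroy" >&2
            return 2
            ;;
    esac
}
```

For example, `dst_lifecycle resume` followed by `vagrant ssh` brings you back to where you left off after a suspend, whereas a halted VM needs `vagrant up` to boot again.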
    

How it works...

Virtual machines (VMs) emulate computers in software. VirtualBox is an application that creates and manages VMs. VirtualBox stores its VMs in your home folder, and this particular VM takes about 2.2 GB of storage.
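To see where the space goes on your own machine, something like the following can help. It is a minimal sketch that assumes VirtualBox still uses its default machine folder, `VirtualBox VMs` in the home folder (the location can be changed in the VirtualBox preferences); the helper name `dir_size` is hypothetical.

```shell
#!/bin/sh
# Assumption: the default VirtualBox machine folder is in use.
vm_dir="${HOME:-.}/VirtualBox VMs"

# dir_size DIR: print the disk usage of DIR in human-readable form.
# Hypothetical helper around `du -sh`.
dir_size() {
    du -sh "$1" | awk '{ print $1 }'
}

if [ -d "$vm_dir" ]; then
    echo "VMs occupy $(dir_size "$vm_dir") in $vm_dir"
fi

# List the VMs registered with VirtualBox, if its CLI is on the PATH.
if command -v VBoxManage >/dev/null 2>&1; then
    VBoxManage list vms
fi
```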

Ubuntu is an open source Linux operating system, and its license allows us to create virtual machines. Ubuntu comes in several versions; the lsb_release command tells us which one the DST runs:

vagrant@data-science-toolbox:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 14.04 LTS
Release: 14.04
Codename: trusty

Vagrant originally worked only with VirtualBox, but it now also supports VMware, KVM, Docker, and Amazon EC2. Vagrant calls virtual machines boxes. Some of these boxes are available for everyone at http://www.vagrantbox.es/ (retrieved July 2015).
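The boxes that Vagrant has already downloaded to your machine can be listed from the command line. The following is a small sketch; `list_boxes` is a hypothetical helper name, and the guard is there only so that the snippet does not fail on a machine without Vagrant.

```shell
#!/bin/sh
# list_boxes: show the boxes cached locally by Vagrant.
# Hypothetical helper around the `vagrant box list` subcommand.
list_boxes() {
    if command -v vagrant >/dev/null 2>&1; then
        vagrant box list
    else
        echo "vagrant is not installed"
    fi
}

list_boxes

# To start a VM with a specific provider instead of the default one,
# pass --provider on the first `vagrant up`, for example:
#   vagrant up --provider=virtualbox
```

After following this recipe, the list should include the data-science-toolbox/dst box.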

See also
