- Practical Big Data Analytics
- Nataraj Dasgupta
- 2021-07-02 19:26:25
Summary
This chapter introduced some of the key tools used for data science. In particular, it demonstrated how to download and install the virtual machine for the Cloudera Distribution of Hadoop (CDH), Spark, R, RStudio, and Python. Although the user can download the Hadoop source code and install it on, say, a Unix system, doing so is usually fraught with issues and requires a fair amount of debugging. A VM, by contrast, is a complete preconfigured environment, so the user can begin using and learning Hadoop with minimal effort.
Additionally, R and Python are the two most commonly used languages for machine learning and for analytics in general. They are available for all popular operating systems. Although they can be installed in the VM, the user is encouraged to install them on a local machine (laptop or workstation) if feasible, since they will generally perform better there than inside the VM.
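As a quick check that a local installation worked, the following minimal sketch reports whether the R and Python executables can be found on the system PATH. The helper name `locate_tools` and the executable names (`R`, `Rscript`, `python3`) are assumptions made for illustration; the names may differ by platform (for example, `python` rather than `python3` on Windows).

```python
import shutil
import sys


def locate_tools(names):
    """Map each executable name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # Executable names are assumptions and vary by platform.
    for tool, path in locate_tools(["R", "Rscript", "python3"]).items():
        print(f"{tool}: {path or 'not found on PATH'}")
    # The interpreter running this script, for comparison.
    print(f"Running interpreter: {sys.executable}")
```

A tool reported as not found may still be installed but missing from PATH, so the result is only a first diagnostic.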
In the next chapter, we will dive deeper into the details of Hadoop and its core components and concepts.