Book title: Apache Spark Quick Start Guide
Authors: Shrey Mehrotra, Akash Grade
Chapter word count: 196
Last updated: 2021-07-02 13:39:59

Installing Spark

Follow these steps to install Spark 2.3.1, compiled with Hadoop 2.7:

If you have a Spark 2.x tar distribution (for example, spark-2.3.1-bin-hadoop2.7.tgz), copy it into your Linux VM at any location (for example, /opt) using any Windows-to-Linux file transfer software (such as FileZilla or WinSCP). Alternatively, you can download the latest binary .tar.gz file from the Apache Spark downloads page: http://spark.apache.org/downloads.html.
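
If you take the download route, the package file name follows a predictable pattern built from the Spark version and the Hadoop profile. The following is a minimal sketch of that pattern. The archive.apache.org URL layout is an assumption that does not come from this guide, so confirm the actual link on the downloads page before using it. The download itself is left commented out.

```shell
# Compose the package name from the Spark version and Hadoop profile.
SPARK_VERSION="2.3.1"
HADOOP_PROFILE="hadoop2.7"
SPARK_PACKAGE="spark-${SPARK_VERSION}-bin-${HADOOP_PROFILE}.tgz"

# Assumed location for archived releases; verify it on the downloads page.
SPARK_URL="https://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/${SPARK_PACKAGE}"

echo "Package: ${SPARK_PACKAGE}"
echo "URL:     ${SPARK_URL}"

# Uncomment to download into the current directory:
# wget "${SPARK_URL}"
```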

The /opt directory is an empty folder under the root directory (/) in most Linux-based operating systems. Here, we use this folder to copy and install software. By default, this folder is owned by root, so run the following command if you get permission errors while accessing it:

sudo chmod -R 777 /opt

Go to the location where you have copied the Spark software package and uncompress it:

cd /opt
tar -xzvf spark-2.3.1-bin-hadoop2.7.tgz
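
After the archive is uncompressed, it can be worth confirming that the expected directory layout is in place. The sketch below checks for the bin and sbin folders and the spark-submit launcher. The helper name check_spark_dir is hypothetical and used only for illustration.

```shell
# Hypothetical helper: succeeds if the directory looks like an unpacked Spark distribution.
check_spark_dir() {
  dir="$1"
  [ -d "$dir/bin" ] && [ -d "$dir/sbin" ] && [ -f "$dir/bin/spark-submit" ]
}

if check_spark_dir /opt/spark-2.3.1-bin-hadoop2.7; then
  echo "Spark distribution found"
else
  echo "Spark distribution not found; check the tar output"
fi
```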

Set the environment variables in .bash_profile, as follows:

nano ~/.bash_profile

Add the following lines to the end of the file:

export SPARK_HOME=/opt/spark-2.3.1-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/sbin
export PATH=$PATH:$SPARK_HOME/bin
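
If you prefer to script this step instead of editing the file by hand, one possible approach is to append each line only when it is not already present, so that repeating the step does not create duplicates. This is a sketch under that assumption. The helper name append_once is hypothetical, and the example writes to a scratch file rather than to ~/.bash_profile.

```shell
# Hypothetical helper: append a line to a file only if that exact line is not already there.
append_once() {
  file="$1"
  line="$2"
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demonstrated on a scratch file; point PROFILE at ~/.bash_profile to apply it for real.
PROFILE="$(mktemp)"
append_once "$PROFILE" 'export SPARK_HOME=/opt/spark-2.3.1-bin-hadoop2.7'
append_once "$PROFILE" 'export PATH=$PATH:$SPARK_HOME/sbin'
append_once "$PROFILE" 'export PATH=$PATH:$SPARK_HOME/bin'
# Repeated on purpose: the line already exists, so nothing is written.
append_once "$PROFILE" 'export PATH=$PATH:$SPARK_HOME/bin'

cat "$PROFILE"
```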

Run the following command to update the environment variables in the current session:

source ~/.bash_profile
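
To confirm that the session picked up the new settings, you can check that both Spark folders appear on PATH. The following is a minimal sketch. The helper name path_contains is hypothetical, and the fallback path is only used when SPARK_HOME is not set.

```shell
# Hypothetical helper: succeeds if the colon-separated list in $1 contains the entry $2.
path_contains() {
  case ":$1:" in
    *":$2:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Falls back to the install location used in this guide if SPARK_HOME is unset.
spark_home="${SPARK_HOME:-/opt/spark-2.3.1-bin-hadoop2.7}"

for sub in bin sbin; do
  if path_contains "$PATH" "$spark_home/$sub"; then
    echo "$spark_home/$sub is on PATH"
  else
    echo "$spark_home/$sub is missing from PATH"
  fi
done
```

If both folders are reported as present, running spark-submit --version should print the Spark version banner.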
