Installing Spark

Follow these steps to install Spark 2.3.1, compiled with Hadoop 2.7:

  1. If you already have the Spark 2.3.1 tar distribution (spark-2.3.1-bin-hadoop2.7.tgz), copy it to any location on your Linux VM (for example, /opt) using a Windows-to-Linux file transfer tool such as FileZilla or WinSCP. Alternatively, download the binary .tgz file for the release you want from the Apache Spark downloads page: http://spark.apache.org/downloads.html.
The /opt directory sits directly under the root directory and is empty on most Linux-based operating systems. Here, we use it as the location to copy and install software. By default, it is owned by root, so run the following command if you get permission errors when accessing it:
  sudo chmod -R 777 /opt
  2. Go to the directory where you copied the Spark package and extract it:
cd /opt
tar -xzvf spark-2.3.1-bin-hadoop2.7.tgz
  3. Open .bash_profile in an editor to set the environment variables:
nano ~/.bash_profile
  4. Add the following lines to the end of the file:
export SPARK_HOME=/opt/spark-2.3.1-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/sbin
export PATH=$PATH:$SPARK_HOME/bin
  5. Run the following command to update the environment variables in the current session:
source ~/.bash_profile
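
Once the profile has been sourced, it can be useful to confirm that the variables from the steps above actually took effect. The following is a minimal sketch of such a check, not part of the Spark distribution: the function name check_spark_env is hypothetical, and it only inspects SPARK_HOME and PATH in the current shell without running any Spark binary.

```shell
# check_spark_env is a hypothetical helper, not something shipped with Spark.
# It reports whether SPARK_HOME is set and whether its bin/ and sbin/
# directories appear on PATH in the current shell session.
check_spark_env() {
  if [ -z "${SPARK_HOME:-}" ]; then
    echo "SPARK_HOME is not set"
    return 1
  fi
  for dir in bin sbin; do
    # Wrap PATH in colons so that the first and last entries also match.
    case ":$PATH:" in
      *":$SPARK_HOME/$dir:"*) ;;   # this directory is on PATH
      *) echo "$SPARK_HOME/$dir is not on PATH"; return 1 ;;
    esac
  done
  echo "ok: SPARK_HOME=$SPARK_HOME"
}
```

Calling check_spark_env after source ~/.bash_profile should print the "ok" line if the exports were added as shown. As a further check, spark-submit --version prints the installed Spark version, provided a Java runtime is available on the machine.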