- Apache Spark 2.x for Java Developers
- Sourav Gulati Sumit Kumar
- 2021-07-02 19:02:00
Getting started with Spark
In this section, we will run Apache Spark in local mode or standalone mode. First we will set up Scala, which is the prerequisite for Apache Spark. After the Scala setup, we will set up and run Apache Spark. We will also perform some basic operations on it. So let's start.
Since Apache Spark is written in Scala, it needs Scala to be set up on the system. You can download Scala from http://www.scala-lang.org/download/ (we will set up Scala 2.11.8 in the following examples).
Once Scala is downloaded, we can set it up on a Linux system as follows:
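A minimal sketch of this step, assuming the download was saved as scala-2.11.8.tgz in the current directory. The archive name and the install_archive helper are assumptions made for illustration; the target directory matches the SCALA_HOME value used in this section.

```shell
#!/usr/bin/env bash
# install_archive is a hypothetical helper, not part of Scala or Spark.
# It unpacks a .tgz archive and moves the extracted directory into place.
# Usage: install_archive <archive.tgz> <extracted-dir> <target-dir>
install_archive() {
  local archive="$1" extracted="$2" target="$3"
  if [ ! -f "$archive" ]; then
    echo "archive not found: $archive" >&2
    return 1
  fi
  # Extract into the current directory.
  tar -zxf "$archive" || return 1
  # Use sudo only when the target's parent directory is not writable.
  if [ -w "$(dirname "$target")" ]; then
    mv "$extracted" "$target"
  else
    sudo mv "$extracted" "$target"
  fi
}

# Example call (the archive name is an assumption):
# install_archive scala-2.11.8.tgz scala-2.11.8 /usr/local/scala-2.11.8
```

The same helper would work for the Spark archive later in this section, since that step follows the same extract-then-move pattern.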

It is also recommended to set the SCALA_HOME environment variable and add the Scala binaries to the PATH variable. You can set them in the .bashrc file or the /etc/environment file as follows:
export SCALA_HOME=/usr/local/scala-2.11.8
export PATH=$PATH:/usr/local/scala-2.11.8/bin
It is also shown in the following screenshot:

Now, we have set up a Scala environment successfully. So, it is time to download Apache Spark. You can download it from http://spark.apache.org/downloads.html.
After Apache Spark is downloaded, run the following commands to set it up:
tar -zxf spark-2.0.0-bin-hadoop2.7.tgz
sudo mv spark-2.0.0-bin-hadoop2.7 /usr/local/spark
You can also set the SPARK_HOME environment variable. It is not mandatory, but it makes the Spark installation directory easy to locate. In addition, you can add the Spark binaries to the PATH variable so that they can be run without specifying their full path:
export SPARK_HOME=/usr/local/spark
export PATH=$PATH:/usr/local/scala-2.11.8/bin:$SPARK_HOME/bin
It is shown in the following screenshot:

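Because .bashrc is read every time a new shell starts, and the export lines above add the Scala directory to PATH twice, the variable can accumulate duplicate entries. The following is one possible way to avoid that. It is a sketch only: path_append is a hypothetical helper name, and the directories are the ones used in this section.

```shell
#!/usr/bin/env bash
# path_append is a hypothetical helper: add a directory to PATH only once.
path_append() {
  case ":$PATH:" in
    *":$1:"*) ;;                 # already on PATH, nothing to do
    *) PATH="$PATH:$1" ;;        # not present yet, append it
  esac
}

export SCALA_HOME=/usr/local/scala-2.11.8
export SPARK_HOME=/usr/local/spark
path_append "$SCALA_HOME/bin"
path_append "$SPARK_HOME/bin"
export PATH
```

Duplicate PATH entries are harmless, so the plain export lines work as well; the helper only keeps the variable tidy.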
Now we are ready to start Spark on a single machine. Run the following command to launch the Spark shell, which uses local mode unless a master is specified:
$SPARK_HOME/bin/spark-shell
Alternatively, we can simply execute the spark-shell command, since the Spark binaries have been added to the PATH environment variable.
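If the short form of the command is not found, the PATH setting is the first thing to check. The following sketch reports where each tool resolves; check_on_path is a hypothetical helper name, and it relies only on the shell's built-in command -v.

```shell
#!/usr/bin/env bash
# check_on_path is a hypothetical helper: report where a command resolves.
check_on_path() {
  local location
  if location="$(command -v "$1")"; then
    echo "$1 found at $location"
  else
    echo "$1 not found on PATH" >&2
    return 1
  fi
}

# Check both tools; keep going even if one is missing.
for tool in scala spark-shell; do
  check_on_path "$tool" || true
done
```

A "not found" message usually means the export lines have not been loaded into the current shell yet; opening a new terminal or sourcing .bashrc again should pick them up.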

While the shell is running, you can also access the Spark Driver's UI at http://localhost:4040:

We will discuss the Spark UI in more detail in the Spark Driver Web UI section of this chapter.
In this section, we have completed the Spark setup in standalone mode. In the next section, we will get some hands-on practice with Apache Spark, using spark-shell or spark-cli.