- Apache Spark 2.x for Java Developers
- Sourav Gulati Sumit Kumar
Getting started with Spark
In this section, we will run Apache Spark in local (standalone) mode. First, we will set up Scala, which is a prerequisite for Apache Spark. After the Scala setup, we will set up and run Apache Spark, and then perform some basic operations on it. So let's start.
Since Apache Spark is written in Scala, it requires Scala to be set up on the system. You can download Scala from http://www.scala-lang.org/download/ (we will use Scala 2.11.8 in the following examples).
Once Scala is downloaded, we can set it up on a Linux system as follows:
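The extraction step can be sketched as follows (a minimal sketch: the archive name assumes the Scala 2.11.8 tgz package from the link above, and /usr/local is an assumed install location; adjust both for your download and system):

```shell
# Extract the downloaded Scala archive (filename assumes the 2.11.8 tgz package)
tar -zxf scala-2.11.8.tgz

# Move it to a system-wide location (/usr/local is an assumption; any directory works)
sudo mv scala-2.11.8 /usr/local/scala-2.11.8
```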

Also, it is recommended to set the SCALA_HOME environment variable and add the Scala binaries to the PATH variable. You can set them in the .bashrc file or the /etc/environment file as follows:
export SCALA_HOME=/usr/local/scala-2.11.8
export PATH=$PATH:/usr/local/scala-2.11.8/bin
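After editing .bashrc, you can reload it and verify the setup (a quick sanity check; the exact version string printed depends on the Scala release you installed):

```shell
# Pick up the new environment variables in the current shell
source ~/.bashrc

# Confirm SCALA_HOME is set and the scala binary is on the PATH
echo $SCALA_HOME
scala -version
```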
Now that the Scala environment is set up successfully, it is time to download Apache Spark. You can download it from http://spark.apache.org/downloads.html.
After Apache Spark is downloaded, run the following commands to set it up:
tar -zxf spark-2.0.0-bin-hadoop2.7.tgz
sudo mv spark-2.0.0-bin-hadoop2.7 /usr/local/spark
Also, you can set the SPARK_HOME environment variable. It is not mandatory; however, it helps users locate the Spark installation directory. You can also add the Spark binaries to the $PATH variable so that they can be run without specifying their full path:
export SPARK_HOME=/usr/local/spark
export PATH=$PATH:/usr/local/scala-2.11.8/bin:$SPARK_HOME/bin
Now, we are ready to start Spark in standalone mode. Let's run the following command to start it:
$SPARK_HOME/bin/spark-shell
Alternatively, we can simply execute the spark-shell command, as the Spark binaries have been added to the PATH environment variable.
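Once the shell is up, we can try a few basic operations. The following is a small illustrative session (typed at the scala> prompt); it uses sc, the SparkContext that spark-shell creates automatically, to build an RDD from a local collection:

```scala
// Create an RDD from a local collection
val nums = sc.parallelize(1 to 10)

// A transformation: keep only the even numbers
val evens = nums.filter(_ % 2 == 0)

// Actions trigger the actual computation
evens.count()    // returns 5
evens.collect()  // returns Array(2, 4, 6, 8, 10)
```

Note that the filter transformation is lazy; nothing is computed until an action such as count() or collect() is called.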

Also, you can access the Spark driver's UI at http://localhost:4040.

We will discuss more about Spark UI in the Spark Driver Web UI section of this chapter.
In this section, we have completed the Spark setup in standalone mode. In the next section, we will get some hands-on experience with Apache Spark using spark-shell.