- Hands-On Deep Learning with Apache Spark
- Guglielmo Iozzia
- 2021-07-02 13:34:23
Submitting Spark applications on YARN
To launch Spark applications on YARN, the HADOOP_CONF_DIR or YARN_CONF_DIR environment variable must be set and must point to the directory that contains the client-side configuration files for the Hadoop cluster. These files are needed to connect to the YARN ResourceManager and to write to HDFS. The configuration is distributed to the YARN cluster so that all the containers used by the Spark application share the same configuration. Two deployment modes are available on YARN:
- Cluster mode: In this case, the Spark driver runs inside an application master process that's managed by YARN on the cluster. The client can finish its execution after initiating the application.
- Client mode: In this case, the driver runs in the client process. The application master is used for the sole purpose of requesting resources from YARN.
Unlike the other cluster managers, for which the master's address is specified in the --master parameter, in YARN mode the ResourceManager's address is retrieved from the Hadoop configuration. Therefore, the --master parameter value is always yarn.
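Because the ResourceManager's address comes from the Hadoop configuration, it can be useful to confirm that a configuration directory is actually reachable before submitting anything. The following is a minimal sketch for a POSIX shell, not part of Spark itself: the function name find_hadoop_conf_dir is hypothetical, and treating the presence of yarn-site.xml as the sign of a usable directory is an assumption.

```shell
# Hypothetical helper: print the first configuration directory that is set
# and looks usable, or fail with an error message.
find_hadoop_conf_dir() {
    for dir in "${HADOOP_CONF_DIR:-}" "${YARN_CONF_DIR:-}"; do
        # Assumption: a usable client-side directory contains yarn-site.xml.
        if [ -n "$dir" ] && [ -f "$dir/yarn-site.xml" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    echo "Set HADOOP_CONF_DIR or YARN_CONF_DIR to the Hadoop client configuration directory" >&2
    return 1
}
```

Calling find_hadoop_conf_dir before spark-submit lets a launch script stop early with a readable message instead of failing later while trying to contact the ResourceManager.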
You can use the following command to launch a Spark application in cluster mode:
$SPARK_HOME/bin/spark-submit --class path.to.your.Class --master yarn --deploy-mode cluster [options] <app jar> [app options]
In cluster mode, since the driver runs on a different machine than the client, the SparkContext.addJar method doesn't work with files that are local to the client. To make such files available to the application, include them with the --jars option in the launch command.
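As an illustration of the --jars option, the sketch below assembles a cluster-mode command that ships two client-local JARs with the application. All paths and the class name are placeholders chosen for this example, and the helper name print_cluster_submit is hypothetical. The command is printed rather than executed so that it can be reviewed first.

```shell
# Placeholder values (assumptions for illustration only).
APP_CLASS="path.to.your.Class"
APP_JAR="/path/to/app.jar"
# --jars expects a comma-separated list of JAR files local to the client.
EXTRA_JARS="/path/to/dep1.jar,/path/to/dep2.jar"

# Hypothetical helper: print the cluster-mode command instead of running it.
print_cluster_submit() {
    printf '%s\n' "\$SPARK_HOME/bin/spark-submit --class $APP_CLASS --master yarn --deploy-mode cluster --jars $EXTRA_JARS $APP_JAR"
}

print_cluster_submit
```

Once the printed command looks right, it can be pasted into a shell where SPARK_HOME is defined.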
Launching a Spark application in client mode works the same way, except that the --deploy-mode option value changes from cluster to client.
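Since the two launch commands differ only in the deploy mode, a small wrapper can make that single difference explicit. This is a sketch under stated assumptions: submit_cmd is a hypothetical helper, its argument order is arbitrary, and it only prints the command rather than running it.

```shell
# Hypothetical helper: build the submit command for a given deploy mode.
# Usage: submit_cmd <cluster|client> <main class> <app jar>
submit_cmd() {
    mode=$1
    case "$mode" in
        cluster|client) ;;
        *) echo "deploy mode must be 'cluster' or 'client'" >&2; return 1 ;;
    esac
    # --master is always yarn; only --deploy-mode differs between the two modes.
    printf '%s\n' "\$SPARK_HOME/bin/spark-submit --class $2 --master yarn --deploy-mode $mode $3"
}

submit_cmd client path.to.your.Class app.jar
```

Rejecting any value other than cluster or client keeps a typo in the mode from reaching spark-submit.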