- Hands-On Deep Learning with Apache Spark
- Guglielmo Iozzia
Mesos cluster mode
Spark can run on clusters that are managed by Apache Mesos (http://mesos.apache.org/). Mesos is a cross-platform, cloud provider-agnostic, centralized, and fault-tolerant cluster manager, designed for distributed computing environments. Among its main features, it provides resource management and isolation, and the scheduling of CPU and memory across the cluster. It can pool multiple physical resources into a single virtual resource, and in doing so is the opposite of classic virtualization, where a single physical resource is split into multiple virtual resources. With Mesos, it is possible to build and run cluster frameworks such as Apache Spark (though it is not restricted to just this). The following diagram shows the Mesos architecture:

Mesos consists of a master daemon and frameworks. The master daemon manages agent daemons running on each cluster node, while the Mesos frameworks run tasks on the agents. The master empowers fine-grained sharing of resources (including CPU and RAM) across frameworks by making them resource offers. It decides how much of the available resources to offer to each framework, depending on given organizational policies. To support diverse sets of policies, the master uses a modular architecture that makes it easy to add new allocation modules through a plugin mechanism. A Mesos framework consists of two components – a scheduler, which registers itself with the master to be offered resources, and an executor, a process that is launched on agent nodes to execute the framework's tasks. While it is the master that determines how many resources are offered to each framework, the frameworks' schedulers are responsible for selecting which of the offered resources to use. The moment a framework accepts offered resources, it passes a description of the tasks it wants to execute on them to Mesos. Mesos, in turn, launches the tasks on the corresponding agents.
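The offer-based flow described above can be sketched with a toy simulation. The following Python snippet is purely conceptual (the function and field names are hypothetical, not the Mesos API): a master advertises each agent's free resources as offers, and a framework scheduler decides which offered resources to consume for its pending tasks:

```python
# Toy model of Mesos-style resource offers (hypothetical names, not the Mesos API).

def make_offer(agent_id, cpus, mem_mb):
    """The master advertises an agent's free resources as an offer."""
    return {"agent": agent_id, "cpus": cpus, "mem_mb": mem_mb}

def schedule(offers, tasks):
    """A framework scheduler picks which offered resources to use.

    Each pending task declares the CPUs and memory it needs; resources the
    scheduler does not consume are implicitly declined. Returns a list of
    (agent, task name) launch decisions that Mesos would then carry out
    on the corresponding agents.
    """
    launches = []
    for offer in offers:
        free_cpus, free_mem = offer["cpus"], offer["mem_mb"]
        for task in list(tasks):  # iterate over a copy so we can remove
            if task["cpus"] <= free_cpus and task["mem_mb"] <= free_mem:
                launches.append((offer["agent"], task["name"]))
                free_cpus -= task["cpus"]
                free_mem -= task["mem_mb"]
                tasks.remove(task)
    return launches

offers = [make_offer("agent-1", cpus=4, mem_mb=8192),
          make_offer("agent-2", cpus=2, mem_mb=4096)]
tasks = [{"name": "executor-a", "cpus": 2, "mem_mb": 3072},
         {"name": "executor-b", "cpus": 2, "mem_mb": 3072},
         {"name": "executor-c", "cpus": 2, "mem_mb": 3072}]

launches = schedule(offers, tasks)
# executor-a and executor-b fit on agent-1; executor-c lands on agent-2
```

The key point the sketch captures is the division of labor: the master decides what to offer, while the framework's scheduler decides what to accept.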
The advantages of deploying a Spark cluster using Mesos in place of the Spark standalone Master include the following:
- Dynamic partitioning between Spark and other frameworks
- Scalable partitioning between multiple instances of Spark
Spark 2.2.1 is designed to be used with Mesos 1.0.0+. In this section, I won't describe the steps to deploy a Mesos cluster – I am assuming that a Mesos cluster is already available and running. No particular procedure or patch is required in terms of Mesos installation to run Spark on it. To verify that the Mesos cluster is ready for Spark, navigate to the Mesos master web UI at port 5050:

Check that all of the expected machines are present in the Agents tab.
To use Mesos from Spark, a Spark binary package needs to be available in a place that's accessible by Mesos itself, and a Spark driver program needs to be configured to connect to Mesos. Alternatively, it is possible to install Spark in the same location across all the Mesos slaves and then configure the spark.mesos.executor.home property (the default value is $SPARK_HOME) to point to that location.
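With the shared-installation approach, the property can be set once in Spark's configuration file rather than on every command line. A minimal example, assuming Spark is installed under /opt/spark on every agent (the path and master address are illustrative):

```
# $SPARK_HOME/conf/spark-defaults.conf (illustrative values)
spark.master                  mesos://127.0.0.1:5050
spark.mesos.executor.home     /opt/spark
```

Any value passed on the command line still overrides what is set in spark-defaults.conf.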
The Mesos master URLs have the form mesos://host:5050 for a single-master Mesos cluster, or mesos://zk://host1:2181,host2:2181,host3:2181/mesos for a multi-master Mesos cluster when using Zookeeper.
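For multi-master clusters, the ZooKeeper-style URL can also be assembled programmatically. The small helper below is a hypothetical convenience (not part of Spark) that just illustrates the expected format:

```python
def mesos_zk_url(hosts, port=2181, path="mesos"):
    """Build a multi-master Mesos URL of the form
    mesos://zk://host1:2181,host2:2181,host3:2181/mesos
    from a list of ZooKeeper ensemble hosts."""
    ensemble = ",".join(f"{host}:{port}" for host in hosts)
    return f"mesos://zk://{ensemble}/{path}"

url = mesos_zk_url(["host1", "host2", "host3"])
# → "mesos://zk://host1:2181,host2:2181,host3:2181/mesos"
```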
The following is an example of how to start a Spark shell on a Mesos cluster:
$SPARK_HOME/bin/spark-shell --master mesos://127.0.0.1:5050 --conf spark.mesos.executor.home=`pwd`
A Spark application can be submitted to a Mesos managed Spark cluster as follows:
$SPARK_HOME/bin/spark-submit --master mesos://127.0.0.1:5050 --total-executor-cores 2 --executor-memory 3G $SPARK_HOME/examples/src/main/python/pi.py 100