- Hands-On Deep Learning with Apache Spark
- Guglielmo Iozzia
- 319字
- 2021-07-02 13:34:22
Cluster mode using different managers
The following diagram shows how Spark applications run on a cluster. They are independent sets of processes that are coordinated by the SparkContext object in the Driver Program. SparkContext connects to a Cluster Manager, which is responsible for allocating resources across applications. Once the SparkContext is connected, Spark gets executors across cluster nodes.
Executors are processes that execute computations and store data for a given Spark application. SparkContext sends the application code (which could be a JAR file for Scala or .py files for Python) to the executors. Finally, it sends the tasks to run to the executors:

To isolate applications from each other, every Spark application receives its own executor processes. They stay alive for the duration of the whole application and run tasks in multithreading mode. The downside to this is that it isn't possible to share data across different Spark applications – to share it, data needs to be persisted to an external storage system.
Spark supports different cluster managers, but it is agnostic to the underlying type.
The driver program, at execution time, must be network addressable from the worker nodes because it has to listen for and accept incoming connections from its executors. Because it schedules tasks on the cluster, it should be executed close to the worker nodes, on the same local area network (if possible).
The following are the cluster managers that are currently supported in Spark:
- Standalone: A simple cluster manager that makes it easy to set up a cluster. It is included with Spark.
- Apache Mesos: An open source project that's used to manage computer clusters, and was developed at the University of California, Berkeley.
- Hadoop YARN: The resource manager available in Hadoop starting from release 2.
- Kubernetes: An open source platform for providing a container-centric infrastructure. Kubernetes support in Spark is still experimental, so it's probably not ready for production yet.
- PPT,要你好看
- Learning Microsoft Azure Storage
- 視覺檢測技術(shù)及智能計(jì)算
- 21天學(xué)通Visual Basic
- 3D Printing for Architects with MakerBot
- Spark大數(shù)據(jù)技術(shù)與應(yīng)用
- 數(shù)據(jù)掘金
- 空間站多臂機(jī)器人運(yùn)動控制研究
- 筆記本電腦維修90個(gè)精選實(shí)例
- 計(jì)算機(jī)組網(wǎng)技術(shù)
- 電氣控制與PLC原理及應(yīng)用(歐姆龍機(jī)型)
- 基于RPA技術(shù)財(cái)務(wù)機(jī)器人的應(yīng)用與研究
- Linux常用命令簡明手冊
- 7天精通Photoshop CS5平面視覺設(shè)計(jì)
- 圖像傳感器應(yīng)用技術(shù)