
SparkContext and SparkConf

The starting point for writing any Spark program is SparkContext (or JavaSparkContext in Java). SparkContext is initialized with an instance of a SparkConf object, which contains various Spark cluster-configuration settings (for example, the URL of the master node).

It is the main entry point for Spark functionality. A SparkContext represents a connection to a Spark cluster and can be used to create RDDs, accumulators, and broadcast variables on that cluster.
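
As a minimal sketch of these three constructs, assuming an already-initialized SparkContext named sc (the accumulator API shown here is the classic one and varies slightly across Spark versions):

// A distributed dataset (RDD) built from a local collection.
val rdd = sc.parallelize(Seq(1, 2, 3, 4))

// A broadcast variable: a read-only value shipped once to each worker.
val factor = sc.broadcast(10)

// An accumulator: workers may only add to it; the driver reads the result.
val sum = sc.accumulator(0)

rdd.foreach(x => sum += x * factor.value)
println(sum.value) // 100, read on the driver after the action completes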

Only one SparkContext may be active per JVM. You must call stop() on the active SparkContext before creating a new one.
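
For example, replacing an active context might look like this (a sketch; sc stands for whichever context is currently running, and the application name and thread count are illustrative):

sc.stop() // shut down the active context and release its resources
val sc2 = new SparkContext(new SparkConf()
  .setAppName("Another App")
  .setMaster("local[2]"))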

Once initialized, we will use the various methods found in the SparkContext object to create and manipulate distributed datasets and shared variables. The Spark shell (available in Scala and Python, though unfortunately not in Java) takes care of this context initialization for us, but the following lines of code show an example of creating a context running in local mode in Scala:

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("Test Spark App")
  .setMaster("local[4]")
val sc = new SparkContext(conf)

This creates a context running in local mode with four worker threads, with the application name set to Test Spark App. If we wish to use the default configuration values, we can instead call the following convenience constructor for our SparkContext object, which works in exactly the same way:

val sc = new SparkContext("local[4]", "Test Spark App")
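
Whichever constructor we use, a quick sanity check of the new context might look like this (a sketch; the numbers are purely illustrative):

// Distribute a local range across the available threads and run an action.
val count = sc.parallelize(1 to 100).count()
println(count) // 100
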
Downloading the example code
You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book from any other source, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.