官术网_书友最值得收藏!

A single machine

A single machine is the simplest use case for Spark. It is also a great way to sanity check your build. In spark/bin, there is a shell script called run-example, which can be used to launch a Spark job. The run-example script takes the name of a Spark class and some arguments. Earlier, we used the run-example script from the /bin directory to calculate the value of Pi. There is a collection of the sample Spark jobs in examples/src/main/scala/org/apache/spark/examples/.

All of the sample programs take the parameter, master (the cluster manager), which can be the URL of a distributed cluster or local[N], where N is the number of threads.

Going back to our run-example script, it invokes the more general bin/spark-submit script. For now, let's stick with the run-example script.

To run GroupByTest locally, try running the following command:

bin/run-example GroupByTest

This line will produce an output like this given here:

14/11/15 06:28:40 INFO SparkContext: Job finished: count at GroupByTest.scala:51, took 0.494519333 s
2000

Note

All the examples in this book can be run on a Spark installation on a local machine. So you can read through the rest of the chapter for additional information after you have gotten some hands-on exposure to Spark running on your local machine.

主站蜘蛛池模板: 沧源| 宁海县| 灵宝市| 德州市| 安塞县| 平陆县| 化州市| 岑巩县| 师宗县| 越西县| 花莲市| 承德县| 兰坪| 墨竹工卡县| 汾西县| 石阡县| 和平县| 即墨市| 五大连池市| 泰宁县| 东乌珠穆沁旗| 荥阳市| 岢岚县| 元阳县| 永康市| 会泽县| 榆社县| 苍梧县| 额济纳旗| 和平县| 元江| 太仓市| 郓城县| 重庆市| 达拉特旗| 龙南县| 鄂州市| 台中市| 延川县| 襄汾县| 襄垣县|