Processing the text files

Using SparkContext, you can load a text file into an RDD with the textFile method. The wholeTextFiles method reads every file in a directory into an RDD of (file name, file content) pairs. The following examples show how a file on the local filesystem (file://) or on HDFS (hdfs://) can be read into a Spark RDD. In each call, the second argument asks Spark to divide the data into at least six partitions, which allows more of the work to run in parallel. The first two examples are equivalent when the default filesystem is the local Linux filesystem, because a path without a scheme is resolved against the configured default filesystem. The last example reads a file that resides in HDFS:

sc.textFile("/data/spark/tweets.txt",6)
sc.textFile("file:///data/spark/tweets.txt",6)
sc.textFile("hdfs://server1:4014/data/spark/tweets.txt",6)