Processing the text files

Using SparkContext, you can load a text file into an RDD with the textFile method. Additionally, the wholeTextFiles method reads the contents of a directory into an RDD of (file path, file content) pairs. The following examples show how a file on the local filesystem (file://) or in HDFS (hdfs://) can be read into a Spark RDD. In each example, the second argument requests that the data be divided into at least six partitions for increased parallelism. The first two examples are equivalent when the default filesystem is the local Linux filesystem, because a path without a scheme is resolved against the default filesystem; the last example reads a file that resides in HDFS:

sc.textFile("/data/spark/tweets.txt",6)
sc.textFile("file:///data/spark/tweets.txt",6)
sc.textFile("hdfs://server1:4014/data/spark/tweets.txt",6)