- Mastering Apache Spark 2.x (Second Edition)
- Romeo Kienzler
- 2021-07-02 18:55:27
Processing the text files
Using SparkContext, you can load a text file into an RDD with the textFile method. Additionally, the wholeTextFiles method reads the contents of a directory into an RDD of (file path, file content) pairs. The following examples show how a file on the local filesystem (file://) or on HDFS (hdfs://) can be read into a Spark RDD. In each call, the second argument is the minimum number of partitions, so the data is divided into at least six partitions to increase parallelism. The first two examples are equivalent, as both load a file from the Linux filesystem (assuming the default filesystem is the local one), whereas the last one reads a file that resides in HDFS:
sc.textFile("/data/spark/tweets.txt",6)
sc.textFile("file:///data/spark/tweets.txt",6)
sc.textFile("hdfs://server1:4014/data/spark/tweets.txt",6)
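The calls above can be tried end to end with a minimal sketch such as the following. It assumes a local Spark installation with the RDD API on the classpath. The object name TextFileSketch, the temporary directory, and the sample lines are hypothetical and exist only for illustration. The sketch writes a small file, loads it with textFile while requesting at least six partitions, and then reads the enclosing directory with wholeTextFiles.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files

import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical driver program; in spark-shell, `sc` already exists
// and only the body of the try block is needed.
object TextFileSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("TextFileSketch")
      .setMaster("local[2]") // assumption: run locally with two threads
    val sc = new SparkContext(conf)

    try {
      // Create illustrative sample data in a temporary directory.
      val dir = Files.createTempDirectory("tweets")
      val file = dir.resolve("tweets.txt")
      Files.write(
        file,
        "first tweet\nsecond tweet\nthird tweet\n".getBytes(StandardCharsets.UTF_8))

      // textFile: one RDD element per line. The second argument is the
      // minimum number of partitions, not an exact count.
      val lines = sc.textFile(file.toUri.toString, 6)
      println(s"partitions: ${lines.getNumPartitions}")
      println(s"lines: ${lines.count()}")

      // wholeTextFiles: one RDD element per file, as (path, content).
      val files = sc.wholeTextFiles(dir.toUri.toString)
      files.collect().foreach { case (path, content) =>
        println(s"$path holds ${content.length} characters")
      }
    } finally {
      sc.stop() // release the local Spark resources
    }
  }
}
```

Passing a full file:// URI, as file.toUri does here, avoids depending on which filesystem the cluster is configured to use by default.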