- Mastering Apache Spark 2.x(Second Edition)
- Romeo Kienzler
- 2021-07-02 18:55:27
Processing the text files
Using SparkContext, you can load a text file into an RDD with the textFile method. The wholeTextFiles method reads the contents of a directory into an RDD, returning one (file path, content) pair per file. The following examples show how a file on the local filesystem (file://) or in HDFS (hdfs://) can be read into a Spark RDD. In each call, the second argument asks Spark to split the data into at least six partitions to increase parallelism. The first two examples are equivalent when the default filesystem is the local one, since both then load the file from the Linux filesystem, whereas the third loads a file that resides in HDFS:
sc.textFile("/data/spark/tweets.txt",6)
sc.textFile("file:///data/spark/tweets.txt",6)
sc.textFile("hdfs://server1:4014/data/spark/tweets.txt",6)
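To make the difference between textFile and wholeTextFiles concrete, the following is a minimal sketch rather than a definitive implementation. It assumes a local master (local[2]) and a throwaway temporary directory holding two small sample files. The object name TextFileSketch, the run helper, and the file names are hypothetical and are not part of the original text.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files

import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical sketch: object, method, and file names are illustrative only.
object TextFileSketch {

  // Returns (number of lines seen by textFile, number of files seen by wholeTextFiles).
  def run(): (Long, Long) = {
    // Assumed sample data: a temporary directory with two small text files.
    val dir = Files.createTempDirectory("tweets-sketch")
    Files.write(dir.resolve("a.txt"),
      "first tweet\nsecond tweet\n".getBytes(StandardCharsets.UTF_8))
    Files.write(dir.resolve("b.txt"),
      "third tweet\n".getBytes(StandardCharsets.UTF_8))

    // Local master so the sketch runs without a cluster (an assumption).
    val conf = new SparkConf().setAppName("TextFileSketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val path = dir.toUri.toString // a file:// URI for the directory

      // textFile: one RDD element per line; 6 is the minimum partition count.
      val lines = sc.textFile(path, 6)
      println(s"partitions: ${lines.getNumPartitions}")

      // wholeTextFiles: one (file path, file content) pair per file.
      val files = sc.wholeTextFiles(path)
      files.collect().foreach { case (file, content) =>
        println(s"$file holds ${content.length} characters")
      }

      (lines.count(), files.count())
    } finally {
      sc.stop() // always release the context, even if a read fails
    }
  }

  def main(args: Array[String]): Unit = {
    val (lineCount, fileCount) = run()
    println(s"lines: $lineCount, files: $fileCount")
  }
}
```

The partition count is printed rather than assumed, because the second argument to textFile is only a lower bound and Spark may create more partitions than requested.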