- Mastering Apache Spark 2.x (Second Edition)
- Romeo Kienzler
- 2021-07-02 18:55:27
Processing the text files
Using SparkContext, you can load a text file into an RDD with the textFile method. The wholeTextFiles method, by contrast, reads every file in a directory into an RDD of (file name, file content) pairs. The following examples show how a file on the local filesystem (file://) or on HDFS (hdfs://) can be read into a Spark RDD. In each case, the second argument asks Spark to split the data into at least six partitions, which allows more tasks to process the file in parallel. The first two examples are equivalent when the local filesystem is Spark's default filesystem, since both then load the same file from the Linux filesystem, whereas the last one reads a file that resides in HDFS:
sc.textFile("/data/spark/tweets.txt",6)
sc.textFile("file:///data/spark/tweets.txt",6)
sc.textFile("hdfs://server1:4014/data/spark/tweets.txt",6)
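The calls above can be wrapped in a few small helpers to make their behavior easier to inspect. The following is a minimal PySpark sketch, not code from the book: the helper names uri_scheme, count_lines, and file_sizes are hypothetical, and sc is assumed to be an existing SparkContext such as the one the Spark shell provides. The same method names (textFile, wholeTextFiles, getNumPartitions) also exist on the Scala API.

```python
from urllib.parse import urlparse


def uri_scheme(path):
    """Return the URI scheme of a path ('file', 'hdfs', ...) or None if it has none.

    A path without a scheme is resolved by Spark against the default
    filesystem, so it is only "local" when that default is the local one.
    """
    return urlparse(path).scheme or None


def count_lines(sc, path, min_partitions=6):
    """Load one text file and return (number of partitions, number of lines)."""
    # Hypothetical helper; sc is assumed to be a live SparkContext.
    rdd = sc.textFile(path, min_partitions)  # one RDD element per line
    return rdd.getNumPartitions(), rdd.count()


def file_sizes(sc, directory):
    """Read every file in a directory and return {file name: character count}."""
    # wholeTextFiles yields (file name, whole file content) pairs.
    pairs = sc.wholeTextFiles(directory)
    return dict(pairs.mapValues(len).collect())
```

In a PySpark shell, count_lines(sc, "file:///data/spark/tweets.txt") would return the actual partition count alongside the line count, which is a quick way to check how Spark honored the requested minimum. Because wholeTextFiles keeps each file as a single record, file_sizes is better suited to directories of many small files than to a few very large ones.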