- Mastering Apache Spark 2.x (Second Edition)
- Romeo Kienzler
Using SQL
After using the previous Scala example to create a data frame from a JSON input file on HDFS, we can now define a temporary table based on the data frame and run SQL against it.
The following example shows the temporary table called washing_flat being defined and a row count being computed using count(*):
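A minimal sketch of that step, assuming the SparkSession is available as `spark` and the JSON file lives at a hypothetical HDFS path (neither the variable name nor the path is given in the excerpt):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("washing-sql").getOrCreate()

// Recreate the data frame from the JSON file on HDFS (the path is an illustrative assumption).
val df = spark.read.json("hdfs:///data/washing.json")

// Register the data frame as a temporary view named washing_flat ...
df.createOrReplaceTempView("washing_flat")

// ... and run a plain SQL row count against it.
spark.sql("SELECT count(*) FROM washing_flat").show()
```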

The schema for this data was created on the fly (inferred). This is a very convenient feature of the Apache Spark DataSource API, which was used when reading the JSON file from HDFS via the SparkSession object. However, if you want to specify the schema yourself, you can do so.
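A hedged sketch of supplying a schema by hand instead of relying on inference is shown below; the field names and types are illustrative assumptions, since the excerpt does not list the actual columns of the washing data:

```scala
import org.apache.spark.sql.types._

// Hand-written schema; field names and types are assumptions for illustration only.
val washingSchema = StructType(Seq(
  StructField("ts", LongType, nullable = true),
  StructField("temperature", DoubleType, nullable = true),
  StructField("voltage", DoubleType, nullable = true)
))

// Passing the schema explicitly makes Spark skip inference when reading the JSON file.
val dfWithSchema = spark.read.schema(washingSchema).json("hdfs:///data/washing.json")
dfWithSchema.printSchema()
```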