- Mastering Apache Spark 2.x (Second Edition)
- Romeo Kienzler
- 2021-07-02 18:55:30
Applying SQL table joins
To examine table joins, we have created some additional test data based on a banking scenario. We have an account table stored in account.json and a client table stored in client.json. Let's take a look at the two JSON files.
First, let's look at client.json:

Next, let's look at account.json:

As you can see, clientId in account.json refers to id in client.json, so we can join the two files. Before we can do this, we have to load them:
val client = spark.read.json("client.json")
val account = spark.read.json("account.json")
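To make the loading step concrete, here is a minimal sketch that can be pasted into spark-shell. The file contents and the name field are hypothetical stand-ins, since the real client.json is not reproduced here, and the sketch assumes a local master so that the temporary file is visible to Spark. By default, spark.read.json expects one JSON object per line.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

// In spark-shell, getOrCreate() simply returns the session that already exists.
val spark = SparkSession.builder()
  .appName("LoadJsonSketch")
  .master("local[*]") // assumption: local mode
  .getOrCreate()

// Hypothetical contents: one JSON object per line. Only "id" is taken from the text.
val clientJson =
  """{"id": 1, "name": "Alice"}
    |{"id": 2, "name": "Bob"}""".stripMargin

// Write the sample to a temporary file so that it can be read like client.json.
val clientPath = Files.createTempFile("client", ".json")
Files.write(clientPath, clientJson.getBytes(StandardCharsets.UTF_8))

// Spark infers the schema from the JSON fields.
val client = spark.read.json(clientPath.toUri.toString)
client.printSchema()
```

Calling printSchema() right after loading is a quick way to confirm that the join key was inferred with the expected name and type.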
Then we register these two DataFrames as temporary views:
client.createOrReplaceTempView("client")
account.createOrReplaceTempView("account")
Let's query these individually, client first:

Then follow it up with account:

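The two individual queries might look like the following sketch, which can be pasted into spark-shell. The rows and the name and balance columns are hypothetical stand-ins for the contents of client.json and account.json; only id and clientId come from the text.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("QueryViewsSketch")
  .master("local[*]") // assumption: local mode
  .getOrCreate()
import spark.implicits._

// Hypothetical sample rows standing in for the two JSON files.
val client = Seq((1, "Alice"), (2, "Bob")).toDF("id", "name")
val account = Seq((10, 1, 100.0), (11, 1, 250.0), (12, 2, 75.0))
  .toDF("id", "clientId", "balance")

client.createOrReplaceTempView("client")
account.createOrReplaceTempView("account")

// Query each temporary view on its own.
val clientRows = spark.sql("SELECT * FROM client")
val accountRows = spark.sql("SELECT * FROM account")

clientRows.show()
accountRows.show()
```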
Now we can join the two tables:

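A join on clientId and id can be sketched as follows for spark-shell. The sample rows and the name and balance columns are assumptions made for illustration, and an inner join is one possible choice; the text does not say which join type was used.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("JoinSketch")
  .master("local[*]") // assumption: local mode
  .getOrCreate()
import spark.implicits._

// Hypothetical data: the client with id 3 has no account.
val client = Seq((1, "Alice"), (2, "Bob"), (3, "Carol")).toDF("id", "name")
val account = Seq((10, 1, 100.0), (11, 1, 250.0), (12, 2, 75.0))
  .toDF("id", "clientId", "balance")

client.createOrReplaceTempView("client")
account.createOrReplaceTempView("account")

// account.clientId refers to client.id; both tables have an id column,
// so the account id is aliased to keep the result columns unambiguous.
val joined = spark.sql(
  """SELECT c.id, c.name, a.id AS accountId, a.balance
    |FROM client c
    |INNER JOIN account a ON a.clientId = c.id""".stripMargin)

joined.show()
```

With an inner join, a client without any account does not appear in the result; a LEFT OUTER JOIN would keep such clients with null account columns.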
Finally, let's aggregate the amount of money that every client holds across all of their accounts:

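One way to express this aggregation is a join followed by GROUP BY, sketched below for spark-shell. The balance column, the name column, and the sample rows are hypothetical; the text only establishes that clientId refers to id.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("AggregationSketch")
  .master("local[*]") // assumption: local mode
  .getOrCreate()
import spark.implicits._

// Hypothetical sample rows standing in for the two JSON files.
val client = Seq((1, "Alice"), (2, "Bob")).toDF("id", "name")
val account = Seq((10, 1, 100.0), (11, 1, 250.0), (12, 2, 75.0))
  .toDF("id", "clientId", "balance")

client.createOrReplaceTempView("client")
account.createOrReplaceTempView("account")

// Sum the balances of all accounts that belong to each client.
val totals = spark.sql(
  """SELECT c.id, c.name, SUM(a.balance) AS totalBalance
    |FROM client c
    |INNER JOIN account a ON a.clientId = c.id
    |GROUP BY c.id, c.name
    |ORDER BY totalBalance DESC""".stripMargin)

totals.show()
```

Grouping by both c.id and c.name keeps the name available in the result while still producing one row per client.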