- Mastering Apache Spark 2.x (Second Edition)
- Romeo Kienzler
Managing temporary views with the catalog API
Since Apache Spark 2.0, the catalog API is used to create and remove temporary views from an internal metastore. This is necessary if you want to use SQL, because the catalog provides the mapping between a virtual table name and a DataFrame or Dataset.
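As a brief illustration, the following sketch (the view name people and the sample data are made up for this example) registers a DataFrame as a temporary view, queries it through SQL, and then drops it again via the catalog API:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("CatalogApiExample")
  .master("local[*]")
  .getOrCreate()

import spark.implicits._

// Create a small DataFrame and expose it under a virtual table name
val df = Seq((1, "alice"), (2, "bob")).toDF("id", "name")
df.createOrReplaceTempView("people")

// SQL now resolves the name "people" to the DataFrame through the session catalog
spark.sql("SELECT name FROM people WHERE id = 1").show()

// Remove the temporary view again
spark.catalog.dropTempView("people")
```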
Internally, Apache Spark uses the org.apache.spark.sql.catalyst.catalog.SessionCatalog class to manage temporary views as well as persistent tables.
Temporary views are stored in the SparkSession object, whereas persistent tables are stored in an external metastore. The abstract base class org.apache.spark.sql.catalyst.catalog.ExternalCatalog is extended for the various metastore providers. One implementation already exists for Apache Derby and another for the Apache Hive metastore, but anyone could extend this class to make Apache Spark use yet another metastore.
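To see the difference, the catalog API can also be asked what the session currently knows about. This is a small sketch that assumes the people view from the previous snippet is still registered:

```scala
// Temporary views appear with isTemporary = true and no database,
// while persistent tables are backed by the configured external metastore
spark.catalog.listTables().show()

// Databases known to the external catalog (for example, the default database)
spark.catalog.listDatabases().show()
```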