- Mastering Apache Spark 2.x (Second Edition)
- Romeo Kienzler
Managing temporary views with the catalog API
Since Apache Spark 2.0, the catalog API has been used to create and remove temporary views in an internal metastore. This is necessary if you want to use SQL, because the catalog provides the mapping between a virtual table name and a DataFrame or Dataset.
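The following is a minimal Scala sketch of that mapping: a DataFrame is registered under a virtual table name, queried through SQL, and then removed again through the catalog API. The view name clients and the sample data are assumptions made for illustration only.

```scala
import org.apache.spark.sql.SparkSession

object TempViewExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("catalog-api-temp-views")
      .master("local[*]")        // local mode, just for this sketch
      .getOrCreate()
    import spark.implicits._

    // A small DataFrame; "clients" is a hypothetical view name.
    val df = Seq((1, "Alice"), (2, "Bob")).toDF("id", "name")

    // Register the DataFrame under a virtual table name in the session catalog.
    df.createOrReplaceTempView("clients")

    // The SQL engine resolves "clients" through the catalog back to the DataFrame.
    spark.sql("SELECT name FROM clients WHERE id = 1").show()

    // Remove the mapping again via the catalog API; after this call the name
    // can no longer be resolved in SQL statements.
    spark.catalog.dropTempView("clients")

    spark.stop()
  }
}
```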
Internally, Apache Spark uses the org.apache.spark.sql.catalyst.catalog.SessionCatalog class to manage temporary views as well as persistent tables.
Temporary views are stored in the SparkSession object, whereas persistent tables are stored in an external metastore. The abstract base class org.apache.spark.sql.catalyst.catalog.ExternalCatalog is extended for the various metastore providers. One implementation already exists for Apache Derby and another for the Apache Hive metastore, but anyone could extend this class to make Apache Spark use yet another metastore.
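The catalog can also be inspected to see where a given relation lives. A brief sketch, reusing the SparkSession and the hypothetical clients view from the previous example; the listTables output columns (name, database, description, tableType, isTemporary) are part of the public Catalog interface.

```scala
// Assumes the SparkSession `spark` and the temporary view "clients" from the
// previous sketch are still in scope.
spark.catalog.listTables().show(truncate = false)
// The temporary view is reported with isTemporary = true and no database,
// while persistent tables would report their database and isTemporary = false.

// Persistent tables are resolved through the ExternalCatalog implementation.
// With Hive support enabled (SparkSession.builder().enableHiveSupport()),
// the Hive metastore backs that catalog.
```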