- Mastering Apache Spark 2.x(Second Edition)
- Romeo Kienzler
- 2021-07-02 18:55:28
Implicit schema discovery
One important aspect of the DataSource API is implicit schema discovery, which is possible for a subset of data sources. This means that loading the data not only makes the individual columns available in a DataFrame or Dataset, but also discovers their names and types.
Take a JSON file, for example. Column names are already explicitly present in the file. Because JSON objects have a dynamic schema by default, the complete JSON file is read to discover all possible column names. The column types are inferred during the same parsing pass.
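To illustrate why the whole file has to be read, here is a minimal plain-Python sketch of the idea. It is not Spark's implementation: the sample records, the `type_name` and `infer_schema` helpers, and the type-widening rules are all assumptions made for illustration. In Spark itself, the equivalent step happens when the file is loaded through the JSON data source (for example with `spark.read.json`).

```python
import json

# Hypothetical sample data in JSON Lines form; the records deliberately
# do not all carry the same fields.
SAMPLE = """\
{"id": 1, "name": "Ann"}
{"id": 2, "name": "Bob", "score": 9.5}
{"id": 3, "active": true}
"""

def type_name(value):
    """Map a parsed JSON value to an illustrative column type name."""
    # bool must be tested before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    return "unknown"  # nulls and nested values are not handled in this sketch

def infer_schema(text):
    """Scan every record and collect each column name with an inferred type."""
    schema = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        for column, value in json.loads(line).items():
            seen = schema.get(column)
            found = type_name(value)
            if seen is None or seen == found:
                schema[column] = found
            elif {seen, found} == {"long", "double"}:
                schema[column] = "double"  # widen mixed numeric columns
            else:
                schema[column] = "string"  # assumed fallback for conflicts
    return schema

schema = infer_schema(SAMPLE)
print(schema)
```

The `score` and `active` columns appear only in later records, so a reader that stopped after the first record would miss them. This is the reason a full pass over the file is needed.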
Another example is the Java Database Connectivity (JDBC) data source, where the schema doesn't even need to be inferred but is read directly from the source database.
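The same point can be sketched outside Spark. The following example uses Python's built-in `sqlite3` module as a stand-in for a JDBC source; the `people` table and the `read_schema` helper are hypothetical. It only shows that a database already stores column names and types as metadata, so nothing has to be inferred from the rows.

```python
import sqlite3

# An in-memory SQLite database stands in for the source database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, score REAL)")

def read_schema(connection, table):
    """Return (column name, declared type) pairs from the table metadata."""
    # PRAGMA table_info yields one row per column:
    # (cid, name, type, notnull, dflt_value, pk)
    rows = connection.execute(f"PRAGMA table_info({table})").fetchall()
    return [(row[1], row[2]) for row in rows]

schema = read_schema(conn, "people")
print(schema)
```

The table contains no rows at all, yet its schema is fully known. That is the contrast with the JSON case, where the data itself has to be scanned.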