- Mastering Apache Spark 2.x(Second Edition)
- Romeo Kienzler
- 2021-07-02 18:55:28
Implicit schema discovery
One important aspect of the DataSource API is implicit schema discovery, which is available for a subset of data sources. This means that loading the data makes not only the individual columns available in a DataFrame or Dataset, but also their names and types.
Take a JSON file, for example. Column names are explicitly present in the file. Because JSON objects have a dynamic schema, by default the complete JSON file is read to discover all possible column names. The column types are inferred during the same parsing pass.
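To make the idea of a full scan concrete, here is a minimal sketch in plain Python. It is not Spark's implementation; the helper names `infer_type`, `merge_types`, and `infer_schema`, the type labels, and the rule of falling back to `string` on conflicting types are all assumptions chosen for illustration. It shows why every record has to be visited: a column may first appear, or change type, in the last record.

```python
import json


def infer_type(value):
    """Map a parsed JSON value to a simple type label (labels are illustrative)."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "struct"


def merge_types(a, b):
    """Combine two types observed for the same column."""
    if a == b:
        return a
    if a == "null":
        return b
    if b == "null":
        return a
    if {a, b} == {"long", "double"}:
        return "double"  # widen integers to floating point
    return "string"  # assumed fallback for conflicting types


def infer_schema(lines):
    """Scan every JSON record (one per line) and collect column names and types."""
    schema = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        for name, value in record.items():
            observed = infer_type(value)
            if name in schema:
                schema[name] = merge_types(schema[name], observed)
            else:
                schema[name] = observed
    return schema


# Hypothetical sample data: columns appear and change type across records.
records = [
    '{"id": 1, "name": "alice"}',
    '{"id": 2, "score": 3.5}',
    '{"id": 2.5, "name": null, "active": true}',
]
schema = infer_schema(records)
print(schema)
```

Stopping after the first record would miss the `score` and `active` columns and would type `id` as an integer, which is the reason the complete file is read by default.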
Another example is the Java Database Connectivity (JDBC) data source, where the schema doesn't need to be inferred at all but is read directly from the source database.
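The same contrast can be sketched with Python's built-in `sqlite3` module standing in for a JDBC source. This is an analogy under stated assumptions, not the JDBC data source itself: the table `people`, its columns, and the helper `read_schema` are hypothetical. The point is that the database already stores column names and declared types as metadata, so no data rows need to be scanned.

```python
import sqlite3

# In-memory database with a hypothetical table; note that no rows are inserted.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER, name TEXT, score REAL)")


def read_schema(connection, table):
    """Read column names and declared types from the database catalog.

    The table name is interpolated directly, so pass trusted input only.
    """
    rows = connection.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return [(row[1], row[2]) for row in rows]


schema = read_schema(conn, "people")
print(schema)
```

The schema is available even though the table is empty, which is exactly what inference from data could not provide.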