- Mastering Apache Spark 2.x (Second Edition)
- Romeo Kienzler
- 2021-07-02 18:55:33
Physical Execution Plan generation and selection
The resolved and optimized LEP (Logical Execution Plan) is used to generate a large set of PEP (Physical Execution Plan) candidates. PEPs are execution plans that have been completely resolved, meaning that a PEP contains detailed instructions for producing the desired result. They are generated by so-called strategies, which, for example, select join algorithms based on statistics. In addition, rules are executed, for example, to pipeline multiple operations on an RDD into a single, more complex operation. Once a set of PEPs has been generated, all of which return exactly the same result, the best one is chosen based on heuristics in order to minimize execution time.
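The generate-then-select idea can be illustrated with a small toy model in plain Python. This is only a sketch and not Spark's planner: the names `TableStats`, `PhysicalPlan`, `STRATEGIES`, `choose_plan`, the row threshold, and the cost formulas are all hypothetical and chosen purely for illustration. Each strategy proposes zero or more candidate plans for a join from table statistics, and the candidate with the lowest estimated cost wins.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TableStats:
    """Hypothetical statistics about one input of a join."""
    name: str
    rows: int


@dataclass(frozen=True)
class PhysicalPlan:
    """Hypothetical candidate plan: a join algorithm plus a rough cost."""
    algorithm: str
    estimated_cost: float


# Assumed limit for this sketch: only inputs up to this size may be broadcast.
BROADCAST_LIMIT_ROWS = 10_000


def broadcast_hash_join_strategy(left, right):
    small, large = sorted((left, right), key=lambda t: t.rows)
    if small.rows > BROADCAST_LIMIT_ROWS:
        return []  # strategy does not apply, so it proposes no candidate
    # Build a hash table on the small side, then stream the large side once.
    return [PhysicalPlan("broadcast_hash_join", small.rows + large.rows)]


def sort_merge_join_strategy(left, right):
    # Sort both sides (n log n each), then merge them in one pass.
    sort_cost = sum(t.rows * math.log2(max(t.rows, 2)) for t in (left, right))
    return [PhysicalPlan("sort_merge_join", sort_cost + left.rows + right.rows)]


def nested_loop_join_strategy(left, right):
    # Compare every row of one side with every row of the other.
    return [PhysicalPlan("nested_loop_join", float(left.rows) * right.rows)]


STRATEGIES = [
    broadcast_hash_join_strategy,
    sort_merge_join_strategy,
    nested_loop_join_strategy,
]


def candidate_plans(left, right):
    """Collect every candidate proposed by every strategy."""
    return [plan for strategy in STRATEGIES for plan in strategy(left, right)]


def choose_plan(left, right):
    """Pick the candidate with the lowest estimated cost."""
    return min(candidate_plans(left, right), key=lambda p: p.estimated_cost)


orders = TableStats("orders", 1_000_000)
countries = TableStats("countries", 200)
events = TableStats("events", 5_000_000)

print(choose_plan(orders, countries).algorithm)  # broadcast_hash_join
print(choose_plan(orders, events).algorithm)     # sort_merge_join
```

All candidates describe the same join and would return the same rows; only the estimated cost differs, which is why the selection step can pick freely among them.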
If the data source supports it, operations are pushed down to the source, namely filtering (predicate push-down) and selection of attributes (projection push-down). This concept is explained in detail in Chapter 2, Apache Spark SQL, in the section called Predicate push-down on smart data sources.
The main idea of predicate push-down is that parts of the AST are executed not by Apache Spark but by the data source itself. For example, filtering rows by column value can be done much more efficiently by a relational or NoSQL database, since it sits closer to the data and can therefore avoid data transfers between the database and Apache Spark. The removal of unnecessary columns is likewise a job done more effectively by the database.
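The effect of push-down can be sketched without Spark by using an in-memory SQLite database as a stand-in for the "smart" data source. This is an analogy under stated assumptions, not Spark's data source API: the `users` table, its columns, and the two helper functions are hypothetical. The first function pulls every row and column across the boundary and filters in the client; the second lets the database apply the predicate and the projection itself.

```python
import sqlite3


def make_source():
    """Create a hypothetical 'users' table with 1000 rows, 100 of them in DE."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER, name TEXT, country TEXT, bio TEXT)"
    )
    conn.executemany(
        "INSERT INTO users VALUES (?, ?, ?, ?)",
        [
            (i, f"user{i}", "DE" if i % 10 == 0 else "US", "x" * 100)
            for i in range(1000)
        ],
    )
    return conn


def without_push_down(conn):
    """Fetch everything, then filter rows and drop columns on the client."""
    rows = conn.execute("SELECT * FROM users").fetchall()
    result = [(r[0], r[1]) for r in rows if r[2] == "DE"]
    return result, len(rows)  # second value: rows transferred


def with_push_down(conn):
    """Let the source apply the predicate (WHERE) and projection (column list)."""
    rows = conn.execute(
        "SELECT id, name FROM users WHERE country = ?", ("DE",)
    ).fetchall()
    return rows, len(rows)  # second value: rows transferred


conn = make_source()
client_side, transferred_all = without_push_down(conn)
source_side, transferred_pushed = with_push_down(conn)

print(sorted(client_side) == sorted(source_side))  # True: same result
print(transferred_all, transferred_pushed)         # 1000 100
```

Both variants return the same rows, but the pushed-down query moves a tenth of the rows and only two of the four columns, which leaves out the wide `bio` column entirely.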