- Mastering PostgreSQL 10
- Hans Jürgen Sch?nig
- 354字
- 2021-06-30 19:03:48
Making use of CREATE STATISTICS
CREATE STATISTICS is definitely one of my personal favorite features of PostgreSQL 10.0 because it allows consultants to help customers in many real-world situations. So, what is it all about? When you run SQL, the optimizer has to come up with clever decisions to speed up your queries. However, to do so, it has to rely heavily on estimates to figure out how much data a certain clause or a certain operation returns. Before version 10.0, PostgreSQL only had information about individual columns. Let's look at an example:
SELECT * FROM car WHERE vendor = 'Ford' AND model = 'Mini Clubman';
In version 9.6, PostgreSQL checks which fraction of the table matches Ford and which fraction matches Mini Clubman. Then, it would try to guess how many rows match both criteria. Remember, PostgreSQL 9.6 only has information about each column—it does not know that these columns are actually related. Therefore, it will simply multiply the odds of finding Ford with the odds of finding Mini Clubman and use this number. However, Ford does not produce a Mini Clubman instance—only BMW does. Therefore, the estimate is wrong. The same cross column correlation problem can happen in other cases too. The number of rows returned by a join might not be clear and the number of groups returned by a GROUP BY clause might be an issue.
Consider the following example:
SELECT gender, age, count(*) FROM children_born GROUP BY gender, age
The number of children born to people of a certain age will definitely depend on their age. The likelihood that some 30 year old women will have children is pretty high and therefore there will be a count. However, if you happen to be 98, you might not be so lucky and it is pretty unrealistic to have a baby, especially if you are a man (men tend to not give birth to children).
CREATE STATISTICS will give the optimizer a chance to gain deeper insights into what is going on by storing multivariate statistics. The idea is to help the optimizer handle functional dependencies.
- 大數(shù)據(jù)導(dǎo)論:思維、技術(shù)與應(yīng)用
- Seven NoSQL Databases in a Week
- 軟件架構(gòu)設(shè)計
- R Machine Learning By Example
- Mastering Machine Learning Algorithms
- 大數(shù)據(jù)平臺異常檢測分析系統(tǒng)的若干關(guān)鍵技術(shù)研究
- 悟透AutoCAD 2009完全自學(xué)手冊
- 工業(yè)機(jī)器人應(yīng)用案例集錦
- Excel 2007常見技法與行業(yè)應(yīng)用實(shí)例精講
- INSTANT Heat Maps in R:How-to
- 智能鼠原理與制作(進(jìn)階篇)
- 筆記本電腦電路分析與故障診斷
- Microsoft Dynamics CRM 2013 Marketing Automation
- Hands-On Microservices with C#
- 智能座艙之車載機(jī)器人交互設(shè)計與開發(fā)