- Mastering PostgreSQL 9.6
- Hans Jurgen Schonig
- 465字
- 2021-07-09 19:57:22
Adding additional indexes
Since PostgreSQL 9.6, there has been an easy way to deploy entirely new index types as extensions. This is pretty cool because if those index types provided by PostgreSQL are not enough, it is possible to add additional ones serving precisely your purpose. The instruction to do that is CREATE ACCESS METHOD:
test=# \h CREATE ACCESS METHOD
Command: CREATE ACCESS METHOD
Description: define a new access method
Syntax:
CREATE ACCESS METHOD name
TYPE access_method_type
HANDLER handler_function
Don't worry too much about this command—just in case you ever deploy your own index type, it will come as a ready to use extension.
One of those extensions implement bloom filters. Bloom filters are probabilistic data structures. They sometimes return too many rows but never too few. Therefore, a bloom filter is a good method to pre-filter data.
How does it work? A bloom filter is defined on a couple of columns. A bitmask is calculated based on the input values, which is then compared to your query. The upside of a bloom filter is that you can index as many columns as you want. The downside is that the entire bloom filter has to be read. Of course, the bloom filter is smaller than the underlying data and so it is, in many cases, very beneficial.
To use bloom filters, just activate the extension, which is part of the PostgreSQL contrib package:
test=# CREATE EXTENSION bloom;
CREATE EXTENSION
As stated previously, the idea behind a bloom filter is that it allows you to index as many columns as you want. In many real-world applications, the challenge is to index many columns without knowing which combinations the user will actually need at runtime. In the case of a large table, it is totally impossible to create standard B-tree indexes on, say, 80 fields or more. A bloom filter might be an alternative in this case:
test=# CREATE TABLE t_bloom (x1 int, x2 int, x3 int, x4 int, x5 int, x6 int, x7 int);
CREATE TABLE
Creating the index is easy:
test=# CREATE INDEX idx_bloom ON t_bloom (x1, x2, x3, x4, x5, x6, x7);
CREATE INDEX
If sequential scans are turned off, the index can be seen in action:
test=# explain SELECT * FROM t_bloom WHERE x5 = 9 AND x3 = 7; QUERY PLAN
-------------------------------------------------------------------------
Bitmap Heap Scan on t_bloom (cost=18.50..22.52 rows=1 width=28)
Recheck Cond: ((x3 = 7) AND (x5 = 9))
-> Bitmap Index Scan on idx_bloom (cost=0.00..18.50 rows=1 width=0)
Index Cond: ((x3 = 7) AND (x5 = 9))
Note that I have queried a combination of random columns; they are not related to the actual order in the index. The bloom filter will still be beneficial.
If you are interested in bloom filters, consider checking out the following website: https://en.wikipedia.org/wiki/Bloom_filter.
- 虛擬儀器設(shè)計(jì)測(cè)控應(yīng)用典型實(shí)例
- Visual C# 2008開發(fā)技術(shù)詳解
- Java Web整合開發(fā)全程指南
- 單片機(jī)C語言應(yīng)用100例
- 教育機(jī)器人的風(fēng)口:全球發(fā)展現(xiàn)狀及趨勢(shì)
- Mastering pfSense
- 玩機(jī)器人 學(xué)單片機(jī)
- Cloudera Hadoop大數(shù)據(jù)平臺(tái)實(shí)戰(zhàn)指南
- FANUC工業(yè)機(jī)器人虛擬仿真教程
- DynamoDB Applied Design Patterns
- JSP通用范例開發(fā)金典
- Office 2010輕松入門
- 從實(shí)踐中學(xué)嵌入式Linux操作系統(tǒng)
- 仿龜機(jī)器人的設(shè)計(jì)與制作
- Practical Autodesk AutoCAD 2021 and AutoCAD LT 2021