官术网_书友最值得收藏!

  • Mastering PostgreSQL 9.6
  • Hans Jurgen Schonig
  • 613字
  • 2021-07-09 19:57:19

Introducing operator classes

So far, the goal was to figure out what to index and to blindly apply an index on this column or on a group of columns. There is one assumption, however, that we have silently accepted to make this work. Up to now, we have been working on the assumption that the order in which data has to be sorted is a somewhat fixed constant. In reality, this assumption might not hold true. Sure, numbers will always be in the same order, but other kinds of data will most likely not have a predefined, fixed sort order.

To prove my point, I have compiled a real-world example. Take a look at the following two records:

1118 09 08 78
2345 01 05 77

My question now is: are those two rows ordered properly? They might because one comes before another. However, this is wrong because those two rows do have some hidden semantics. What you see here is two Austrian social security numbers. 09 08 78 actually means August 9, 1978 and 01 05 77 actually means May 1, 1977. The first four numbers consist of a checksum and some sort of auto-incremented three digit number. So in reality, 1977 comes before 1978 and we might consider swapping those two lines to achieve the desired sort order.

The problem is that PostgreSQL has no idea what those two rows actually mean. If a column is marked as text, PostgreSQL will apply the standard rules to sort text. If the column is marked as a number, PostgreSQL will apply the standard rules to sort numbers. Under no circumstances will it ever use something as odd as I've described. If you think that the facts I outlined previously are the only things to consider when processing those numbers, you are wrong. How many months does a year have? 12? Far from true. In the Austrian social security system, these numbers can hold up to 14 months? Why? Remember... three digits are simply an auto-increment value. The trouble is this: if an immigrant or a refugee has no valid paperwork and if his birthday is not known, he will be assigned an artificial birthday in the 13th month. During the Balkan wars in 1990, Austria offered asylum to over 115,000 refugees. Naturally, this three digit number was not enough and a 14th month was added. Now, which standard data type can handle this kind of COBOL-leftover from the early 1970 (that was when the layout of the social security number was introduced)? The answer is: none.

To handle special-purpose fields in a sane way, PostgreSQL offers operator classes:

test=# \h CREATE OPERATOR CLASS 
Command: CREATE OPERATOR CLASS
Description: define a new operator class
Syntax:
CREATE OPERATOR CLASS name [ DEFAULT ] FOR TYPE data_type
USING index_method [ FAMILY family_name ] AS
{ OPERATOR strategy_number operator_name [ ( op_type, op_type ) ] [ FOR SEARCH | FOR ORDER BY sort_family_name ]
| FUNCTION support_number [ ( op_type [ , op_type ] ) ] function_name ( argument_type [, ...] )
| STORAGE storage_type
} [, ... ]

An operator class will tell an index how to behave. Let's take a look at a standard binary tree. It can perform five operations:

The standard operator classes support the standard datatypes and standard operators we have been using throughout this book. If you want to handle social security numbers, it is necessary to come up with your own operators capable of providing you with the logic you need. Those custom operators can then be used to form an operator class, which is nothing more than a strategy passed to the index to configure how it should behave.

主站蜘蛛池模板: 松滋市| 翁牛特旗| 汪清县| 马鞍山市| 民丰县| 建平县| 黄冈市| 堆龙德庆县| 闽清县| 应城市| 墨脱县| 墨江| 尤溪县| 景洪市| 沛县| 兴和县| 卓资县| 民乐县| 延安市| 甘孜县| 瑞昌市| 邹城市| 莒南县| 乌兰县| 崇信县| 襄樊市| 青州市| 广水市| 明水县| 弥渡县| 鄂州市| 鄂尔多斯市| 绥棱县| 巴塘县| 诸暨市| 方正县| 宁强县| 泗水县| 龙陵县| 赣州市| 丹东市|