- Mastering PostgreSQL 10
- Hans Jürgen Sch?nig
- 285字
- 2021-06-30 19:03:54
Clustering tables
In PostgreSQL, there is a command called CLUSTER that allows us to rewrite a table in the desired order. It is possible to point to an index and store data in the same order as the index:
test=# \h CLUSTER Command: CLUSTER Description: cluster a table according to an index Syntax: CLUSTER [VERBOSE] table_name [ USING index_name ]
CLUSTER [VERBOSE]
The CLUSTER command has been around for many years and serves its purpose well. But, there are some things to consider before blindly running it on a production system:
- The CLUSTER command will lock the table while it is running. You cannot insert or modify data while CLUSTER is running. This might not be acceptable on a production system.
- Data can only be organized according to one index. You cannot order a table by postal code, name, ID, birthday, and so on, at the same time. It means that CLUSTER will make sense if there is a search criteria, which is used most of the time.
- Keep in mind that the example outlined in this book is more of a worst-case scenario. In reality, the performance difference between a clustered and a non- clustered table will depend on the workload, the amount of data retrieved, cache hit rates, and a lot more.
- The clustered state of a table will not be maintained as changes are made to a table during normal operations. Correlation will usually deteriorate as time goes by.
Here is an example of how to run the CLUSTER command:
test=# CLUSTER t_random USING idx_random; CLUSTER
Depending on the size of the table, the time needed to cluster will vary.