mg不朽情缘5滴血图片

書名： IBM DB2 9.7 Advanced Administration Cookbook
作者名： Adrian Neagu Robert Pelletier
本章字數： 596字
更新時間： 2021-08-20 15:33:23

The table distribution key and its role in a multipartitioned environment

The table distribution key is another element that can influence the distribution of table rows across existent partitions defined in a database partition group. Distribution keys can be created explicitly using DISTRIBUTION BY HASH (columns) directive inside table definition or they can be defined implicitly by DB2 using a series of column rules detailed in this recipe. Usually distribution column might be designed and used to improve query performance.

Getting ready

This recipe is strongly correlated with the preceding one; here, we will also use before and after pictures of how the data is distributed across database partitions.

How to do it...

For some of the queries, we will use Command Editor, because it offers better visibility.

Using the command line

To find the distribution key columns for the COMM table, issue the following command:
```
[db2instp@nodedb21 ~]$ db2 "select name from sysibm.syscolumns where tbname='COMM' and partkeyseq !=0" NAME CKEY 1 record(s) selected. [db2instp@nodedb21 ~]$ 
```
Since we did not specify any distribution key at table creation, the primary key is used as the distribution key.
To find the distribution of rows on partitions, by row numbers, before performing redistribution of table COMM, issue the following command (for better visibility run this statement in Command Center | Tools | Command Editor):
```
select dbpartitionnum(CKEY) as PARTITION_NUMBER, count(CKEY) as NUMBER_OF_ROWS from comm group by dbpartitionnum(CKEY) order by dbpartitionnum(CKEY) 
```
After redistribution, issue the last statement again. Now we can see that rows are distributed slightly uniformly across partitions:

How it works...

Table rows partiticpating in distributed keys are assigned to partitions using a hashing algorithm that calculates their distribution across partitions. If you do not specify a distribution key at table creation, it will be created implicitly, as follows:

If the table has a primary key, it will be used as a distribution key
If the table does not have a primary key, the first column, whose data type is not LOB, LONG VARCHAR, LONG VARGRAPHIC, or XML, will be used as a partition key
If the table columns are not in this category, it means that a partition key cannot be defined, and the table can only be created in a table space that has been created on a single-partition database partition group

Distribution keys should be created to improve query performance in the following conditions:

Columns are frequently used in joins
Columns are frequently used in group by clause
Columns defined as primary keys or unique must be included in distribution keys
Create distribution keys for column used mostly in equality conditions

Practically the key columns are hashed and divided in 4095 or 32768 buckets depending on the distribution map used. Every map entry indicates where a specific row on which partition might be found.

There's more...

You cannot change the distribution key by altering the table, unless the table is spread across only one partition. To change the partition key for tables spanned on multiple partitions you should follow these steps:

Export or copy table data in a staging table.
Drop and recreate the table, with the new distribution key defined.
Import or insert table data.

Table collocation

Distribution keys play a major role also in a method of table distribution called table collocation. Table collocation is another performance tuning technique found in multipartitioned database environments. A table is considered collocated when the following conditions are met:

Tables are placed in the same table space and are in the same partition groups.
The distribution keys have the same columns and same data types and are partition compatible.

官术网_书友最值得收藏!

IBM DB2 9.7 Advanced Administration Cookbook