官术网_书友最值得收藏!

The table distribution key and its role in a multipartitioned environment

The table distribution key is another element that can influence the distribution of table rows across existent partitions defined in a database partition group. Distribution keys can be created explicitly using DISTRIBUTION BY HASH (columns) directive inside table definition or they can be defined implicitly by DB2 using a series of column rules detailed in this recipe. Usually distribution column might be designed and used to improve query performance.

Getting ready

This recipe is strongly correlated with the preceding one; here, we will also use before and after pictures of how the data is distributed across database partitions.

How to do it...

For some of the queries, we will use Command Editor, because it offers better visibility.

Using the command line

  1. To find the distribution key columns for the COMM table, issue the following command:
    [db2instp@nodedb21 ~]$ db2 "select name from sysibm.syscolumns where tbname='COMM' and partkeyseq !=0" NAME CKEY 1 record(s) selected. [db2instp@nodedb21 ~]$ 

    Since we did not specify any distribution key at table creation, the primary key is used as the distribution key.

  2. To find the distribution of rows on partitions, by row numbers, before performing redistribution of table COMM, issue the following command (for better visibility run this statement in Command Center | Tools | Command Editor):
    select dbpartitionnum(CKEY) as PARTITION_NUMBER, count(CKEY) as NUMBER_OF_ROWS from comm group by dbpartitionnum(CKEY) order by dbpartitionnum(CKEY) 
    Using the command line
  3. After redistribution, issue the last statement again. Now we can see that rows are distributed slightly uniformly across partitions:
Using the command line

How it works...

Table rows partiticpating in distributed keys are assigned to partitions using a hashing algorithm that calculates their distribution across partitions. If you do not specify a distribution key at table creation, it will be created implicitly, as follows:

  • If the table has a primary key, it will be used as a distribution key
  • If the table does not have a primary key, the first column, whose data type is not LOB, LONG VARCHAR, LONG VARGRAPHIC, or XML, will be used as a partition key
  • If the table columns are not in this category, it means that a partition key cannot be defined, and the table can only be created in a table space that has been created on a single-partition database partition group

Distribution keys should be created to improve query performance in the following conditions:

  • Columns are frequently used in joins
  • Columns are frequently used in group by clause
  • Columns defined as primary keys or unique must be included in distribution keys
  • Create distribution keys for column used mostly in equality conditions

Practically the key columns are hashed and divided in 4095 or 32768 buckets depending on the distribution map used. Every map entry indicates where a specific row on which partition might be found.

There's more...

You cannot change the distribution key by altering the table, unless the table is spread across only one partition. To change the partition key for tables spanned on multiple partitions you should follow these steps:

  1. Export or copy table data in a staging table.
  2. Drop and recreate the table, with the new distribution key defined.
  3. Import or insert table data.

Table collocation

Distribution keys play a major role also in a method of table distribution called table collocation. Table collocation is another performance tuning technique found in multipartitioned database environments. A table is considered collocated when the following conditions are met:

  • Tables are placed in the same table space and are in the same partition groups.
  • The distribution keys have the same columns and same data types and are partition compatible.
主站蜘蛛池模板: 龙山县| 台南县| 汕头市| 政和县| 马尔康县| 红河县| 晋宁县| 河北省| 荆州市| 应城市| 嘉峪关市| 高阳县| 贺州市| 海淀区| 镇巴县| 出国| 崇明县| 筠连县| 陇南市| 鄯善县| 澄城县| 双峰县| 迁西县| 资源县| 永春县| 沾益县| 翁牛特旗| 化德县| 邹城市| 涞水县| 晋宁县| 呼图壁县| 古田县| 开江县| 平昌县| 永昌县| 奎屯市| 沙河市| 苏州市| 永新县| 将乐县|