官术网_书友最值得收藏!

Reducing contention with concurrent indexes

When administering a PostgreSQL installation, we will eventually need to create new tables and indexes. In the case of new indexes, the table is locked in shared exclusive access mode for the duration of the creation process, blocking any insert, update, or delete activity. This both prevents inconsistencies, and allows the database to modify the table structure to reflect the new index.

Unfortunately, this process is fundamentally incompatible with maintaining a highly available server. While building the index, PostgreSQL needs to examine every valid table row, which means loading it from the disk into memory. For large or active tables, this can cause excessive strain on the system. Other database activities will reduce available disk bandwidth, and the required lock will block all modifications of data in that table. Combined, this can lead to a table being locked for a very long time.

Beginning with PostgreSQL 8.2, indexes can be created concurrently with other activities. This means PostgreSQL constructs the index in the background and only requests an exclusive lock that is long enough to attach it to the table. Early after its introduction, some DBAs felt reluctant to use it and have not changed their evaluation of its safety as it matured.

This may seem trivial as the feature has been around for a very long time, but not enough new administrators know about this functionality. Using it properly and knowing the caveats can avert several DBA headaches.

Getting ready

We just need to find an index to create. For the purposes of this discussion, we may also want to create a small pgbench database for demonstration purposes. Execute the following commands as the postgres user to build a sufficient sample:

createdb pgbench
pgbench -i -s 200 pgbench

How to do it...

Follow these steps to test concurrent index creation:

  1. Connect to the pgbench database and execute the following command as a superuser or the postgres user:
    CREATE INDEX CONCURRENTLY idx_account_bid
    
        ON pgbench_accounts (bid);
  2. In another connection, attempt to execute the following insert before the preceding command completes:
    INSERT INTO pgbench_accounts
    
    VALUES (50000000, 100, 15000, 'testing');

How it works...

By adding the CONCURRENTLY modifier, PostgreSQL will begin the process of building an index. While it does this, it also tracks the incoming insert, update, and delete activities to include them in the new index.

In the connection where we invoked the CREATE INDEX statement, we will not see a prompt again until PostgreSQL finishes building the index. So, how can we tell it apart from any regular index creation? One of the reasons we built an example was to prove that concurrency is present. The INSERT statement in the second connection should succeed before the index is complete. The process is the same for a production PostgreSQL instance. Any incoming writes to a table undergoing a concurrent index creation will complete normally until the final lock is necessary.

There's more...

While concurrent indexes are very useful, they have some very important elements we need to consider.

No transactions

As of PostgreSQL 9.3, concurrent index creation cannot take place inside a transaction. Why not? Remember that the process needs to look inside all the incoming transactions that could modify the table being indexed. PostgreSQL normally never allows what most experienced DBAs know as dirty reads of uncommitted data. As a consequence, concurrent indexes must be built outside of a transaction by internal database mechanisms.

One at a time

As concurrent index creation is not transaction safe, PostgreSQL will only build one at a time. Some enterprising DBAs have circumvented this limitation by building a queue system to send concurrent index-creation requests until the queue is empty. More advanced PostgreSQL installations may want to consider a similar system to utilize concurrent indexes extensively.

Danger with OLTP use

Concurrent indexes are not a panacea; they still follow rules for lock acquisition. Specifically, PostgreSQL cannot acquire a lock to attach the index so long as any earlier transactions are still running. While it waits for the lock, any new transactions that need to modify the table contents will also wait. This feedback loop of waits can quickly consume all available client connections on a busy OLTP system.

It's best to avoid this situation by following the normal index-creation protocol on OLTP systems: only create indexes when the volume is low. We can also massively reduce the risk by avoiding long-running transactions that could potentially block the final lock request. OLTP systems should have few of these in any case.

See also

PostgreSQL has an excellent manual page discussing indexes and concurrency. Please refer to this page for more information:

主站蜘蛛池模板: 左贡县| 宜都市| 彭阳县| 长葛市| 曲阜市| 定远县| 伽师县| 巴里| 许昌县| 桦川县| 镇沅| 米易县| 双鸭山市| 桐乡市| 凉山| 钦州市| 琼中| 海丰县| 通州区| 东明县| 贺兰县| 绥德县| 穆棱市| 永兴县| 营山县| 吐鲁番市| 上饶市| 西平县| 巩留县| 武安市| 凌海市| 桐城市| 涡阳县| 崇阳县| 景洪市| 永胜县| 德清县| 公主岭市| 黄山市| 滦平县| 昌平区|