官术网_书友最值得收藏!

Tuning query-related parameters

In this recipe, we will talk about the query planning related parameters and the associated tuning aspects.

How to do it...

The following are the query planning related parameters that usually require tuning:

  • random_page_cost
  • seq_page_cost
  • effective_cache_size
  • work_mem
  • constraint_exclusion

These parameters can be set in the postgresql.conf configuration file.

How it works...

random_page_cost: This parameter is basically used to estimate the cost of a random page fetch in abstract cost units. The default value of this parameter is 4.0. Random page cost is basically used to represent the coefficient between the cost of looking up one row via sequential scans against the cost of looking up a single row inpidually using random access, that is, disk seeks. This factor influences the query planner's decision to use indexes instead of a table scan while executing queries. Reducing this value relative to the seq_page_cost parameter will cause the system to prefer index scans. If, however, this value is increased, it would make index scans appear expensive from a cost perspective.

seq_page_cost: This parameter is used to estimate the cost of a sequential page fetch in abstract cost units. The default value of this parameter is 1.0 and this may need to be lowered for caching effects. For situations where the database is cached entirely in RAM, this value needs to be lowered. Always set random_page_cost to be greater than or equal to seq_page_cost.

effective_cache_size: This parameter setting is used to provide an estimate of how much memory is available for disk caching by the operating system and within the database itself after considering and accounting for what is being used by the OS itself and other applications. This parameter does not allocate any memory, rather it serves as a guideline as to how much memory you expect to be available in the RAM, as well as in the PostgreSQL shared buffer. This helps the PostgreSQL query planner to figure out whether the query plans it is considering for query execution would fit in the system memory or not. If this is set too low, indexes may not be used for executing queries the way you would expect.

Let us assume that there is around 1.5 GB of system RAM on your Linux machine, the value of shared_buffers is set to 32 MB, and effective_cache_size is set to 850 MB. Now, if a running query needs around 700 MB of a dataset, PostgreSQL would estimate that all the data required should be available in memory and would hence opt for an aggressive plan from the context of optimization, involving heavier index usage and merge joins. But if the effective cache size is set to around 300 MB, the query planner will opt for the sequential scan. The rule of thumb is that if shared_buffers is set to a quarter of the total RAM, setting effective_cache_size to one half of the available RAM would be a conservative setting and setting it to around 75% of the total RAM would be an aggressive setting, but still reasonable.

work_mem: This parameter defines the amount of memory to be used by internal sort operations and hash tables before switching to temporary disk files. When a running query needs to sort data, an estimate is provided by the database as to how much data is involved and it will compare it to the work_mem parameter. If it is larger, it will write out all the data and use a disk-based sort mechanism instead of reverting to in-memory sort, which would be much slower than the memory-based sort. If there are a lot of complex sorts involved, and if you have good memory available, then increasing the value of the work_mem parameter would allow PostgreSQL to do in-memory sorts, which will be faster than disk-based ones. The amount of memory as specified by the value of the work_mem parameter gets applied for each sort operation done by each database session and complex queries can use multiple working memory-based sort buffers. For instance, if the value of work_mem is set to 100 MB, and if there are around 40 database sessions submitting queries, one would end up using around 4 GB of real system memory. The catch is that one cannot predict the number of sorts one client session may end up doing and thereby theoretically work_mem is a per sort parameter instead of being a per client one effectively indicating that the memory available from the use of work_mem is unbound, where there are a number of clients doing a large number of sorts concurrently. From the point of view of optimization to arrive at the optimum setting for the work_mem parameter, we must first consider how much free system memory is available after the allocation of shared_buffers further pided by the value of max_connections and then considering a fraction of that figure albeit half of that would be an assertive value for the work_mem parameter. One needs to set this value explicitly as the default value is too low for in-memory sorts.

constraint_exclusion: If in PostgreSQL you are using partitioned tables that utilize constraints, then enabling the constraint_exclusion parameter allows the query planner to ignore partitions that cannot have the data being searched for when that can be proven. In earlier PostgreSQL versions before 8.4, this value was set to off, which meant that unless toggled on partitioned tables, this will not work as expected. The new default value since version 8.4 has been to set this value to partition, in which case the query planner will examine constraints for inheritance child subtables and UNION ALL queries. If this value is turned on, it will examine constraints on all tables and this will impose extra planning overhead even for simple queries and thus there will be a performance overhead.

主站蜘蛛池模板: 临西县| 海伦市| 阿勒泰市| 阜南县| 泸西县| 资源县| 屏边| 杭锦后旗| 霍林郭勒市| 班玛县| 英山县| 河源市| 正阳县| 育儿| 湟源县| 喀喇| 富平县| 千阳县| 崇州市| 托克逊县| 连江县| 思茅市| 宣城市| 红安县| 思茅市| 白银市| 奈曼旗| 板桥市| 济源市| 台东市| 西城区| 东乌珠穆沁旗| 会理县| 孟州市| 平远县| 岫岩| 元阳县| 轮台县| 读书| 湛江市| 南漳县|