官术网_书友最值得收藏!

Mechanics of the Query Optimizer

The next step in our journey toward writing efficient T-SQL queries is understanding how the SQL Server database engine optimizes a query; we will do so by exploring T-SQL query optimization internals and architecture, starting with the infamous cardinality estimation process and its building blocks. From there, we will understand how the Query Optimizer uses that information to produce a just-in-time, good-enough execution plan. This chapter will be referenced throughout this book, as we apply architectural topics to real-world uses.

Before we get started, it's important to have a common frame of reference about the following terms: 

  • Cardinality: Cardinality in a database is defined as the number of records, also called tuples, in each table or view.
  • FrequencyThis term represents the average number of occurrences of a given value in a column or column set. It's defined as the number of rows times the density.
  • Density: This term represents the average number of duplicate values in each column or column set, in other words, the average distribution of unique values in the data. It's defined as 1 divided by the number of distinct values.
  • SelectivityThis term represents the fraction of the row count that satisfies a given predicate, between zero and one. This is calculated as the predicate cardinality (Pc) divided by the table cardinality (Tc) multiplied by one hundred: (Pc ÷ Tc) × 100%. As the average number of duplicates decreases (the density), the selectivity of a value increases. For example, in a table that represents streets and cities in a country, many streets and cities have the same name, but each street and city combination has a unique zip code. An index on the zip code is more selective than an index on the street or city because the zip code has a much lower density than a street or city.
  • Statistics: Statistics are the metadata objects that we referred to in Chapter 2Understanding Query Processing. They maintain information on the distribution of data in a table or indexed view, over a specific column or column set. We will discuss the role of statistics in more detail in the Introducing the Cardinality Estimator section.
  • Histogram: This is a bucketized representation of the distribution of data in a specific column that is kept in a statistic object. These histograms hold aggregate information on the number of rows (cardinality) and distinct values (density) for up to 200 ranges of data values, named histogram steps. For any statistics object, the histogram is always created for the first column only. For multi-column statistics, this means that the histogram does not contain information about any additional column. 

In this chapter, we will cover the following topics:

  • Introducing the Cardinality Estimator
  • Understanding the query optimization workflow
主站蜘蛛池模板: 梁山县| 鄯善县| 洛阳市| 西丰县| 宁国市| 绥阳县| 康马县| 武隆县| 福鼎市| 大厂| 巢湖市| 手游| 富民县| 霍邱县| 台南市| 宣威市| 昆山市| 清涧县| 融水| 来安县| 锡林郭勒盟| 伊川县| 东山县| 白水县| 当涂县| 镇江市| 湘潭市| 宁远县| 娄底市| 屏南县| 玉屏| 秦安县| 玉田县| 乌恰县| 花莲县| 玉树县| 永春县| 垣曲县| 肇源县| 通道| 凌海市|