官术网_书友最值得收藏!

Mechanics of the Query Optimizer

The next step in our journey toward writing efficient T-SQL queries is understanding how the SQL Server database engine optimizes a query; we will do so by exploring T-SQL query optimization internals and architecture, starting with the infamous cardinality estimation process and its building blocks. From there, we will understand how the Query Optimizer uses that information to produce a just-in-time, good-enough execution plan. This chapter will be referenced throughout this book, as we apply architectural topics to real-world uses.

Before we get started, it's important to have a common frame of reference about the following terms: 

  • Cardinality: Cardinality in a database is defined as the number of records, also called tuples, in each table or view.
  • FrequencyThis term represents the average number of occurrences of a given value in a column or column set. It's defined as the number of rows times the density.
  • Density: This term represents the average number of duplicate values in each column or column set, in other words, the average distribution of unique values in the data. It's defined as 1 divided by the number of distinct values.
  • SelectivityThis term represents the fraction of the row count that satisfies a given predicate, between zero and one. This is calculated as the predicate cardinality (Pc) divided by the table cardinality (Tc) multiplied by one hundred: (Pc ÷ Tc) × 100%. As the average number of duplicates decreases (the density), the selectivity of a value increases. For example, in a table that represents streets and cities in a country, many streets and cities have the same name, but each street and city combination has a unique zip code. An index on the zip code is more selective than an index on the street or city because the zip code has a much lower density than a street or city.
  • Statistics: Statistics are the metadata objects that we referred to in Chapter 2Understanding Query Processing. They maintain information on the distribution of data in a table or indexed view, over a specific column or column set. We will discuss the role of statistics in more detail in the Introducing the Cardinality Estimator section.
  • Histogram: This is a bucketized representation of the distribution of data in a specific column that is kept in a statistic object. These histograms hold aggregate information on the number of rows (cardinality) and distinct values (density) for up to 200 ranges of data values, named histogram steps. For any statistics object, the histogram is always created for the first column only. For multi-column statistics, this means that the histogram does not contain information about any additional column. 

In this chapter, we will cover the following topics:

  • Introducing the Cardinality Estimator
  • Understanding the query optimization workflow
主站蜘蛛池模板: 武城县| 唐山市| 景东| 祁连县| 贞丰县| 钦州市| 兴隆县| 宿迁市| 西贡区| 宁都县| 崇仁县| 宁远县| 玉溪市| 玉溪市| 泸溪县| 广丰县| 巢湖市| 诸城市| 高唐县| 调兵山市| 洞口县| 南岸区| 平顶山市| 金华市| 林州市| 翁牛特旗| 甘孜| 合阳县| 理塘县| 尉氏县| 娱乐| 长治市| 昂仁县| 定南县| 西乌珠穆沁旗| 逊克县| 盖州市| 澄城县| 崇阳县| 赤峰市| 南充市|