官术网_书友最值得收藏!

An introduction to data warehousing

Before actually getting started, there is an important concept that needs to be covered and that is data warehouse modeling. The data warehouse is a concept that has been around since the 1970s, it is a central database built for reporting that integrates data from disparate sources to a common location and a common schema. It also removes the direct connection between the source systems and the historical data. This means that if you change your source system, you will retain historical records and can decommission the old system.

The schema in a data warehouse in many cases is built in Third normal form (3NF) to ensure that data is only stored once to minimize the storage cost and make it easier to maintain.

Tip

For more information about 3NF, refer to http://en.wikipedia.org/wiki/3NF.

This is a strategy commonly referenced as an Inmon data warehouse coming from the father of data warehousing, Bill Inmon. Also have a look at http://inmoninstitute.com/about/index.

There is a drawback with 3NF, that is, it is not a model built for querying; it requires many joins in the queries to write the simplest report. So a simpler model of the data is often necessary. A common model built for querying is the dimensional model defined by Ralph Kimball and is available at http://www.kimballgroup.com/. The dimensional model defines those things that you want to measure and should be stored in a fact table. Around the fact table, you will have multiple dimension tables containing the things that you would like to slice the facts by.

In the preceding figure, you have the fact table defined containing each order row, you will have the measures such as order quantity and sales amount, as well as the keys referencing Product, Seller, Time, and Customer on each row of the table. The dimension tables contain attributes such as year, month, and day in the Time table, Customer name, address, and customer number in the Customer table. This design is often referenced to as a star schema.

Analysis Services is built with multidimensional modeling in mind and works best with data warehouses or data marts that use this technique; however, as you will later see, there is a possibility to work with all kinds of schemas in the database through the use of data source views in Analysis Services.

During the course of this book we will work with the AdventureWorks2012DW database. This is a data warehouse built for the fictitious company called Adventure Works Cycles. They have a data warehouse built using dimensional modeling with several fact tables containing the things that they want to measure in their business.

Tip

For a background on Adventure Works and their business, refer to the following description:

http://technet.microsoft.com/en-us/library/ms124825(v=SQL.100).aspx

An in-depth description of their data warehouse can be found at http://technet.microsoft.com/en-us/library/ms124623(v=sql.100).aspx.

主站蜘蛛池模板: 会同县| 尼木县| 东方市| 桂林市| 武平县| 达拉特旗| 泰兴市| 五家渠市| 阳朔县| 光泽县| 和林格尔县| 积石山| 彭州市| 南平市| 仲巴县| 桂东县| 屏东县| 泽库县| 张家港市| 黔江区| 凤翔县| 莎车县| 连城县| 陆良县| 大厂| 搜索| 崇礼县| 乾安县| 汨罗市| 富川| 呼图壁县| 锦州市| 云龙县| 吉木乃县| 中西区| 安龙县| 柳江县| 嵊州市| 富阳市| 蓬溪县| 广饶县|