
Landing–staging–target scenario

As mentioned in the preceding section, actual data sources are sometimes unreliable. This is why we add an extra layer to our architecture to guard against most of the uncertainty coming from data sources. This extra layer is called landing. The landing database is a zone used for only one thing: to catch data from data sources regardless of their schema stability, accessibility, or data quality. The following screenshot shows a complete architecture containing the landing database:

As seen in the preceding screenshot, the landing database is added to the staging/target database architecture. The landing database plays a vital role in scenarios in which data sources vary. As an example, consider a set of CSV files stored on an FTP site, or web services that return XML or JSON responses. The schema of such data is not reliable, so the landing database can help us recognize schema changes between two loads from a data source, and it can also help us manipulate data from non-relational data sources.
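To make the idea of recognizing schema changes between two loads more concrete, here is a minimal sketch in Python. It assumes the source delivers CSV text with a header row; the helper names `read_header` and `diff_schema` and the sample payloads are hypothetical and serve only as an illustration of comparing the columns of two consecutive loads.

```python
import csv
import io


def read_header(csv_text):
    """Return the column names from the first row of a CSV payload."""
    reader = csv.reader(io.StringIO(csv_text))
    return next(reader, [])


def diff_schema(previous_columns, current_columns):
    """Report columns that were added or removed between two loads."""
    previous, current = set(previous_columns), set(current_columns)
    return {
        "added": sorted(current - previous),
        "removed": sorted(previous - current),
    }


# Two hypothetical loads of the same file; the second one renames
# a column and adds a new one.
first_load = "id,name,amount\n1,Alice,10.5\n"
second_load = "id,full_name,amount,currency\n1,Alice,10.5,EUR\n"

changes = diff_schema(read_header(first_load), read_header(second_load))
print(changes)
# {'added': ['currency', 'full_name'], 'removed': ['name']}
```

A comparison like this only looks at column names. A real landing layer would probably also track data types and column order, which this sketch leaves out.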

The previous sections described typical database architectures for data transformations. The databases in the staging-target and landing-staging-target scenarios do not have to be separated into isolated instances, or even into isolated databases; each part of the data transformation can simply be created as a separate schema in a single database. This decision depends on several factors, such as available resources, licenses, or security requirements.
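The single-database variant can be sketched with Python's built-in `sqlite3` module. This is only an illustration under stated assumptions: SQLite has no `CREATE SCHEMA` statement, so attached in-memory databases stand in for the three schemas, and the `orders` and `order_totals` tables, their columns, and the validation rule are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumption: attached in-memory databases play the role of schemas,
# giving schema-qualified names such as landing.orders.
for layer in ("landing", "staging", "target"):
    conn.execute(f"ATTACH DATABASE ':memory:' AS {layer}")

# Landing: accept everything as text, with no constraints.
conn.execute("CREATE TABLE landing.orders (id TEXT, amount TEXT)")
conn.executemany(
    "INSERT INTO landing.orders VALUES (?, ?)",
    [("1", "10.5"), ("2", "not-a-number"), ("3", "7")],
)

# Staging: typed columns; keep only rows whose amount looks numeric
# (starts with a digit and contains nothing but digits and dots).
conn.execute("CREATE TABLE staging.orders (id INTEGER, amount REAL)")
conn.execute(
    """
    INSERT INTO staging.orders (id, amount)
    SELECT CAST(id AS INTEGER), CAST(amount AS REAL)
    FROM landing.orders
    WHERE amount GLOB '[0-9]*' AND amount NOT GLOB '*[^0-9.]*'
    """
)

# Target: the cleaned, aggregated result.
conn.execute(
    "CREATE TABLE target.order_totals (order_count INTEGER, total_amount REAL)"
)
conn.execute(
    "INSERT INTO target.order_totals "
    "SELECT COUNT(*), SUM(amount) FROM staging.orders"
)

totals = conn.execute(
    "SELECT order_count, total_amount FROM target.order_totals"
).fetchone()
print(totals)  # (2, 17.5)
```

The malformed row stays in the landing layer and never reaches staging, which is the separation of concerns that the three layers are meant to provide, whether they live in one database or several.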

The next question is how to develop the data movements between data sources and databases. A wide set of options is available for this.
