Summary

In this chapter, we discussed the non-functional requirements for data storage solutions. It has become clear that a data lake, which is an evolution of a data warehouse, consists of multiple layers, each with its own requirements and therefore its own technology choices. We have discussed the key requirements for a raw data store, where primarily flat files need to be stored in a robust way; for a historical database, where temporal information is saved; and for analytics data stores, where fast querying is necessary. Furthermore, we have explained the requirements for a streaming data engine and for a model development environment. In all cases, requirements management is an ongoing process in an AI project. Rather than setting all the requirements in stone at the start of the project, architects and developers should be agile, revisiting and revising the requirements after every iteration.

In the next chapter, we will connect the layers of the architecture we have explored in this chapter by creating a data processing pipeline that transforms data from the raw data layer to the historical data layer and to the analytics layer. We will do this to ensure that all the data has been prepared for use in machine learning models. We will also cover data preparation for streaming data scenarios.
