
Summary

In this chapter, we discussed the non-functional requirements for data storage solutions. A data lake, which is an evolution of the data warehouse, consists of multiple layers, each with its own requirements and therefore its own technology choices. We discussed the key requirements for a raw data store, where primarily flat files need to be stored robustly; for a historical database, where temporal information is saved; and for analytics data stores, where fast querying is necessary. We also explained the requirements for a streaming data engine and for a model development environment. In all cases, requirements management is an ongoing process in an AI project. Rather than setting all the requirements in stone at the start of the project, architects and developers should stay agile, revisiting and revising the requirements after every iteration.

In the next chapter, we will connect the layers of the architecture we have explored in this chapter by creating a data processing pipeline that transforms data from the raw data layer to the historical data layer and to the analytics layer. We will do this to ensure that all the data has been prepared for use in machine learning models. We will also cover data preparation for streaming data scenarios.
