Summary
In this chapter, we discussed the non-functional requirements for data storage solutions. It has become clear that a data lake, which is an evolution of a data warehouse, consists of multiple layers, each with its own requirements and therefore its own technology choices. We have discussed the key requirements for a raw data store, where primarily flat files need to be stored in a robust way; for a historical database, where temporal information is saved; and for analytics data stores, where fast querying is necessary. Furthermore, we have explained the requirements for a streaming data engine and for a model development environment. In all cases, requirements management is an ongoing process in an AI project. Rather than setting all the requirements in stone at the start of the project, architects and developers should be agile, revisiting and revising the requirements after every iteration.
In the next chapter, we will connect the layers of the architecture we have explored in this chapter by creating a data processing pipeline that transforms data from the raw data layer to the historical data layer and to the analytics layer. We will do this to ensure that all the data has been prepared for use in machine learning models. We will also cover data preparation for streaming data scenarios.