- Hands-On Artificial Intelligence for IoT
- Amita Kapoor
HDFS
HDFS is a popular storage system for IoT solutions; it can hold large amounts of data reliably and at scale. Its design is based on the Google File System (https://ai.google/research/pubs/pub51). HDFS splits each file into fixed-size blocks (128 MB by default in recent Hadoop versions) that are stored on machines across the cluster. To ensure reliability, it replicates each block and distributes the copies across the cluster; the default replication factor is 3. HDFS has two main architectural components:
- The first, the NameNode, stores the metadata for the entire filesystem: filenames, their permissions, and the location of every block of every file.
- The second, the DataNode (there can be many), stores the actual file blocks. DataNodes communicate with the NameNode and with clients through Remote Procedure Calls (RPCs), serialized with Protocol Buffers (protobufs).
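The block-splitting and replica-placement scheme can be sketched in plain Python. This is only an illustration of the model, not HDFS's actual placement policy (which is rack-aware); the block size, round-robin placement, and DataNode names below are assumptions made for the example.

```python
import itertools

BLOCK_SIZE = 128 * 1024 * 1024  # assumed 128 MB block size
REPLICATION = 3                  # HDFS default replication factor

def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return (block_id, length) pairs covering a file of file_size bytes."""
    blocks = []
    offset, block_id = 0, 0
    while offset < file_size:
        length = min(block_size, file_size - offset)
        blocks.append((block_id, length))
        offset += length
        block_id += 1
    return blocks

def place_replicas(blocks, datanodes, replication=REPLICATION):
    """Toy round-robin placement: map each block to `replication` distinct nodes."""
    placement = {}
    node_cycle = itertools.cycle(range(len(datanodes)))
    for block_id, _ in blocks:
        start = next(node_cycle)
        placement[block_id] = [datanodes[(start + i) % len(datanodes)]
                               for i in range(replication)]
    return placement

# A 300 MB file on a 4-node cluster: 3 blocks, each replicated on 3 nodes.
blocks = split_into_blocks(300 * 1024 * 1024)
nodes = ["dn1", "dn2", "dn3", "dn4"]  # hypothetical DataNode names
placement = place_replicas(blocks, nodes)
print(len(blocks), placement)
```

Note that only the last block is short (44 MB here); all earlier blocks are full-size, which is what lets the NameNode locate any byte offset with simple arithmetic.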
RPC is a protocol that lets one program request a service from a program running on another computer in a network, without needing to know the network's details. A procedure call is also sometimes called a function call or a subroutine call.
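The idea can be demonstrated with Python's built-in xmlrpc module. XML-RPC is just a stand-in for illustration (Hadoop's own RPC is a custom protobuf-based protocol): the client invokes `add` as if it were a local function, and the library handles the network round trip.

```python
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

# Server side: expose an ordinary function over the network.
# Port 0 asks the OS for any free port.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
port = server.server_address[1]
server.register_function(lambda a, b: a + b, "add")
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: the remote procedure is called like a local one.
client = ServerProxy(f"http://127.0.0.1:{port}")
result = client.add(2, 3)
print(result)  # 5
server.shutdown()
```

The caller never touches sockets or serialization directly; that transparency is exactly what the HDFS client libraries below rely on.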
There are many options for programmatically accessing HDFS in Python, such as snakebite, pyarrow, hdfs3, pywebhdfs, hdfscli, and so on. In this section, we will focus mainly on libraries that provide a native RPC client interface and work with Python 3.