HDFS I/O

An HDFS read operation from a client involves the following:

  1. The client asks the NameNode where the data blocks for a given file are stored.
  2. The NameNode responds with the block IDs and the locations of the hosts (DataNodes) where the data can be found.
  3. The client contacts those DataNodes with the respective block IDs and fetches the data, preserving the order of the blocks.
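
The read steps above can be sketched with a toy in-memory model. This is not the Hadoop API: the `NameNode` and `DataNode` classes, `get_block_locations`, `read_block`, and `read_file` are hypothetical names chosen for illustration, and the sketch assumes the client simply reads from the first replica listed for each block.

```python
from dataclasses import dataclass, field


@dataclass
class DataNode:
    """Toy DataNode: stores block bytes keyed by block ID."""
    name: str
    blocks: dict = field(default_factory=dict)

    def read_block(self, block_id):
        return self.blocks[block_id]


@dataclass
class NameNode:
    """Toy NameNode: holds only metadata, never file contents."""
    # filename -> ordered list of (block_id, [names of hosts holding a replica])
    block_map: dict = field(default_factory=dict)

    def get_block_locations(self, filename):
        return self.block_map[filename]


def read_file(namenode, datanodes, filename):
    """Client-side read: ask the NameNode for locations, then fetch from DataNodes."""
    data = b""
    # Steps 1-2: the NameNode returns block IDs and hosts, in file order.
    for block_id, hosts in namenode.get_block_locations(filename):
        # Step 3: fetch each block from the first listed replica (an assumption),
        # appending in the order the NameNode returned.
        data += datanodes[hosts[0]].read_block(block_id)
    return data


dn1 = DataNode("dn1", {"blk_1": b"hello ", "blk_2": b"world"})
dn2 = DataNode("dn2", {"blk_2": b"world"})
nn = NameNode({"/demo.txt": [("blk_1", ["dn1"]), ("blk_2", ["dn2", "dn1"])]})
datanodes = {"dn1": dn1, "dn2": dn2}
result = read_file(nn, datanodes, "/demo.txt")
```

Note that the toy `NameNode` only ever hands out metadata; the file bytes travel directly between the client and the DataNodes.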

An HDFS write operation from a client involves the following:

  1. The client contacts the NameNode to update the namespace with the filename and to verify the necessary permissions.
  2. If the file already exists, the NameNode throws an error; otherwise, it returns an FSDataOutputStream to the client, which points to the data queue.
  3. The data queue negotiates with the NameNode to allocate new blocks on suitable DataNodes.
  4. The data is then copied to the first of those DataNodes and, as per the replication strategy, copied onward from that DataNode to the rest of the DataNodes.
  5. It's important to note that the data never passes through the NameNode, as that would cause a performance bottleneck.
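
The write steps above can be sketched with a similar toy in-memory model. Again, this is not the Hadoop API: `NameNode.create`, `allocate_block`, `DataNode.write_block`, `write_file`, and the tiny `block_size` are hypothetical, and the placement policy (take the first N DataNodes) is a deliberate simplification of whatever the NameNode would actually choose.

```python
class NameNode:
    """Toy NameNode: tracks the namespace and block placement, not file data."""

    def __init__(self, datanode_names, replication=3):
        self.datanode_names = list(datanode_names)
        self.replication = replication
        self.namespace = {}  # filename -> list of (block_id, pipeline)
        self._next_id = 0

    def create(self, filename):
        # Steps 1-2: register the file, or fail if it already exists.
        if filename in self.namespace:
            raise FileExistsError(filename)
        self.namespace[filename] = []

    def allocate_block(self, filename):
        # Step 3: pick target DataNodes for a new block (naive: first N).
        block_id = f"blk_{self._next_id}"
        self._next_id += 1
        pipeline = self.datanode_names[: self.replication]
        self.namespace[filename].append((block_id, pipeline))
        return block_id, pipeline


class DataNode:
    """Toy DataNode: stores a block, then forwards it down the pipeline."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def write_block(self, block_id, payload, downstream):
        # Step 4: store locally, then copy to the next DataNode, if any.
        self.blocks[block_id] = payload
        if downstream:
            downstream[0].write_block(block_id, payload, downstream[1:])


def write_file(namenode, datanodes, filename, data, block_size=4):
    namenode.create(filename)
    for offset in range(0, len(data), block_size):
        block_id, pipeline = namenode.allocate_block(filename)
        nodes = [datanodes[name] for name in pipeline]
        # Step 5: the client sends bytes only to the first DataNode;
        # the NameNode never sees the payload.
        nodes[0].write_block(block_id, data[offset:offset + block_size], nodes[1:])


datanodes = {n: DataNode(n) for n in ("dn1", "dn2", "dn3")}
nn = NameNode(datanodes, replication=2)
write_file(nn, datanodes, "/demo.txt", b"hello world")
```

Calling `write_file` a second time with the same filename raises `FileExistsError`, mirroring the error in step 2.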