HDFS I/O

An HDFS read operation from a client involves the following:

  1. The client asks the NameNode where the data blocks of a given file are stored.
  2. The NameNode responds with the block IDs and the locations of the hosts (DataNodes) where the data can be found.
  3. The client contacts the DataNodes with the respective block IDs and fetches the data from them, preserving the order of the blocks.
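
The three read steps can be sketched with a toy in-memory model. This is only an illustration under stated assumptions: `ToyNameNode`, `ToyDataNode`, `BlockLocation`, and `read` are hypothetical names invented for this example and are not part of the Hadoop API. Trying the listed replicas in order is likewise a simplification of what a real client does.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy in-memory model of the HDFS read path; none of these classes are Hadoop APIs.
class HdfsReadSketch {

    // A block ID together with the hosts that hold a replica of it.
    static class BlockLocation {
        final long blockId;
        final List<String> hosts;
        BlockLocation(long blockId, List<String> hosts) {
            this.blockId = blockId;
            this.hosts = hosts;
        }
    }

    // Hypothetical NameNode: holds metadata only, never file contents.
    static class ToyNameNode {
        private final Map<String, List<BlockLocation>> namespace = new HashMap<>();

        void register(String path, List<BlockLocation> blocks) {
            namespace.put(path, blocks);
        }

        // Steps 1 and 2: return the block IDs and their locations in file order.
        List<BlockLocation> getBlockLocations(String path) {
            List<BlockLocation> blocks = namespace.get(path);
            if (blocks == null) {
                throw new IllegalArgumentException("No such file: " + path);
            }
            return blocks;
        }
    }

    // Hypothetical DataNode: stores block contents keyed by block ID.
    static class ToyDataNode {
        private final Map<Long, byte[]> blocks = new HashMap<>();
        void store(long blockId, byte[] data) { blocks.put(blockId, data); }
        byte[] readBlock(long blockId) { return blocks.get(blockId); }
    }

    // Step 3: fetch every block from a DataNode and concatenate them in order.
    static String read(String path, ToyNameNode nameNode, Map<String, ToyDataNode> cluster) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (BlockLocation loc : nameNode.getBlockLocations(path)) {
            byte[] data = null;
            for (String host : loc.hosts) {            // try replicas in the order given
                ToyDataNode dn = cluster.get(host);
                if (dn != null) data = dn.readBlock(loc.blockId);
                if (data != null) break;
            }
            if (data == null) {
                throw new IllegalStateException("Block unavailable: " + loc.blockId);
            }
            out.write(data, 0, data.length);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // Builds a two-node cluster holding a two-block file and reads it back.
    static String demo() {
        ToyDataNode dn1 = new ToyDataNode();
        ToyDataNode dn2 = new ToyDataNode();
        dn1.store(1L, "hello ".getBytes(StandardCharsets.UTF_8));
        dn2.store(2L, "world".getBytes(StandardCharsets.UTF_8));
        Map<String, ToyDataNode> cluster = new HashMap<>();
        cluster.put("dn1", dn1);
        cluster.put("dn2", dn2);
        ToyNameNode nameNode = new ToyNameNode();
        nameNode.register("/demo.txt", Arrays.asList(
                new BlockLocation(1L, Arrays.asList("dn1")),
                new BlockLocation(2L, Arrays.asList("dn3-offline", "dn2"))));
        return read("/demo.txt", nameNode, cluster);
    }
}
```

Calling `HdfsReadSketch.demo()` reassembles the file from two blocks held on different nodes. The NameNode object in the sketch only ever hands out metadata; the bytes come from the DataNode objects.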

An HDFS write operation from a client involves the following:

  1. The client contacts the NameNode to update the namespace with the filename and to verify the necessary permissions.
  2. If the file already exists, the NameNode throws an error; otherwise, it returns an FSDataOutputStream to the client, which points to the data queue.
  3. The data queue negotiates with the NameNode to allocate new blocks on suitable DataNodes.
  4. The data is then copied to the first of those DataNodes, and, as per the replication strategy, it is further copied from that DataNode to the rest of the DataNodes.
  5. It's important to note that the data never moves through the NameNode, as that would cause a performance bottleneck.
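
The write steps can be sketched in the same toy style. Everything here is an assumption made for illustration: `ToyNameNode`, `ToyDataNode`, `Allocation`, and `write` are hypothetical names, not Hadoop classes. The sketch also simplifies in three ways: the permission check is omitted, the queue holds one entry per block, and the pipeline is simply the first `replication` nodes rather than a real placement policy.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy in-memory model of the HDFS write path; none of these classes are Hadoop APIs.
class HdfsWriteSketch {

    // Hypothetical DataNode that stores blocks and forwards them down a pipeline.
    static class ToyDataNode {
        final Map<Long, byte[]> blocks = new HashMap<>();

        // Step 4: store the block locally, then pass it to the next DataNode.
        void receive(long blockId, byte[] data, List<ToyDataNode> downstream) {
            blocks.put(blockId, data);
            if (!downstream.isEmpty()) {
                downstream.get(0).receive(blockId, data,
                        downstream.subList(1, downstream.size()));
            }
        }
    }

    // A block ID plus the ordered DataNodes chosen to hold its replicas.
    static class Allocation {
        final long blockId;
        final List<ToyDataNode> pipeline;
        Allocation(long blockId, List<ToyDataNode> pipeline) {
            this.blockId = blockId;
            this.pipeline = pipeline;
        }
    }

    // Hypothetical NameNode: tracks metadata only; no file bytes pass through it.
    static class ToyNameNode {
        final Map<String, List<Long>> namespace = new HashMap<>();
        final List<ToyDataNode> dataNodes = new ArrayList<>();
        final int replication;   // assumed to satisfy 1 <= replication <= nodeCount
        private long nextBlockId = 1;

        ToyNameNode(int nodeCount, int replication) {
            this.replication = replication;
            for (int i = 0; i < nodeCount; i++) dataNodes.add(new ToyDataNode());
        }

        // Steps 1 and 2: register the file, or fail if it already exists.
        void create(String path) {
            if (namespace.containsKey(path)) {
                throw new IllegalStateException("File already exists: " + path);
            }
            namespace.put(path, new ArrayList<>());
        }

        // Step 3: allocate a block; picking the first N nodes is a simplification.
        Allocation allocateBlock(String path) {
            long id = nextBlockId++;
            namespace.get(path).add(id);
            return new Allocation(id, dataNodes.subList(0, replication));
        }
    }

    // Client side: queue the data, then stream each block to its pipeline.
    static void write(String path, byte[] data, int blockSize, ToyNameNode nameNode) {
        nameNode.create(path);
        Deque<byte[]> dataQueue = new ArrayDeque<>();
        for (int off = 0; off < data.length; off += blockSize) {
            int end = Math.min(off + blockSize, data.length);
            dataQueue.add(Arrays.copyOfRange(data, off, end));
        }
        while (!dataQueue.isEmpty()) {
            byte[] chunk = dataQueue.poll();
            Allocation a = nameNode.allocateBlock(path);
            // The client sends only to the first node; replication happens node to node.
            a.pipeline.get(0).receive(a.blockId, chunk,
                    a.pipeline.subList(1, a.pipeline.size()));
        }
    }
}
```

For example, with `new HdfsWriteSketch.ToyNameNode(3, 2)` and a block size of 4, writing ten bytes yields three blocks, each stored on the first two DataNodes. The `ToyNameNode` records only block IDs, mirroring the point in step 5 that file data never passes through the NameNode.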