
HDFS I/O

An HDFS read operation from a client involves the following:

  1. The client asks the NameNode where the data blocks for a given file are stored.
  2. The NameNode responds with the block IDs and the locations of the hosts (DataNodes) where the data can be found.
  3. The client contacts those DataNodes with the respective block IDs and fetches the data, preserving the order of the blocks.
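
The read steps above can be sketched as a small in-memory model. This is only an illustration, not the Hadoop client API: `ToyNameNode`, `ToyDataNode`, and `read_file` are hypothetical names, and trying the next replica when a host is unavailable is an assumption of this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDataNode:
    """Hypothetical stand-in for a DataNode: block ID -> block bytes."""
    name: str
    blocks: dict = field(default_factory=dict)

    def read_block(self, block_id):
        return self.blocks[block_id]


@dataclass
class ToyNameNode:
    """Hypothetical stand-in for the NameNode: metadata only, no file data.

    block_map maps a path to an ordered list of (block ID, host names).
    """
    block_map: dict = field(default_factory=dict)

    def get_block_locations(self, path):
        return self.block_map[path]


def read_file(path, namenode, datanodes):
    """Look up block locations, then fetch each block in order."""
    data = b""
    for block_id, hosts in namenode.get_block_locations(path):
        # Assumption of this sketch: try each listed replica in turn.
        for host in hosts:
            node = datanodes.get(host)
            if node is not None and block_id in node.blocks:
                data += node.read_block(block_id)
                break
        else:
            raise IOError(f"no reachable replica for block {block_id}")
    return data


# One file made of two blocks, spread over two DataNodes.
dn1 = ToyDataNode("dn1", {"blk_1": b"hello, "})
dn2 = ToyDataNode("dn2", {"blk_1": b"hello, ", "blk_2": b"hdfs"})
nn = ToyNameNode({"/demo.txt": [("blk_1", ["dn1", "dn2"]), ("blk_2", ["dn2"])]})
content = read_file("/demo.txt", nn, {"dn1": dn1, "dn2": dn2})
print(content)
```

In this model the NameNode object only ever returns metadata; the bytes come from the DataNode objects, and the block order recorded by the NameNode determines the order in which the file is reassembled.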

An HDFS write operation from a client involves the following:

  1. The client contacts the NameNode to update the namespace with the filename and to verify the necessary permissions.
  2. If the file already exists, the NameNode throws an error; otherwise, it returns an FSDataOutputStream to the client, which points to the data queue.
  3. The data queue negotiates with the NameNode to allocate new blocks on suitable DataNodes.
  4. The data is then copied to the first of those DataNodes and, as per the replication strategy, copied from that DataNode to the rest of the DataNodes.
  5. Note that the data never passes through the NameNode, as that would cause a performance bottleneck.
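
A minimal sketch of the write steps, again as an in-memory model rather than real Hadoop code. All class and function names are hypothetical, the permission check is omitted, and block placement is simplified to "the first `replication` DataNodes", which is an assumption of this sketch and not HDFS's actual placement policy.

```python
class ToyDataNode:
    """Hypothetical stand-in for a DataNode."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def write_block(self, block_id, data, downstream):
        # Store the block locally, then forward it along the pipeline.
        self.blocks[block_id] = data
        if downstream:
            downstream[0].write_block(block_id, data, downstream[1:])


class ToyNameNode:
    """Hypothetical stand-in for the NameNode: namespace and block IDs only."""

    def __init__(self, datanodes, replication=3):
        self.datanodes = datanodes
        self.replication = replication
        self.namespace = {}  # path -> ordered list of block IDs
        self.next_id = 0

    def create(self, path):
        # Steps 1-2: register the filename, failing if it already exists.
        if path in self.namespace:
            raise FileExistsError(path)
        self.namespace[path] = []

    def allocate_block(self, path):
        # Step 3: hand out a new block ID and a pipeline of DataNodes.
        self.next_id += 1
        block_id = f"blk_{self.next_id}"
        self.namespace[path].append(block_id)
        return block_id, self.datanodes[: self.replication]


def write_file(path, data, namenode, block_size=4):
    namenode.create(path)
    for start in range(0, len(data), block_size):
        block_id, pipeline = namenode.allocate_block(path)
        # Step 4: the client sends bytes to the first DataNode only,
        # which forwards them to the remaining replicas.
        # Step 5: the NameNode never receives the bytes themselves.
        chunk = data[start:start + block_size]
        pipeline[0].write_block(block_id, chunk, pipeline[1:])


nodes = [ToyDataNode(f"dn{i}") for i in range(1, 4)]
nn = ToyNameNode(nodes, replication=2)
write_file("/demo.txt", b"hello hdfs", nn)
```

The tiny `block_size` is chosen only so that a short byte string spans several blocks. Calling `write_file` a second time with the same path raises `FileExistsError`, mirroring the error described in step 2.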