- Apache Spark 2.x for Java Developers
- Sourav Gulati Sumit Kumar
- 151字
- 2021-07-02 19:01:52
Defining HDFS
The Hadoop Distributed File System (HDFS), as the name suggests, is a distributed filesystem based on the lines of the Google File System written in Java. In practice, HDFS resembles closely any other UNIX filesystem with support for common file operations such as ls, cp, rm, du, cat, and so on. However what makes HDFS stand out, despite its simplicity, is its mechanism to handle node failure in the Hadoop cluster without effectively changing the search time for accessing stored files. The HDFS cluster consists of two major components: DataNodes and NameNode.
HDFS has a unique way of storing data on HDFS clusters (cheap commodity networked commodity computers). It splits the regular file in smaller chunks called blocks and then makes an exact number of copies of such chunks depending on the replication factor for that file. After that, it copies such chunks to different DataNodes of the cluster.
- JIRA 7 Administration Cookbook(Second Edition)
- 數據結構簡明教程(第2版)微課版
- 微信公眾平臺開發:從零基礎到ThinkPHP5高性能框架實踐
- JavaScript:Moving to ES2015
- Red Hat Enterprise Linux Troubleshooting Guide
- 從Excel到Python數據分析:Pandas、xlwings、openpyxl、Matplotlib的交互與應用
- 人人都能開發RPA機器人:UiPath從入門到實戰
- Software-Defined Networking with OpenFlow(Second Edition)
- 網頁設計與制作
- Python數據預處理技術與實踐
- Java EE 7 Development with WildFly
- Unity 5 Game Optimization
- 算法訓練營:海量圖解+競賽刷題(入門篇)
- ASP.NET本質論
- Game Development Patterns and Best Practices