
Quota configuration

In a multitenant cluster, it is important to control utilization of HDFS space, memory, and CPU. In this recipe, we will look at how to restrict a user or a project from using more than the allotted HDFS space.

Getting ready

Make sure that there is a running cluster, and that the user is already well versed in the recipes that we have looked at so far.

How to do it...

  1. Connect to Namenode and change the user to hadoop.
  2. Create a directory named projects on HDFS:
    $ hadoop fs -mkdir /projects
  3. By default, there is no quota configured on any directory.
  4. To see what options can be set on the projects directory, use the following command:
    $ hadoop fs -count -q /projects
  5. The leftmost fields show the namespace and disk space quotas, which are currently not set (unset quotas are printed as none and inf).
  6. To set the namespace quota, which defines how many inodes can be allocated under the projects directory, enter the following command. An inode here is the same concept as in Linux: a data structure that represents a filesystem object, such as a file or directory:
    $ hdfs dfsadmin -setQuota 100 /projects
  7. To set the disk space quota, which will define how many blocks can be allocated for this projects directory, enter the following code:
    $ hdfs dfsadmin -setSpaceQuota 4G /projects
  8. With the preceding commands, we have set a namespace quota of 100 and a disk space quota of 4 GB. Rerunning the command from step 4 will now show both values.
  9. To remove the quotas from a directory, use the corresponding clear commands:
    $ hdfs dfsadmin -clrQuota /projects
    $ hdfs dfsadmin -clrSpaceQuota /projects
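The quota state set in the steps above can be read back programmatically from the output of hadoop fs -count -q, which prints eight columns: namespace quota, remaining namespace quota, space quota, remaining space quota, directory count, file count, content size, and path (unset quotas appear as none/inf). A minimal parsing sketch follows; the sample line is illustrative, not output captured from a real cluster:

```python
# Parse one line of `hadoop fs -count -q <path>` output.
# Column order: QUOTA REM_QUOTA SPACE_QUOTA REM_SPACE_QUOTA
#               DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
def parse_count_q(line):
    fields = line.split()
    keys = ("quota", "rem_quota", "space_quota", "rem_space_quota",
            "dir_count", "file_count", "content_size")

    def to_num(value):
        # Unset quotas are printed as "none" (quota) or "inf" (remaining).
        return None if value in ("none", "inf") else int(value)

    parsed = {k: to_num(v) for k, v in zip(keys, fields[:7])}
    parsed["path"] = fields[7]
    return parsed

# Illustrative sample: namespace quota 100, space quota 4 GB (4294967296 bytes).
sample = "100 97 4294967296 4294966272 1 2 1024 /projects"
info = parse_count_q(sample)
print(info["quota"], info["space_quota"], info["path"])
```

This makes it easy to script alerts when a project directory approaches its quota.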

How it works...

In steps 1 through 9, we configured quotas to prevent any one directory from consuming the entire cluster. The namespace quota controls how many files and directories can be created under that path, and the space quota caps the total disk space the path may consume.

So it is now up to the user whether to spend that space on a few large files or on many smaller ones. Because every file and directory consumes an inode, this setup also discourages the user from creating lots of small files: if they do, they will exhaust the namespace quota long before the space quota.
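One subtlety worth keeping in mind: the space quota is charged against replicated bytes, so with the default replication factor of 3 a 4 GB quota holds only about 1.33 GB of actual file data. The following Python sketch (using the quota values from this recipe; the accounting is simplified, since real HDFS also reserves whole blocks for files still being written) models how a write is checked against both quotas:

```python
# Simplified model of HDFS quota enforcement for the /projects directory.
NAMESPACE_QUOTA = 100          # max inodes (files + directories)
SPACE_QUOTA = 4 * 1024**3      # 4 GB of raw (replicated) disk space
REPLICATION = 3                # assumed default replication factor

def fits(existing_inodes, existing_raw_bytes, new_file_bytes):
    """Would creating one new file of new_file_bytes succeed?"""
    raw = new_file_bytes * REPLICATION  # space quota counts every replica
    return (existing_inodes + 1 <= NAMESPACE_QUOTA and
            existing_raw_bytes + raw <= SPACE_QUOTA)

# A single 4 GB file does NOT fit: it needs 12 GB of replicated space.
print(fits(1, 0, 4 * 1024**3))   # False
# A 1 GB file fits: 3 GB replicated is within the 4 GB quota.
print(fits(1, 0, 1 * 1024**3))   # True
```

The same check explains the small-files pressure: even tiny files each consume an inode, so the first argument hits the namespace quota long before the space quota does.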

The following commands show the total space available in the cluster; note that the quota figures reported by the first command take the replication factor into account:

$ hadoop fs -count -q /
9223372036854775807 9223372036854774414
$ hdfs dfsadmin -report | head
Configured Capacity: 378046439424 (352.08 GB)
Present Capacity: 357588590592 (333.03 GB)
DFS Remaining: 355012579328 (330.63 GB)
DFS Used: 2576011264 (2.40 GB)
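As a rough sketch of how to work with this report, the following Python snippet parses the byte counts from the header lines above and divides the remaining raw capacity by an assumed replication factor of 3 to estimate how much new user data would fit:

```python
import re

# Header lines from `hdfs dfsadmin -report | head`, as shown above.
report = """\
Configured Capacity: 378046439424 (352.08 GB)
Present Capacity: 357588590592 (333.03 GB)
DFS Remaining: 355012579328 (330.63 GB)
DFS Used: 2576011264 (2.40 GB)
"""

def report_bytes(text):
    # Capture "Label: <bytes>" pairs; ignore the human-readable parentheses.
    return {m.group(1): int(m.group(2))
            for m in re.finditer(r"^([A-Za-z ]+): (\d+)", text, re.M)}

sizes = report_bytes(report)
replication = 3  # assumed default replication factor
usable = sizes["DFS Remaining"] // replication
print(f"~{usable / 1024**3:.1f} GB of new data fits")  # ~110.2 GB
```

Because the report counts raw capacity across all DataNodes, dividing by the replication factor gives only an upper bound; actual headroom also depends on block sizes and non-DFS usage.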

Note

Per-user quotas can be approximated by restricting users to particular directories using Linux-style permissions. It is usually easier to group users together and assign group permissions to directories.
