

Time for action – starting Hadoop

Unlike the local mode of Hadoop, where all the components run only for the lifetime of the submitted job, in the pseudo-distributed and fully distributed modes the cluster components exist as long-running processes. Before we use HDFS or MapReduce, we need to start up the needed components. Type the following commands; the output should look as shown next, where the commands you type appear on the lines prefixed by $:

  1. Type in the first command:
    $ start-dfs.sh
    starting namenode, logging to /home/hadoop/hadoop/bin/../logs/hadoop-hadoop-namenode-vm193.out
    localhost: starting datanode, logging to /home/hadoop/hadoop/bin/../logs/hadoop-hadoop-datanode-vm193.out
    localhost: starting secondarynamenode, logging to /home/hadoop/hadoop/bin/../logs/hadoop-hadoop-secondarynamenode-vm193.out
    
  2. Type in the second command:
    $ jps
    9550 DataNode
    9687 Jps
    9638 SecondaryNameNode
    9471 NameNode
    
  3. Type in the third command:
    $ hadoop dfs -ls /
    Found 2 items
    drwxr-xr-x - hadoop supergroup 0 2012-10-26 23:03 /tmp
    drwxr-xr-x - hadoop supergroup 0 2012-10-26 23:06 /user
    
  4. Type in the fourth command:
    $ start-mapred.sh 
    starting jobtracker, logging to /home/hadoop/hadoop/bin/../logs/hadoop-hadoop-jobtracker-vm193.out
    localhost: starting tasktracker, logging to /home/hadoop/hadoop/bin/../logs/hadoop-hadoop-tasktracker-vm193.out
    
  5. Type in the fifth command:
    $ jps
    9550 DataNode
    9877 TaskTracker
    9638 SecondaryNameNode
    9471 NameNode
    9798 JobTracker
    9913 Jps
    

What just happened?

The start-dfs.sh command, as the name suggests, starts the components necessary for HDFS. This is the NameNode to manage the filesystem and a single DataNode to hold data. The SecondaryNameNode is an availability aid that we'll discuss in a later chapter.
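
If you want a second opinion beyond the process list, you can also ask HDFS itself for its status. This is just a convenience check, assuming the Hadoop 1.x command set used in this book; dfsadmin -report prints the NameNode's view of the cluster, including the DataNodes it can see and their capacity, and the NameNode also serves a status web page (by default at http://localhost:50070):

    $ hadoop dfsadmin -report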

After starting these components, we use the JDK's jps utility to see which Java processes are running, and, as the output looks good, we then use Hadoop's dfs utility to list the root of the HDFS filesystem.
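
Rather than eyeballing the jps output each time, you can script the same check. The following is a small sketch (not part of the book's steps) that greps the jps listing for each HDFS daemon we expect to be running:

    $ for d in NameNode DataNode SecondaryNameNode; do jps | grep -q $d && echo "$d OK"; done
    NameNode OK
    DataNode OK
    SecondaryNameNode OK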

After this, we use start-mapred.sh to start the MapReduce components—this time the JobTracker and a single TaskTracker—and then use jps again to verify the result.
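
Before submitting any real work, you can also confirm that the JobTracker is answering requests. Assuming the stock Hadoop 1.x tooling, the job command lists the (currently empty) set of running jobs, and the JobTracker's web interface is available on port 50030 by default:

    $ hadoop job -list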

There is also a combined start-all.sh script that we'll use at a later stage, but in the early days it's useful to do a two-stage startup to more easily verify the cluster configuration.
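
Each start script has a matching stop counterpart, so the same staged approach works in reverse when shutting the cluster down. Assuming the standard Hadoop 1.x scripts in the bin directory, the following stops MapReduce first and then HDFS (a combined stop-all.sh does both in one go):

    $ stop-mapred.sh
    $ stop-dfs.sh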
