Hadoop Beginner's Guide
Garry Turkington
Time for action – starting Hadoop
Unlike the local mode of Hadoop, where all the components run only for the lifetime of the submitted job, in the pseudo-distributed or fully distributed modes the cluster components exist as long-running processes. Before we can use HDFS or MapReduce, we need to start up the required components. Type the following commands; the output should look as shown next, where the commands appear on the lines prefixed by $:
- Type in the first command:
$ start-dfs.sh
starting namenode, logging to /home/hadoop/hadoop/bin/../logs/hadoop-hadoop-namenode-vm193.out
localhost: starting datanode, logging to /home/hadoop/hadoop/bin/../logs/hadoop-hadoop-datanode-vm193.out
localhost: starting secondarynamenode, logging to /home/hadoop/hadoop/bin/../logs/hadoop-hadoop-secondarynamenode-vm193.out
- Type in the second command:
$ jps
9550 DataNode
9687 Jps
9638 SecondaryNameNode
9471 NameNode
- Type in the third command:
$ hadoop dfs -ls /
Found 2 items
drwxr-xr-x   - hadoop supergroup          0 2012-10-26 23:03 /tmp
drwxr-xr-x   - hadoop supergroup          0 2012-10-26 23:06 /user
- Type in the fourth command:
$ start-mapred.sh
starting jobtracker, logging to /home/hadoop/hadoop/bin/../logs/hadoop-hadoop-jobtracker-vm193.out
localhost: starting tasktracker, logging to /home/hadoop/hadoop/bin/../logs/hadoop-hadoop-tasktracker-vm193.out
- Type in the fifth command:
$ jps
9550 DataNode
9877 TaskTracker
9638 SecondaryNameNode
9471 NameNode
9798 JobTracker
9913 Jps
What just happened?
The start-dfs.sh command, as the name suggests, starts the components necessary for HDFS: the NameNode, which manages the filesystem, and a single DataNode, which holds the data. The SecondaryNameNode performs housekeeping for the NameNode rather than acting as a standby; we'll discuss it in a later chapter.
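If a daemon fails to come up, the log files named in the start-up output are the first place to look. A minimal sketch, reusing the path and hostname from the output above (yours will differ), with each daemon writing a similarly named file:

$ tail -n 50 /home/hadoop/hadoop/logs/hadoop-hadoop-namenode-vm193.out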
After starting these components, we use the JDK's jps utility to see which Java processes are running and, as the output looks good, we then use Hadoop's dfs utility to list the root of the HDFS filesystem.
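Eyeballing the jps output works on a single machine, but a small loop makes the check repeatable. This is a minimal sketch, assuming the Hadoop 1.x daemon names shown above:

# Check that each HDFS daemon appears in the jps listing;
# the trailing anchor avoids matching NameNode inside SecondaryNameNode
for daemon in NameNode DataNode SecondaryNameNode
do
    if jps | grep -q " ${daemon}$"; then
        echo "${daemon} is running"
    else
        echo "${daemon} is NOT running"
    fi
done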
After this, we use start-mapred.sh to start the MapReduce components, this time the JobTracker and a single TaskTracker, and then use jps again to verify the result.
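Each daemon also exposes a status web page, which gives another quick sanity check. A hedged sketch, assuming the default Hadoop 1.x ports (50070 for the NameNode, 50030 for the JobTracker):

$ curl -s -o /dev/null -w "NameNode UI:   %{http_code}\n" http://localhost:50070/
$ curl -s -o /dev/null -w "JobTracker UI: %{http_code}\n" http://localhost:50030/

An HTTP status of 200 from each URL means the daemon's embedded web server is up; you can also open the pages in a browser for cluster status details.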
There is also a combined start-all.sh file that we'll use at a later stage, but in the early days it's useful to do a two-stage start-up to more easily verify the cluster configuration.
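For completeness, the stop scripts shipped with Hadoop 1.x mirror the start scripts, so the combined and two-stage forms look like this (a sketch; the scripts live alongside their start counterparts):

$ start-all.sh    # combined start: HDFS daemons, then MapReduce daemons
$ stop-all.sh     # combined shutdown of both sets of daemons
$ stop-mapred.sh  # two-stage shutdown: MapReduce first...
$ stop-dfs.sh     # ...then HDFS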