- Mastering Apache Storm
- Ankit Jain
- 2021-07-02 20:32:28
Deployment of the ZooKeeper cluster
In any distributed application, various processes need to coordinate with each other and share configuration information. ZooKeeper is an application that provides these services reliably. Being a distributed application, Storm also uses a ZooKeeper cluster to coordinate its processes. All of the state associated with the cluster and with the tasks submitted to Storm is stored in ZooKeeper. This section describes how to set up a ZooKeeper cluster. We will deploy a ZooKeeper ensemble of three nodes, which can tolerate one node failure. The following is the deployment diagram of the three-node ZooKeeper ensemble:

In the ZooKeeper ensemble, one node in the cluster acts as the leader, while the rest are followers. If the leader node of the ZooKeeper cluster dies, an election takes place among the remaining live nodes, and a new leader is elected. All write requests coming from clients are forwarded to the leader node, while the follower nodes serve read requests themselves. For this reason, adding nodes does not increase the write performance of the ZooKeeper ensemble: every write operation still goes through the leader node.
It is advised to run an odd number of ZooKeeper nodes, because the ZooKeeper cluster keeps working only as long as a majority of the nodes are running (the number of live nodes must be greater than n/2, where n is the number of deployed nodes). With a cluster of four ZooKeeper nodes, three must stay alive (3 > 4/2), so we can handle only one node failure; with five nodes, three must also stay alive (3 > 5/2), so we can handle two node failures. A four-node ensemble therefore tolerates no more failures than a three-node one.
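The majority rule can be checked with a few lines of shell arithmetic. This is a minimal sketch; the function names quorum_size and tolerated_failures are illustrative and are not part of ZooKeeper.

```shell
#!/bin/sh
# Majority (quorum) arithmetic for a ZooKeeper ensemble of n nodes.
# Function names are illustrative, not part of ZooKeeper.

# Smallest number of live nodes that still forms a majority: floor(n/2) + 1.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Number of nodes that may fail while a majority survives: n - quorum.
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

# Compare a few ensemble sizes side by side.
for n in 3 4 5 6; do
    echo "nodes=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

Running it shows that three and four nodes both tolerate one failure, while five and six both tolerate two, which is why the even sizes buy nothing extra.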
Perform the following steps on each node to deploy the ZooKeeper ensemble:
- Download the latest stable ZooKeeper release from the ZooKeeper site (http://zookeeper.apache.org/releases.html). At the time of writing, the latest version is ZooKeeper 3.4.6.
- Once you have downloaded the release, extract the archive. To make the rest of the setup easier, we use a ZK_HOME environment variable.
- Point the ZK_HOME environment variable to the extracted directory. Create the configuration file, zoo.cfg, in the $ZK_HOME/conf directory using the following commands:
    cd $ZK_HOME/conf
    touch zoo.cfg
- Add the following properties to the zoo.cfg file:
    tickTime=2000
    dataDir=/var/zookeeper
    clientPort=2181
    initLimit=5
    syncLimit=2
    server.1=zoo1:2888:3888
    server.2=zoo2:2888:3888
    server.3=zoo3:2888:3888
Here, zoo1, zoo2, and zoo3 stand for the hostnames or IP addresses of the ZooKeeper nodes. The properties are defined as follows:
- tickTime: This is the basic unit of time in milliseconds used by ZooKeeper. It is used to send heartbeats, and the minimum session timeout will be twice the tickTime value.
- dataDir: This is the directory that stores the snapshots of the in-memory database and the transaction log.
- clientPort: This is the port used to listen to client connections.
- initLimit: This is the maximum number of ticks (multiples of tickTime) that followers are allowed to take to connect and sync to the leader node.
- syncLimit: This is the number of tickTime values that a follower can take to sync with the leader node. If the sync does not happen within this time, the follower will be dropped from the ensemble.
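Because initLimit and syncLimit are expressed in ticks, the effective time windows follow from multiplying them by tickTime. The sketch below works this out for the values used in this section; read_prop is an illustrative helper written for this example, not a ZooKeeper tool, and the configuration is written to a temporary file rather than to $ZK_HOME/conf.

```shell
#!/bin/sh
# Derive the time windows implied by the zoo.cfg values used in this section.
# read_prop is an illustrative helper, not a ZooKeeper tool.

# Print the value of a key=value property from a config file.
read_prop() {
    # $1 = config file, $2 = property name
    sed -n "s/^$2=//p" "$1" | head -n 1
}

# Write the timing-related properties to a temporary file.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
tickTime=2000
dataDir=/var/zookeeper
clientPort=2181
initLimit=5
syncLimit=2
EOF

tick=$(read_prop "$cfg" tickTime)
init=$(read_prop "$cfg" initLimit)
sync=$(read_prop "$cfg" syncLimit)

# All results are in milliseconds, like tickTime itself.
min_session_ms=$(( 2 * tick ))
init_ms=$(( init * tick ))
sync_ms=$(( sync * tick ))

echo "minimum session timeout: ${min_session_ms} ms"
echo "initial connect/sync window: ${init_ms} ms"
echo "follower sync window: ${sync_ms} ms"

rm -f "$cfg"
```

With tickTime=2000, this gives a 4000 ms minimum session timeout, a 10000 ms window for the initial connect and sync, and a 4000 ms window for a follower to stay in sync.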
The last three lines of zoo.cfg, in the server.id=host:port:port format, specify that there are three nodes in the ensemble. The first port is used by followers to connect to the leader, and the second is used for leader election. In an ensemble, each ZooKeeper node must have a unique ID number between 1 and 255. This ID is defined by creating a file named myid in the dataDir directory of each node. For example, the node with the ID 1 (server.1=zoo1:2888:3888) will have a myid file in the /var/zookeeper directory containing the text 1.
For this cluster, create the myid file at three locations, shown as follows:
    At zoo1, /var/zookeeper/myid contains 1
    At zoo2, /var/zookeeper/myid contains 2
    At zoo3, /var/zookeeper/myid contains 3
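Since the ID in myid must match the server.N line for the same host, one way to avoid mismatches is to derive it from zoo.cfg. The following is a sketch only: server_id_for and write_myid are hypothetical helper names, the host is matched as a plain name (an IP address would need its dots escaped), and the demo writes to a temporary directory instead of /var/zookeeper.

```shell
#!/bin/sh
# Sketch: derive a node's myid from the server.N lines in zoo.cfg.
# server_id_for and write_myid are hypothetical helper names.

# Print the N of the "server.N=<host>:..." line that matches the given host.
server_id_for() {
    # $1 = path to zoo.cfg, $2 = host name as written in zoo.cfg
    sed -n "s/^server\.\([0-9][0-9]*\)=$2:.*/\1/p" "$1" | head -n 1
}

# Write the ID into <data_dir>/myid; fail if the host is not listed.
write_myid() {
    # $1 = path to zoo.cfg, $2 = host name, $3 = data directory
    id=$(server_id_for "$1" "$2")
    [ -n "$id" ] || { echo "host $2 not found in $1" >&2; return 1; }
    mkdir -p "$3" && echo "$id" > "$3/myid"
}

# Demo against a temporary directory instead of /var/zookeeper.
demo=$(mktemp -d)
cat > "$demo/zoo.cfg" <<'EOF'
server.1=zoo1:2888:3888
server.2=zoo2:2888:3888
server.3=zoo3:2888:3888
EOF
write_myid "$demo/zoo.cfg" zoo2 "$demo/data"
cat "$demo/data/myid"   # prints 2
```

On a real node, the call would take the node's own name and the dataDir from zoo.cfg, for example write_myid "$ZK_HOME/conf/zoo.cfg" zoo1 /var/zookeeper.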
- Run the following command from the $ZK_HOME directory on each machine to start the ZooKeeper cluster:
bin/zkServer.sh start
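The extraction and configuration-file steps above can be sketched as a small script to run on each node. This is an assumption-laden outline rather than an official installer: prepare_node is a hypothetical helper, the release is assumed to be a .tar.gz archive that unpacks into a single top-level directory, and the paths in the example are placeholders.

```shell
#!/bin/sh
# Sketch of the per-node preparation steps. prepare_node is a hypothetical helper,
# and the archive is assumed to unpack into a single top-level directory.

# $1 = release archive (.tar.gz), $2 = directory to unpack into.
# Prints the resulting ZooKeeper home directory.
prepare_node() {
    mkdir -p "$2" || return 1
    tar -xzf "$1" -C "$2" || return 1
    # Name of the top-level directory inside the archive.
    top=$(tar -tzf "$1" | head -n 1 | cut -d/ -f1)
    # Create an empty zoo.cfg, to be filled in with the ensemble properties.
    mkdir -p "$2/$top/conf" && touch "$2/$top/conf/zoo.cfg" || return 1
    echo "$2/$top"
}

# Example (paths are placeholders):
# ZK_HOME=$(prepare_node zookeeper-3.4.6.tar.gz "$HOME") && export ZK_HOME
# ... fill in "$ZK_HOME/conf/zoo.cfg" and the myid file, then:
# "$ZK_HOME/bin/zkServer.sh" start
```

Keeping the start command out of the function makes it safe to rerun the preparation without touching a server that is already running.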
Check the status of the ZooKeeper nodes by performing the following steps:
- Run the following command on the zoo1 node to check the first node's status:
bin/zkServer.sh status
The following information is displayed:
    JMX enabled by default
    Using config: /home/root/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Mode: follower
The first node is running in follower mode.
- Check the status of the second node by running the following command on the zoo2 node:
bin/zkServer.sh status
The following information is displayed:
    JMX enabled by default
    Using config: /home/root/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Mode: leader
The second node is running in leader mode.
- Check the status of the third node by running the following command on the zoo3 node:
bin/zkServer.sh status
The following information is displayed:
    JMX enabled by default
    Using config: /home/root/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Mode: follower
The third node is running in follower mode.
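Instead of logging in to each machine, the role of every node can also be queried from one place over the client port. The sketch below assumes that nc (netcat) is installed and that ZooKeeper's four-letter srvr command is enabled; the helper names parse_mode, node_mode, and ensemble_status are made up for this example.

```shell
#!/bin/sh
# Sketch: report each node's role from one machine.
# Assumes nc (netcat) is installed and the four-letter "srvr" command is enabled.

# Extract the role from a "Mode: <role>" line in text read on stdin.
parse_mode() {
    awk -F': ' '$1 == "Mode" { print $2; exit }'
}

# Ask one node for its role over the client port; prints "unreachable" on failure.
node_mode() {
    # $1 = host, $2 = client port (defaults to 2181)
    mode=$(echo srvr | nc -w 2 "$1" "${2:-2181}" 2>/dev/null | parse_mode)
    echo "${mode:-unreachable}"
}

# Print one "host role" line per node.
ensemble_status() {
    for host in "$@"; do
        echo "$host $(node_mode "$host")"
    done
}

# Example call (requires a running ensemble):
# ensemble_status zoo1 zoo2 zoo3
```

The same parse_mode filter also works on the output of bin/zkServer.sh status, because both report the role on a line beginning with Mode.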
- Run the following command on the leader machine to stop the leader node:
bin/zkServer.sh stop
Now, check the status of the remaining two nodes by performing the following steps:
- Check the status of the first node using the following command:
bin/zkServer.sh status
The following information is displayed:
    JMX enabled by default
    Using config: /home/root/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Mode: follower
The first node is again running in follower mode.
- Check the status of the third node using the following command:
bin/zkServer.sh status
The following information is displayed:
    JMX enabled by default
    Using config: /home/root/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Mode: leader
The third node is elected as the new leader.
- Now, restart the node that was stopped (the second node) with the following command:
    bin/zkServer.sh start
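The failover test above boils down to two conditions: a majority of the nodes is still alive, and exactly one of them is the leader. A minimal sketch of that check follows. The input format (one "host role" line per node) and the name check_ensemble are assumptions made for this example.

```shell
#!/bin/sh
# Sketch: decide whether an ensemble is healthy from "host role" lines on stdin.
# check_ensemble and its input format are assumptions made for this example.

# $1 = number of deployed nodes. Succeeds when a majority is alive
# and exactly one live node reports itself as leader.
check_ensemble() {
    awk -v n="$1" '
        $2 == "leader"   { leaders++ }
        $2 == "follower" { followers++ }
        END {
            live = leaders + followers
            quorum = int(n / 2) + 1
            printf "live=%d quorum=%d leaders=%d\n", live, quorum, leaders
            exit !(live >= quorum && leaders == 1)
        }'
}

# State after stopping the old leader (the second node) in a three-node ensemble.
printf 'zoo1 follower\nzoo2 unreachable\nzoo3 leader\n' | check_ensemble 3
```

For the state shown, the function prints live=2 quorum=2 leaders=1 and succeeds; if a second node were also down, the live count would fall below the quorum and the check would fail.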
This was a quick introduction to setting up ZooKeeper for development; the setup described here is not suitable for production. For a complete reference on ZooKeeper administration and maintenance, refer to the online documentation on the ZooKeeper site at http://zookeeper.apache.org/doc/trunk/zookeeperAdmin.html.