
Configuring Oozie in Hortonworks distribution

In this section, we will learn how to configure Oozie inside the Hortonworks Hadoop distribution using Ambari. We will configure the Oozie server to store all job information in a MySQL database instead of the default Derby database.

We will use a virtual machine for this walkthrough. Most other distributions, such as Cloudera and Pivotal, follow similar steps.

Let's start with the following steps:

  1. If you don't have VirtualBox on your machine, then download and install VirtualBox from https://www.virtualbox.org/wiki/Downloads.
  2. Download the Hortonworks single-node virtual machine from http://hortonworks.com/hdp/downloads/. The download can take 1-2 hours, depending on your Internet connection speed.

    Tip

    It is a good idea to store virtual machine images in a common folder. For example, I keep mine in a folder such as ~/dev/vm/. This makes virtual machine images easier to manage.

  3. After the download is complete, open VirtualBox and click on File | Import Appliance:

    Import appliance

  4. Click on the Import Appliance button, browse to the location where you downloaded the virtual machine image, and then click on Continue.
  5. Wait until VirtualBox finishes importing the new machine.
  6. Once the machine has been imported, click on Start machine in the virtual machine console.
  7. When the machine has finished booting, log in to the Ambari dashboard by opening the URL http://127.0.0.1:8080 in your browser.
  8. Use admin as both the username and the password.

    It will take some time for all services to start up and report their status to Ambari. Once they have reported their status, have a glance at all the services in the Ambari console. It is also a good idea to stop the services that we are not using, to reduce the load on the system.

  9. In the Ambari dashboard, click on the link named Oozie on the left side. You can see that Oozie has two components: Oozie Server and Oozie Client. Since we are using a single-node cluster, both the server and the client are installed on the same machine. In a production environment, you would install the Oozie server and the clients on different machines. We will submit jobs to the server using the client. Before submitting a job, we tell the client where the server is located using the OOZIE_URL variable.

    Tip

    To avoid specifying the Oozie server manually on the client machine every time, set the environment variable OOZIE_URL in your bash_profile or environment file, depending on the operating system you use: export OOZIE_URL=http://oozieserver:11000/oozie. In this book, oozieserver will be localhost.

  10. Now click on the Config link at the top and we will configure the database as MySQL. The Oozie server will use MySQL to store the job information:

    Ambari Oozie configuration

  11. You may notice that, at this moment, the server is configured to use a Derby database. Derby is fine for experimenting and testing, but not for running a production server. We will configure Oozie to use a MySQL database instead.
  12. Log in to the virtual machine using SSH as follows:
    $ ssh root@127.0.0.1 -p 2222
    

    The default password is hadoop.

  13. After you log in to the SSH session, log in to MySQL:
    $ mysql -u root
    
  14. Since this is a test virtual machine, no MySQL root password is configured. In production, the account would be password protected.
  15. At the MySQL prompt, execute the following SQL statements:
    CREATE USER 'oozie'@'%' IDENTIFIED BY 'hadoop';
    CREATE DATABASE oozie;
    GRANT ALL PRIVILEGES ON oozie.* TO 'oozie'@'%' WITH GRANT OPTION;
    

    The following output will be generated:

    Oozie database creation

  16. To make Oozie work with MySQL, we need the MySQL JDBC driver. Let's download it from the MySQL JDBC jar download section and extract the archive to a folder such as /root/mysql inside the virtual machine:
    $ cd ~/
    $ mkdir mysql
    $ cd mysql
    $ # Download the MySQL JDBC Driver
    $ wget http://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.36.tar.gz
    $ # Extract tar
    $ tar -xvf mysql-connector-java-5.1.36.tar.gz
    $ # Tell Ambari about the new MySQL JDBC driver so that it can use it
    $ ambari-server setup --jdbc-db=mysql --jdbc-driver=/root/mysql/mysql-connector-java-5.1.36/mysql-connector-java-5.1.36-bin.jar
    
  17. In the Ambari dashboard, configure the MySQL database with the details created in step 15: database name oozie, username oozie, and password hadoop.
  18. In the Ambari dashboard page, click on Test Connection. If all is well, a green tick appears. We have now configured the Oozie server to use a MySQL database instead of Derby.
  19. Finally, to confirm that Oozie works properly, in another browser tab open the Oozie dashboard by entering the URL http://127.0.0.1:11000/oozie.
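
The OOZIE_URL tip from step 9 can be sketched as a small shell snippet. This is only a minimal sketch: it assumes the server runs on localhost, as in this book's virtual machine, and that your profile file is ~/.bash_profile. The helper names persist_oozie_url and check_oozie are hypothetical and are not part of Oozie.

```shell
# Assumption: the Oozie server is on localhost, as in this book's single-node VM.
export OOZIE_URL="http://localhost:11000/oozie"

# Hypothetical helper: append the export line to a profile file
# (default ~/.bash_profile) only if that exact line is not already present.
persist_oozie_url() {
  profile="${1:-$HOME/.bash_profile}"
  line="export OOZIE_URL=$OOZIE_URL"
  grep -qxF "$line" "$profile" 2>/dev/null || echo "$line" >> "$profile"
}

# Hypothetical helper: ask the server for its status through the Oozie client.
# With OOZIE_URL set, the -oozie option does not have to be passed.
check_oozie() {
  if command -v oozie >/dev/null 2>&1; then
    oozie admin -status
  else
    echo "oozie client not found on PATH"
  fi
}
```

Calling persist_oozie_url once and then opening a new shell should make OOZIE_URL available to every later oozie command. Running check_oozie inside the virtual machine is a quick way to confirm that the client can reach the server.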

This completes the first section, in which we learned how to configure Oozie in the Hortonworks distribution using Ambari.
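
As a quick check of the finished setup, you can also query the server from the command line instead of the browser. The sketch below assumes that curl is installed, that the Oozie web services API is reachable under the same base URL as the dashboard, and that its admin status endpoint returns a small JSON document with a systemMode field. The helper names parse_system_mode and oozie_system_mode are hypothetical.

```shell
# Hypothetical helper: extract the systemMode value from a JSON string
# such as {"systemMode":"NORMAL"}. Prints nothing if the field is absent.
parse_system_mode() {
  printf '%s' "$1" |
    sed -n 's/.*"systemMode"[[:space:]]*:[[:space:]]*"\([A-Z_]*\)".*/\1/p'
}

# Hypothetical helper: fetch the admin status from the Oozie server and
# print its system mode. The base URL defaults to the one used in this section.
oozie_system_mode() {
  base="${1:-http://127.0.0.1:11000/oozie}"
  body=$(curl -s --max-time 5 "$base/v1/admin/status") || return 1
  parse_system_mode "$body"
}
```

Run oozie_system_mode from your host machine once the services have started. If it prints nothing or fails, the server is probably still starting up or the port is not forwarded from the virtual machine.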
