Chapter 1. Setting up Oozie

Oozie is a workflow scheduler system for running Apache Hadoop jobs. Oozie workflow jobs are Directed Acyclic Graphs (DAGs) of actions. More information on DAGs can be found at https://en.wikipedia.org/wiki/Directed_acyclic_graph. Actions define what the job does. Oozie supports actions of various types, such as Java, MapReduce, Pig, Hive, Sqoop, Spark, and DistCp. The output of one action can be consumed by the next action, forming a chain of steps.
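
To make the DAG idea concrete, here is a minimal Python sketch. It is not Oozie code and does not use any Oozie API; the action names are hypothetical and were chosen only to mirror the action types listed above. It models a workflow as a mapping from each action to the actions it depends on, then derives a valid execution order with the standard library's graphlib module (Python 3.9 or later).

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical workflow (illustration only, not an Oozie definition):
# each key is an action, each value is the set of actions it depends on.
workflow = {
    "import_with_sqoop": set(),
    "clean_with_pig": {"import_with_sqoop"},
    "aggregate_with_hive": {"clean_with_pig"},
    "export_with_distcp": {"aggregate_with_hive"},
}


def execution_order(dag):
    """Return the actions in an order that respects every dependency."""
    return list(TopologicalSorter(dag).static_order())


def is_acyclic(dag):
    """Return True if the graph has no cycle, that is, it is a valid DAG."""
    try:
        TopologicalSorter(dag).prepare()
        return True
    except CycleError:
        return False


order = execution_order(workflow)
print(" -> ".join(order))
```

Because each action here depends only on the one before it, the order is a simple chain. A graph in which two actions depend on each other would fail the is_acyclic check, which is why a workflow must be acyclic: no valid execution order would exist.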

Oozie has a client-server architecture: we install the server, which stores the jobs, and we use the client to submit our jobs to the server.

In this chapter, we will learn how to install Oozie both for learning purposes and for production. For learning purposes, we will build Oozie from the source code, and for production we will use the Hadoop distribution by Hortonworks. Throughout the book, we will use the Hortonworks single-node virtual machine. If you are using a different Hadoop distribution, there is no need to worry: all distributions package the same Oozie software, which is developed by the Apache community (http://oozie.apache.org).

After reading this chapter, we will be able to:

  • Configure Oozie in the Hortonworks distribution using Ambari
  • Install Oozie from the source code provided as a tarball on the Apache Oozie website