Chapter 1. Setting up Oozie

Oozie is a workflow scheduler system for running Apache Hadoop jobs. Oozie workflow jobs are Directed Acyclic Graphs (DAGs) of actions. More information on DAGs can be found at https://en.wikipedia.org/wiki/Directed_acyclic_graph. Actions define what the job does. Oozie supports actions of various types, such as Java, MapReduce, Pig, Hive, Sqoop, Spark, and DistCp. The output of one action can be consumed by the next action, so actions can be chained into a sequence.
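To make the idea of a DAG of actions concrete, here is a minimal sketch in plain Python. It does not use Oozie or any Oozie API; the action names and the `run_order` helper are hypothetical and chosen only for illustration. It shows how dependencies between actions determine a valid execution order, and how a cycle would make the workflow invalid.

```python
from graphlib import CycleError, TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow (not real Oozie syntax): each key is an action,
# and each value is the set of actions that must finish before it starts.
workflow = {
    "import_with_sqoop": set(),
    "clean_with_pig": {"import_with_sqoop"},
    "aggregate_with_hive": {"clean_with_pig"},
    "copy_with_distcp": {"aggregate_with_hive"},
}


def run_order(dag):
    """Return one valid execution order for the actions in `dag`.

    Raises graphlib.CycleError if the graph contains a cycle,
    that is, if it is not a DAG.
    """
    return list(TopologicalSorter(dag).static_order())


order = run_order(workflow)
print(order)

# A graph with a cycle cannot be ordered, so it is not a valid workflow.
try:
    run_order({"a": {"b"}, "b": {"a"}})
    cycle_detected = False
except CycleError:
    cycle_detected = True
print(cycle_detected)
```

Because each action in this sketch depends only on the one before it, the actions form a simple chain and there is exactly one valid order. A workflow with independent branches would allow several valid orders.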

Oozie has a client-server architecture: we install the server, which stores the jobs, and we use the client to submit our jobs to the server.

In this chapter, we will learn how to install Oozie both for learning purposes and for production. For learning purposes, we will build Oozie from the source code, and for production we will use the Hadoop distribution by Hortonworks. Throughout the book, we will use the Hortonworks single-node virtual machine. If you are using a different Hadoop distribution, you should not worry: every distribution packages the same Oozie software, which is developed by the Apache community (http://oozie.apache.org).

After reading this chapter, we will be able to:

  • Configure Oozie in Hortonworks distribution using Ambari
  • Install Oozie from the source code provided as a tarball on the Apache Oozie website