
Choosing the right Hadoop distribution

In the previous section, we saw Hadoop evolve from a simple lab experiment into one of the best-known projects of the Apache Software Foundation. As it evolved, many commercial implementations of Hadoop emerged. Today, more than 10 different implementations exist in the market (Source). There is an ongoing debate about whether to go with fully open source Hadoop or with a commercial Hadoop implementation. Each approach has its pros and cons. Let's look at the open source approach first.

Pros of open source-based Hadoop include the following:

  • With a complete open source approach, you can take full advantage of community releases.
  • Because the software is free, it is easier and faster to reach customers, and the initial cost of investment is lower.
  • Open source Hadoop supports open standards, making it easier to integrate with other systems.

Cons of open source-based Hadoop include the following:

  • With a completely open source Hadoop, building implementations takes longer than with commercial software, because it lacks handy tools that speed up implementation.
  • Supporting customers and fixing issues can become tedious due to the chaotic nature of the open source community.
  • The product roadmap cannot be controlled or influenced based on business needs.

Given these challenges, companies often prefer commercial implementations of Apache Hadoop. We will cover some of the key Hadoop distributions in this section.
