
Choosing the right Hadoop distribution

In the previous section, we saw Hadoop evolve from a simple lab experiment into one of the best-known projects of the Apache Software Foundation. As it evolved, many commercial implementations of Hadoop emerged. Today, more than 10 different implementations exist in the market (Source). There is an ongoing debate about whether to go with fully open source Hadoop or with a commercial Hadoop implementation. Each approach has its pros and cons. Let's look at the open source approach first.

Pros of open source-based Hadoop include the following:

  • With a complete open source approach, you can take full advantage of community releases.
  • Because the software is free, it's easier and faster to reach customers. It also reduces the initial investment cost.
  • Open source Hadoop supports open standards, making it easier to integrate with other systems.

Cons of open source-based Hadoop include the following:

  • With fully open source Hadoop, it takes longer to build implementations than with commercial software, because it lacks the handy tools that speed up implementation.
  • Supporting customers and fixing issues can become tedious due to the chaotic nature of the open source community.
  • The product roadmap cannot be controlled or influenced based on business needs.

Given these challenges, companies often prefer to go with commercial implementations of Apache Hadoop. We will cover some of the key Hadoop distributions in this section.
