The Big Data ecosystem

For a beginner, the Big Data landscape can be utterly confusing. There is a vast arena of technologies and an equally varied set of use cases. There is no single go-to solution: every use case calls for a custom solution, and this sprawling technology stack, combined with a lack of standardization, makes Big Data a difficult path for developers to tread. A multitude of technologies exist that can draw meaningful insight out of data of this magnitude.

Let's begin with the basics. The environment for building any data analytics application should provide for the following:

  • Storing data
  • Enriching or processing data
  • Data analysis and visualization
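As a rough illustration of these three responsibilities, here is a minimal Python sketch. It assumes a toy in-memory list in place of a real data store, and the event fields (`service`, `status`) and the function names `store`, `enrich`, and `analyze` are hypothetical, chosen only for this example; it does not represent any particular Big Data tool.

```python
import json
from collections import Counter


def store(records, storage):
    """Stage 1 (storing): serialize each record and append it to the store.

    A plain list stands in for a real storage layer (assumption)."""
    storage.extend(json.dumps(record) for record in records)


def enrich(storage):
    """Stage 2 (enriching/processing): parse records and add a derived field."""
    enriched = []
    for line in storage:
        record = json.loads(line)
        # Derived field: flag server-side failures (HTTP status 500 and above).
        record["is_error"] = record["status"] >= 500
        enriched.append(record)
    return enriched


def analyze(enriched):
    """Stage 3 (analysis): count flagged errors per service."""
    return Counter(r["service"] for r in enriched if r["is_error"])


# Hypothetical sample events, invented for this sketch.
raw_events = [
    {"service": "checkout", "status": 200},
    {"service": "checkout", "status": 503},
    {"service": "search", "status": 500},
    {"service": "checkout", "status": 500},
]

storage = []
store(raw_events, storage)
errors_by_service = analyze(enrich(storage))

# A text report stands in for the visualization step.
for service, count in errors_by_service.most_common():
    print(f"{service}: {count} error(s)")
```

In a real deployment, each stage would typically be handled by a dedicated tool rather than a function in a single process, but the division of responsibilities stays the same.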

When it comes to specialization, specific Big Data tools and technologies are available for each task: for instance, ETL tools such as Talend and Pentaho; batch processing with Pig, Hive, and MapReduce; real-time processing with Storm, Spark, and so on; and the list goes on. Here's the pictorial representation of the vast Big Data technology landscape, as per Forbes:

Source: http://www.forbes.com/sites/davefeinleib/2012/06/19/the-big-data-landscape/

It clearly depicts the various segments and verticals within the Big Data technology canvas:

  • Platforms such as Hadoop and NoSQL
  • Analytics such as HDP, CDH, EMC, Greenplum, DataStax, and more
  • Infrastructure such as Teradata, VoltDB, MarkLogic, and more
  • Infrastructure as a Service (IaaS) such as AWS, Azure, and more
  • Structured databases such as Oracle, SQL Server, DB2, and more
  • Data as a Service (DaaS) such as INRIX, LexisNexis, Factual, and more

Beyond that, we have a score of segments related to specific problem areas, such as Business Intelligence (BI), analytics and visualization, advertisement and media, log data and vertical apps, and so on.
