The Big Data ecosystem

For a beginner, the landscape can be utterly confusing. There is a vast arena of technologies and an equally varied set of use cases. There is no single go-to solution; every use case calls for a custom solution, and this sprawling technology stack, together with the lack of standardization, makes Big Data a difficult path for developers to tread. A multitude of technologies exist that can draw meaningful insight out of data of this magnitude.

Let's begin with the basics: the environment for building any data analytics application should provide for the following:

  • Storing data
  • Enriching or processing data
  • Data analysis and visualization
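
As a rough illustration of these three responsibilities, here is a minimal single-machine sketch in Python. It is not a Big Data implementation: an in-memory SQLite table merely stands in for a data store, and the sample events, field names, and the 100-byte threshold are all hypothetical values chosen for the example.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample events (channel, bytes); values are illustrative only.
EVENTS = [("web", 120), ("mobile", 80), ("web", 200), ("mobile", 40)]


def store(events):
    """Storing data: persist raw records (in-memory SQLite stands in for a data store)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (channel TEXT, bytes INTEGER)")
    conn.executemany("INSERT INTO events VALUES (?, ?)", events)
    conn.commit()
    return conn


def enrich(conn):
    """Enriching or processing data: derive a new field for each raw record."""
    rows = conn.execute("SELECT channel, bytes FROM events").fetchall()
    # The 100-byte threshold is an assumption made for this sketch.
    return [{"channel": c, "bytes": b, "is_large": b >= 100} for c, b in rows]


def analyze(records):
    """Data analysis: aggregate total bytes per channel."""
    totals = defaultdict(int)
    for record in records:
        totals[record["channel"]] += record["bytes"]
    return dict(totals)


def visualize(totals, width=20):
    """Visualization: render a crude text bar chart, one line per channel."""
    peak = max(totals.values())
    return [
        f"{name:<8}{'#' * round(width * value / peak)} {value}"
        for name, value in sorted(totals.items())
    ]


conn = store(EVENTS)
records = enrich(conn)
totals = analyze(records)
chart = visualize(totals)
print("\n".join(chart))
```

In a real deployment each function would typically be replaced by a dedicated system, which is exactly where the specialized tools discussed next come in.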

If we get to specialization, there are specific Big Data tools and technologies available: for instance, ETL tools such as Talend and Pentaho; batch processing with Pig, Hive, and MapReduce; real-time processing with Storm, Spark, and so on; and the list goes on. Here's the pictorial representation of the vast Big Data technology landscape, as per Forbes:

Source: http://www.forbes.com/sites/davefeinleib/2012/06/19/the-big-data-landscape/

It clearly depicts the various segments and verticals within the Big Data technology canvas:

  • Platforms such as Hadoop and NoSQL
  • Analytics such as HDP, CDH, EMC, Greenplum, DataStax, and more
  • Infrastructure such as Teradata, VoltDB, MarkLogic, and more
  • Infrastructure as a Service (IaaS) such as AWS, Azure, and more
  • Structured databases such as Oracle, SQL Server, DB2, and more
  • Data as a Service (DaaS) such as INRIX, LexisNexis, Factual, and more

And, beyond that, we have a score of segments related to specific problem areas such as Business Intelligence (BI), analytics and visualization, advertisement and media, log data and vertical apps, and so on.
