Spark components

As discussed earlier in this chapter, the main philosophy behind Spark is to provide a unified engine for creating different types of big data applications. Spark provides a variety of libraries to work with batch analytics, streaming, machine learning, and graph analysis.

These kinds of processing were all done before Spark, but every new big data problem brought a new tool to the market: for batch analysis, we had MapReduce, Hive, and Pig; for streaming, we had Apache Storm; for machine learning, we had Mahout. Although these tools solve the problems they were designed for, each one comes with its own learning curve. This is where Spark has an advantage. It provides a unified stack for solving all of these problems, with components designed for each kind of big data processing. It also provides libraries to read and write data in different formats, such as JSON, CSV, and Parquet.

The following diagram shows the Spark stack:

Spark stack

Having a unified stack brings several advantages:

  • First is code sharing and reusability. Components developed by the data engineering team can easily be integrated by the data science team, which avoids code redundancy.
  • Second, new tools keep arriving on the market, each solving a different big data use case. Most developers struggle to learn each new tool and gain enough expertise to use it efficiently. With Spark, developers only have to learn the basic concepts, which then lets them work on different big data use cases.
  • Third, the unified stack lets developers explore new ideas without installing new tools.

The following diagram provides a high-level overview of the different big data applications powered by Spark:

Spark use cases