
MapReduce and Spark

MapReduce is a technique for performing aggregate processing on large amounts of data in parallel; it is particularly common in data analytics applications. Cassandra does not offer built-in MapReduce capabilities, but it can be integrated with Hadoop to perform MapReduce operations across Cassandra data sets, or with Spark for real-time data analysis. The DataStax Enterprise product provides integration with both of these tools out of the box.
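To make the MapReduce model concrete, here is a minimal single-process sketch in plain Python. It is not Hadoop or Cassandra code: the `ROWS` data, the `hotel_id` and `nights` fields, and all function names are hypothetical and chosen only for illustration. It shows the three conceptual phases (map, shuffle, reduce) that a real framework would distribute across many nodes.

```python
from collections import defaultdict

# Hypothetical rows, standing in for records read from a Cassandra table.
ROWS = [
    {"hotel_id": "AZ123", "nights": 2},
    {"hotel_id": "NY229", "nights": 1},
    {"hotel_id": "AZ123", "nights": 3},
    {"hotel_id": "CA506", "nights": 4},
    {"hotel_id": "NY229", "nights": 5},
]


def map_phase(rows):
    """Map: emit one (key, value) pair per input row."""
    for row in rows:
        yield row["hotel_id"], row["nights"]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: collapse each key's list of values into one aggregate."""
    return {key: sum(values) for key, values in grouped.items()}


def total_nights_by_hotel(rows):
    """Run the three phases in sequence on a single machine."""
    return reduce_phase(shuffle_phase(map_phase(rows)))


if __name__ == "__main__":
    print(total_nights_by_hotel(ROWS))
```

In a real cluster, the map and reduce functions run in parallel on separate partitions of the data, and the framework performs the shuffle over the network; the sketch above only mirrors the data flow.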
Spark is a fast, distributed, and expressive computational engine for large-scale data processing, similar in purpose to MapReduce. It is typically much more efficient than MapReduce, largely because it can keep intermediate results in memory rather than writing them to disk between stages, and it runs under resource managers such as Mesos and YARN. It can read data from various sources such as Hadoop or Cassandra, or even from streams such as Kafka. DataStax provides a Spark-Cassandra connector to load data from Cassandra into Spark and run batch computations on it.
