
MapReduce and Spark

MapReduce is a technique for aggregating large amounts of data in parallel, and it is particularly common in data analytics applications. Cassandra does not offer built-in MapReduce capabilities, but it can be integrated with Hadoop to run MapReduce operations across Cassandra data sets, or with Spark for real-time data analysis. The DataStax Enterprise product provides integration with both of these tools out of the box.
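To make the map and reduce phases concrete, here is a minimal single-process sketch in plain Python. It is only an illustration of the pattern, not Hadoop or Cassandra code: the `ROWS` list, the `hotel_id` and `nights` fields, and all function names are hypothetical stand-ins for records that a real job would read from a Cassandra table and process across many nodes.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical rows, standing in for records read from a Cassandra table.
ROWS = [
    {"hotel_id": "AZ123", "nights": 2},
    {"hotel_id": "NY229", "nights": 1},
    {"hotel_id": "AZ123", "nights": 3},
]


def map_phase(row):
    """Emit one (key, value) pair for a single input record."""
    return (row["hotel_id"], row["nights"])


def shuffle(pairs):
    """Group emitted values by key, as a framework does between the phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Fold all values for one key into a single aggregate (here, a sum)."""
    return (key, reduce(lambda a, b: a + b, values))


def map_reduce(rows):
    """Run map, shuffle, and reduce in sequence over an iterable of rows."""
    pairs = map(map_phase, rows)
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(map_reduce(ROWS))
```

In a real cluster, the map and reduce calls for different records and keys run on different machines, which is where the parallelism comes from; the shuffle step is the part the framework handles for you.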
Spark is a fast, distributed, and expressive computational engine for large-scale data processing, similar in purpose to MapReduce. It is generally much more efficient than MapReduce, largely because it can keep intermediate results in memory rather than writing them to disk between stages, and it runs with resource managers such as Mesos and YARN. It can read data from various sources, such as Hadoop or Cassandra, and even from streams such as Kafka. DataStax provides a Spark-Cassandra connector that loads data from Cassandra into Spark so that batch computations can be run on it.
