- Hands-On Big Data Modeling
- James Lee, Tao Wei, Suresh Kumar Mukhiya
- 2021-06-10 18:58:53
Reasons to choose Apache Spark
Apache Spark is very popular in the big data community these days. Here are some of the most prominent reasons for using Apache Spark in big data modeling and computation:
- Speed: Speed matters when processing large datasets. Spark can run computations up to one hundred times faster than Hadoop MapReduce when working in memory, or ten times faster on disk.
- Accessibility: Spark was developed to be highly accessible, offering simple APIs in Python, Java, Scala, and SQL, along with rich built-in libraries. It also integrates with other big data tools, including Hadoop clusters and data sources such as Cassandra.
- Platform support: Apache Spark was built to run on Hadoop, on Mesos, in standalone mode, or in the cloud. It can access diverse data sources, including HDFS, Cassandra, HBase, and S3.
- Generality: Spark was developed to cover a wide range of workloads, including batch applications, iterative algorithms, interactive queries, and streaming. By supporting these workloads in the same engine, Spark makes it easy and inexpensive to combine different processing types, which production data analysis pipelines often require.