- Programming MapReduce with Scalding
- Antonios Chalkiopoulos
- 2021-12-08 12:44:22
Why Scala?
Software development has evolved a lot since Java was invented some 20 years ago. Java, as an imperative language, was designed for the von Neumann architecture, in which a computer consists of a processor, a memory, and a bus that reads both instructions and data from the memory into the processor. In that architecture, it is safe to store values in variables and then mutate them by assigning new values. Loops that repeatedly update a variable are thus normal to use, as shown in the following code:
for (int i = 0; i < 1000000; i++) { a = a + 1; }
However, over the past decade, hardware engineers have been stressing that the von Neumann model is no longer sustainable. Since processors hit physical limitations at high frequencies, engineers have looked beyond the single-processor model. Nowadays, manufacturers integrate multiple cores onto a single integrated circuit die, producing a multiprocessor chip. Similarly, the emergence of cloud computing and Hadoop clusters brings another dimension of computing into play, where resources are distributed across different nodes.
The imperative programming style dictates thinking in terms of time. In distributed programming, we need to think in terms of space: build one block, then another, and then another, like building with Lego. When building in space, it is easier to build each block in a different process and parallelize the execution of the required blocks.
Unfortunately, imperative logic is not compatible with modern distributed systems, cloud applications, and scalable systems. In practice, in a parallelized system it is unsafe to assign a new value to a variable, because the assignment happens on a single node and the other nodes are not aware of the local change. For this reason, the simple for loop shown earlier cannot be parallelized across 10 or 100 nodes.
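To make the contrast concrete, here is a minimal sketch in Scala. The object name `ChunkedCount` and both method names are hypothetical and chosen only for illustration. The first method mirrors the loop from the text, with one variable mutated in place. The second splits the same work into independent chunks, each producing its own partial result, which are then combined. Because no chunk touches shared state, each one could in principle be computed on a different core or node; this sketch still evaluates them sequentially.

```scala
object ChunkedCount {
  // Imperative style: a single shared variable is mutated on every iteration.
  def imperativeCount(n: Int): Int = {
    var a = 0
    for (_ <- 0 until n) a = a + 1
    a
  }

  // Functional style: independent chunks yield partial counts that are summed.
  def chunkedCount(n: Int, chunks: Int): Int = {
    require(chunks > 0, "chunks must be positive")
    val chunkSize = math.max(1, (n + chunks - 1) / chunks) // ceiling division
    (0 until n by chunkSize)
      .map(start => math.min(chunkSize, n - start))                  // size of each chunk
      .map(size => (0 until size).foldLeft(0)((acc, _) => acc + 1))  // partial count per chunk
      .sum                                                           // combine partial results
  }
}
```

Both `ChunkedCount.imperativeCount(1000000)` and `ChunkedCount.chunkedCount(1000000, 10)` return the same total; only the second is expressed as blocks that do not depend on one another.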
Effective software development techniques and language design have evolved over the past decade. Scala is an advanced, scalable language that discourages imperative features and promotes the development of functional and parallelizable code blocks. Scala keeps the object-oriented model and adds functional capabilities such as immutable values, higher-order functions, and pattern matching.
Moreover, Scala significantly reduces boilerplate code. Consider a simple Java class, as shown in the following code:
public class Person {
    public final String name;
    public final int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }
}
The preceding code can be expressed in Scala with just a single line, as shown:
case class Person(name: String, age: Int)
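Beyond the constructor and the immutable fields, a case class also gets structural equality, a readable `toString`, and a `copy` method generated by the compiler. The short sketch below illustrates this; the object name `PersonDemo` and the sample values are assumptions made for the example.

```scala
case class Person(name: String, age: Int)

object PersonDemo {
  val alice = Person("Alice", 30)   // no `new` needed: a factory method is generated
  val same  = Person("Alice", 30)   // a second instance with equal field values
  val older = alice.copy(age = 31)  // new instance; `alice` itself is unchanged
}
```

Here `PersonDemo.alice == PersonDemo.same` is true because equality compares fields rather than references, and `PersonDemo.alice.toString` yields `Person(Alice,30)`. The equivalent Java class would need hand-written `equals`, `hashCode`, and `toString` methods to behave the same way.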
For distributed computations and parallelism, Scala offers a rich collections library. Splitting an array of objects into two separate arrays takes a single call to partition, as shown in the following code:
val people: Array[Person] = Array(Person("Ann", 12), Person("Bob", 34)) // sample data for illustration
val (minors, adults) = people.partition(_.age < 18) // minors: age < 18, adults: everyone else
For concurrency, the Actor model provides actors, which are similar to objects but inherently concurrent. Actors communicate through message passing, and the model is designed to allow an unbounded number of new actors to be created. In effect, an actor under stress from a number of asynchronous requests can spawn more actors that live on different computers and JVMs, and have the network topology updated to achieve dynamic autoscaling through load balancing.
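The core idea, private state that is only ever changed in response to messages processed one at a time, can be sketched without any actor library. The class below is not an Akka actor or any real library's API: `CounterActor`, its `!` method, and `stopAndGet` are hypothetical names, and a single-threaded executor from `java.util.concurrent` stands in for the mailbox. It does not cover remote actors, supervision, or spawning new actors.

```scala
import java.util.concurrent.{Executors, TimeUnit}

// Hypothetical minimal "actor": its state is touched only by the mailbox thread.
final class CounterActor {
  private val mailbox = Executors.newSingleThreadExecutor()
  private var count = 0 // never mutated directly by senders

  // Asynchronous send: enqueue the message and return immediately.
  def !(msg: String): Unit = mailbox.execute(() => receive(msg))

  // Messages are handled strictly one at a time, in arrival order.
  private def receive(msg: String): Unit = msg match {
    case "increment" => count += 1
    case _           => () // unknown messages are ignored
  }

  // Stop accepting messages, drain the mailbox, and return the final state.
  def stopAndGet(): Int = {
    mailbox.shutdown()
    mailbox.awaitTermination(5, TimeUnit.SECONDS)
    count
  }
}
```

Any number of threads can call `actor ! "increment"` concurrently without locks, because the senders never modify `count` themselves; they only enqueue messages for the actor to process.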