- Programming MapReduce with Scalding
- Antonios Chalkiopoulos
- 2021-12-08 12:44:22
Scala basics
Scala modernizes Java's object-oriented approach while adding functional programming to the mix. It compiles to bytecode that runs on any Java virtual machine, so Java and Scala libraries and classes interoperate seamlessly.
Scala, like Java, is a statically typed programming language, but it can infer type information. In the following example, it infers that t is of type String:
val t = "Text"
Semicolons are not required to terminate statements. Variables are declared with var and constants with val. Scala favors immutability, which means that we should try to minimize the use of variables.
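As a minimal sketch of the points above (the object and value names are illustrative, not from the book), the following shows type inference, a reassignable var, and the immutable style of deriving a new value instead of mutating an existing one:

```scala
object ValVarExample {
  val greeting = "Text"   // type inferred as String
  val count: Int = 3      // an explicit type annotation is allowed but optional here

  var total = 0           // a var can be reassigned
  total = total + count

  // greeting = "Other"   // would not compile: reassignment to val

  // Immutable style: derive a new value instead of changing an old one
  val doubled = count * 2
}
```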
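Scala is both fully object-oriented and functional. There are no primitives such as Java's float or int at the language level, only objects such as Int, Long, Double, String, Boolean, and Float. The null reference still exists, mainly for Java interoperability, but idiomatic Scala avoids it.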
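Because numeric values are objects, methods can be called on them directly, and operators are ordinary method calls. A small sketch with illustrative names:

```scala
object EverythingIsAnObject {
  val sum = (1).+(2)        // operators are methods: same as 1 + 2
  val text = 42.toString    // a method call on an Int literal
  val larger = 3.max(7)     // max is available on Int values
  val asDouble = 5.toDouble // conversion is a method as well
}
```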
The Scala equivalent of a Java interface is called a trait. Scala allows traits to be partially implemented; that is, a trait can define default implementations for some of its methods. A Scala class can extend another class and implement multiple traits using the with keyword.
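The following sketch illustrates a partially implemented trait and a class that extends another class while mixing in two traits. Greeter, Logger, Person, and Employee are hypothetical names chosen for the example:

```scala
object TraitExample {
  trait Greeter {
    def name: String                      // abstract: must be provided by the class
    def greet: String = "Hello, " + name  // default implementation
  }

  trait Logger {
    def log(msg: String): String = "[log] " + msg
  }

  class Person

  // Extends one class and mixes in two traits using the with keyword
  class Employee(val name: String) extends Person with Greeter with Logger

  val employee = new Employee("Ada")
  val greeting = employee.greet
  val logged = employee.log(greeting)
}
```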
Lists are among the most frequently used data types in functional languages. Scala lists are immutable and homogeneous, which means that all elements of a list are of the same type. Lists provide many methods and higher-order functions. Some notable ones are as follows:
- list1 ::: list2: Concatenates the two lists
- list.reverse: Returns the list in reverse order
- list.mkString(string): Concatenates the list elements into a string, placing the given string between them
- list.map(function): Returns a new list with the function applied to each element
- list.filter(predicate): Returns a list with the elements for which the predicate is true
- list.sortWith(comparisonFunction): Returns a sorted list using a two-parameter comparison function
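The list operations above can be tried out as in the following sketch. The sample values are arbitrary, and the comments show the result of each expression:

```scala
object ListExample {
  val list1 = List(3, 1, 2)
  val list2 = List(5, 4)

  val appended = list1 ::: list2                   // List(3, 1, 2, 5, 4)
  val reversed = list1.reverse                     // List(2, 1, 3)
  val joined = list1.mkString(", ")                // "3, 1, 2"
  val doubled = list1.map(x => x * 2)              // List(6, 2, 4)
  val odd = appended.filter(x => x % 2 == 1)       // List(3, 1, 5)
  val sorted = appended.sortWith((a, b) => a < b)  // List(1, 2, 3, 4, 5)
}
```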
Understanding the higher-order functions of Scala lists is very beneficial for developing in Scalding. In the next chapter, we will see that Scalding provides functions with the same names and similar behavior that operate on pipes containing tuples.
For example, the Scala function flatMap applies a function that returns a collection to each element of a list and then removes one level of nesting by concatenating the results. The same function in Scalding also removes one level of nesting by iterating through a collection to generate new rows.
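To make the difference from map concrete, here is a small sketch on a plain Scala list (it covers only the Scala collection function, not the Scalding one). The sample sentences are made up for the example:

```scala
object FlatMapExample {
  val lines = List("hello world", "scala basics")

  // map keeps the nesting: one inner list per input line
  val nested = lines.map(line => line.split(" ").toList)
  // List(List(hello, world), List(scala, basics))

  // flatMap applies the same function and flattens one level
  val words = lines.flatMap(line => line.split(" ").toList)
  // List(hello, world, scala, basics)
}
```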
Another interesting Scala function is groupBy, which returns a Map of key → values, where the keys are the results of applying a function to each element of the list, and each value is a List of the elements for which the function yields that key:
List("one", "two", "three").groupBy(x => x.length) gives Map(5 -> List(three), 3 -> List(one, two))
Tuples are containers that, unlike an array or a list, can hold objects of different types. A Scala tuple consists of 2 to 22 comma-separated objects enclosed in parentheses and is immutable. To access the nth value in a tuple t, we can use the notation t._n, where n is a literal integer in the range 1 (not 0!) to 22.
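A short sketch of creating a tuple and reading its values, both with the t._n notation and by destructuring it into named values (the values themselves are illustrative):

```scala
object TupleExample {
  val t = ("Scalding", 2, true)     // a Tuple3[String, Int, Boolean]

  val first = t._1                  // "Scalding" (indexing starts at 1)
  val second = t._2                 // 2

  // A tuple can also be unpacked into separate values in one step
  val (name, version, stable) = t
}
```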
To avoid null references, which cause many issues, Scala provides the Option type. Options are parameterized types. For example, an Option[String] has the possible values Some(value) (where value is a String) or None (when no value has been found).
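One common place where Option appears is Map lookups, which return Some(value) or None instead of null. The following sketch uses a made-up capitals map and a hypothetical describe helper to show two ways of handling both cases:

```scala
object OptionExample {
  val capitals = Map("France" -> "Paris", "Greece" -> "Athens")

  val found: Option[String] = capitals.get("Greece")      // Some(Athens)
  val missing: Option[String] = capitals.get("Atlantis")  // None

  // Supply a default when no value is present
  val fallback = missing.getOrElse("unknown")

  // Pattern matching handles both cases explicitly
  def describe(capital: Option[String]): String = capital match {
    case Some(city) => "capital is " + city
    case None       => "no capital found"
  }
}
```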
Methods in Scala are public by default and can have private or protected access similar to Java. The syntax is:
def methodName(arg1: type, argN: type) { body } // returns Unit
def methodName(arg1: type, .. , argN: type): returnType = { body }
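The following sketch shows a few concrete method definitions with illustrative names. It declares the Unit return type explicitly with = rather than using the brace-only form, since that form is deprecated in newer Scala versions:

```scala
object MethodExample {
  // Explicit return type
  def add(a: Int, b: Int): Int = { a + b }

  // Return type inferred from the body
  def square(x: Int) = x * x

  // Private: callable only from inside MethodExample
  private def secret(): String = "hidden"

  // Public by default
  def reveal(): String = secret().toUpperCase

  // A method that returns Unit
  def logMessage(msg: String): Unit = { println(msg) }
}
```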
Another aspect of Scala is function literals. A function literal (also called an anonymous function) is an alternative syntax for defining a function. It is useful for defining one-liners and for passing a function as an argument to a method. The syntax is (arg1: Type1, ..., argN: TypeN) => expression. Thus, when implementing the function in string.map(function), we can avoid defining an external function by using the following:
"aBcDeF".map(x => x.toLower)
// or, for a single parameter, just use _
"aBcDeF".map(_.toLower)