- Programming MapReduce with Scalding
- Antonios Chalkiopoulos
- 2021-12-08 12:44:22
Scala basics
Scala modernizes Java's object-oriented approach and adds functional programming to the mix. It compiles to bytecode that runs on any Java virtual machine, so Java and Scala libraries and classes interoperate seamlessly.
Scala, similar to Java, is a statically typed programming language, but it can infer type information. In the following example, it infers that t is a String:
val t = "Text"
Semicolons are not required to terminate statements. Variables are declared with var and constants with val. Scala favors immutability, which means that we should try to minimize the use of variables.
Scala is fully object-oriented and functional. At the language level there are no primitives like Java's float or int, only objects such as Int, Long, Double, String, Boolean, and Float. Scala does have null, mainly for Java interoperability, but idiomatic code avoids it.
The Scala equivalent of a Java interface is called a trait. Scala allows traits to be partially implemented, that is, it is possible to define default implementations for some methods. A Scala class can extend another class and implement multiple traits using the with keyword.
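As a minimal sketch of the paragraph above, the following shows a trait with one abstract member and one default implementation, mixed into a class that also extends another class. The names Greeter, Logger, Person, and Employee are hypothetical and chosen only for illustration.

```scala
// Hypothetical names, for illustration only.
trait Greeter {
  def name: String                          // abstract: must be provided
  def greet: String = s"Hello, $name"       // default implementation
}

trait Logger {
  def log(message: String): String = s"[log] $message"
}

class Person(val name: String)

// One superclass, any number of traits mixed in using `with`.
class Employee(fullName: String) extends Person(fullName) with Greeter with Logger

object TraitExample {
  val employee = new Employee("Ada")

  def main(args: Array[String]): Unit = {
    println(employee.greet)
    println(employee.log("created"))
  }
}
```

Here the name member inherited from Person satisfies the abstract name declared in Greeter, so Employee needs no body of its own.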
Lists are among the most heavily used data types in functional languages. Scala lists are immutable and homogeneous, which means that all elements of a list are of the same type. Lists provide methods and higher-order functions. Some notable ones are as follows:
- list1 ::: list2: This is used to append the two lists
- list.reverse: This is used to return a list in reverse order
- list.mkString(string): This is used to concatenate the list elements using a string in between them
- list.map(function): This is used to return a new list with a function applied to each element
- list.filter(predicate): This is used to return a list with elements for which the predicate is true
- list.sortWith(comparisonFunction): This is used to return a sorted list using a two-parameter comparison function
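The operations listed above can be tried on a small list. This is an illustrative sketch; the sample values and the object name ListExample are assumptions, not part of the original text.

```scala
object ListExample {
  val list1 = List(3, 1, 2)
  val list2 = List(5, 4)

  val combined = list1 ::: list2                     // List(3, 1, 2, 5, 4)
  val reversed = combined.reverse                    // List(4, 5, 2, 1, 3)
  val joined   = combined.mkString(", ")             // "3, 1, 2, 5, 4"
  val doubled  = combined.map(x => x * 2)            // List(6, 2, 4, 10, 8)
  val evens    = combined.filter(x => x % 2 == 0)    // List(2, 4)
  val sorted   = combined.sortWith((a, b) => a < b)  // List(1, 2, 3, 4, 5)

  def main(args: Array[String]): Unit = {
    println(joined)
    println(sorted)
  }
}
```

Because lists are immutable, each call returns a new list and leaves combined unchanged.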
Understanding the higher-order functions of Scala lists is very beneficial for developing in Scalding. In the next chapter, we will see that Scalding provides functions with the same names and similar behavior that work on pipes containing tuples.
For example, the Scala function flatMap applies a function that returns a collection to each element and then flattens the results, which removes one level of nesting. The function of the same name in Scalding also removes one level of nesting by iterating through a collection to generate new rows.
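To make the difference between map and flatMap concrete, here is a small sketch on plain Scala lists (not Scalding pipes). The sample strings and the object name FlatMapExample are assumptions made for illustration.

```scala
object FlatMapExample {
  val lines = List("hello world", "scala rocks")

  // map keeps one sublist per input element
  val nested: List[List[String]] =
    lines.map(line => line.split(" ").toList)

  // flatMap applies the same function and flattens the result by one level
  val words: List[String] =
    lines.flatMap(line => line.split(" ").toList)

  def main(args: Array[String]): Unit = {
    println(nested)
    println(words)
  }
}
```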
Another interesting Scala function is groupBy, which returns a Map of key → values, where the keys are the results of applying a function to each element of the list, and the values are a List of values so that applying the function to each value yields that key:
List("one", "two", "three").groupBy(x => x.length)
// gives Map(5 -> List(three), 3 -> List(one, two))
Tuples are containers that, unlike an array or a list, can hold objects of different types. A Scala tuple consists of 2 to 22 comma-separated objects enclosed in parentheses and is immutable. To access the nth value in a tuple t, we can use the notation t._n, where n is a literal integer in the range 1 (not 0!) to 22.
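A short sketch of tuple creation and access follows. The sample values and the object name TupleExample are illustrative assumptions.

```scala
object TupleExample {
  // A Tuple3[String, Int, Boolean]: elements may have different types
  val t = ("text", 42, true)

  val first: String   = t._1   // indexing starts at 1, not 0
  val second: Int     = t._2
  val third: Boolean  = t._3

  // Pattern matching unpacks all elements at once
  val (label, count, flag) = t

  def main(args: Array[String]): Unit =
    println(s"$label $count $flag")
}
```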
To avoid null, which causes many issues, Scala provides the Option type. Options are parameterized types. For example, one may have an Option[String] type with possible values Some(value) (where the value is of the correct type) or None (when no value has been found).
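The following sketch shows an Option[String] coming back from a Map lookup and two common ways of handling it. The map contents and the names OptionExample and describe are hypothetical.

```scala
object OptionExample {
  val capitals = Map("France" -> "Paris", "Greece" -> "Athens")

  // Map.get returns an Option instead of null
  val found: Option[String]   = capitals.get("Greece")    // Some("Athens")
  val missing: Option[String] = capitals.get("Atlantis")  // None

  // Pattern matching forces both cases to be handled
  def describe(capital: Option[String]): String = capital match {
    case Some(city) => s"capital is $city"
    case None       => "capital unknown"
  }

  // getOrElse supplies a default when the value is absent
  val fallback: String = missing.getOrElse("n/a")

  def main(args: Array[String]): Unit = {
    println(describe(found))
    println(describe(missing))
  }
}
```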
Methods in Scala are public by default and can have private or protected access similar to Java. The syntax is:
def methodName(arg1: type, argN: type) { body } // returns Unit
def methodName(arg1: type, ..., argN: type): returnType = { body }
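A concrete sketch of the syntax above, with hypothetical method names, might look as follows. It declares the Unit return type explicitly with ": Unit =", because newer Scala versions discourage the brace-only form without an equals sign.

```scala
object MethodExample {
  // Public by default; returns Unit because it only has a side effect
  def printSum(a: Int, b: Int): Unit = {
    println(a + b)
  }

  // Explicit return type; the last expression is the result
  def sum(a: Int, b: Int): Int = {
    a + b
  }

  // private restricts access to this object, as in Java
  private def double(x: Int): Int = x * 2

  def sumDoubled(a: Int, b: Int): Int = double(sum(a, b))

  def main(args: Array[String]): Unit =
    printSum(2, 3)
}
```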
Another aspect of Scala is function literals. A function literal (also called an anonymous function) is an alternative syntax for defining a function. It is useful for defining one-liners and for passing a function as an argument to a method. The syntax is (arg1: Type1, ..., argN: TypeN) => expression. Thus, when implementing the function in string.map(function), we can avoid defining an external function by using the following:
"aBcDeF".map(x => x toLower) // or for a single parameter, just _ "aBcDeF".map(_.toLower)