
Chapter 3. Parallelization Using Reducers

Reducers are another way of looking at collections in Clojure. In this chapter, we will study this particular abstraction of collections, and how it is quite orthogonal to viewing collections as sequences. The motivation behind reducers is to increase the performance of computations over collections. This performance gain is achieved mainly through parallelization of such computations.

As we have seen in Chapter 1, Working with Sequences and Patterns, sequences and laziness are a great way to handle collections, and the Clojure standard library provides several functions to handle and manipulate sequences. However, abstracting a collection as a sequence has an unfortunate consequence: any computation performed over all the elements of a sequence is inherently sequential. Also, all of the standard sequence functions create a new collection that is similar to the one passed to them. Yet it is often useful to perform a computation over a collection without creating a similar collection, even as an intermediate result. For example, we frequently need to reduce a given collection to a single value through a series of transformations applied iteratively. This sort of computation does not necessarily require the intermediate results of each transformation to be saved.
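
As a rough illustration (the collection and transformations here are arbitrary, not taken from the text), consider the following sequence-based pipeline. Each call to filter and map realizes its own intermediate sequence, and the whole computation runs on a single thread even though only the final sum is needed:

    ;; filter and map each produce an intermediate (lazy) sequence,
    ;; and the elements are processed strictly one after another.
    (reduce + (map inc (filter even? (range 100000))))
    ;; => 2500000000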

A consequence of iteratively computing values from a collection is that we cannot parallelize it in a straightforward way. Modern MapReduce frameworks handle this kind of computation by pipelining the elements of a collection through several transformations in parallel, and finally reducing the results to a single value. Of course, that value could just as well be a new collection. A drawback of this methodology is that it produces concrete collections as the intermediate result of each transformation, which is rather wasteful. For example, if we wanted to filter out values from a collection, the MapReduce strategy would require creating empty collections to represent the values that were filtered out, and these still have to be handled by the reduction step that produces the final result.

This incurs unnecessary memory allocation and also creates additional work for the reduction step that produces the final result. Hence, there is scope for optimizing these sorts of computations.
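
To make this concrete, here is a hypothetical sketch (the function name filter-as-map-step and the data are invented for illustration) of a filter expressed in MapReduce style: the map step emits an empty collection for every rejected element, and the reduce step still has to walk over and concatenate all of those empty collections:

    ;; Map step: emit a one-element collection for values we keep,
    ;; and an empty collection for values we drop.
    (defn filter-as-map-step [pred coll]
      (map (fn [x] (if (pred x) [x] [])) coll))

    ;; Reduce step: concatenate all the intermediate collections,
    ;; including the empty ones produced for rejected values.
    (reduce into [] (filter-as-map-step even? (range 10)))
    ;; => [0 2 4 6 8]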

This brings us to the notion of treating computations over collections as reducers to attain better performance. Of course, this doesn't mean that reducers are a replacement for sequences. Sequences and laziness are great for abstracting computations that create and manipulate collections, whereas reducers are a specialized, high-performance abstraction for cases in which a collection must be piped through several transformations and finally combined to produce a result. Reducers achieve this performance gain in the following ways:

  • Reducing the amount of memory allocated to produce the desired result
  • Parallelizing the process of reducing a collection into a single result, which could be an entirely new collection

The clojure.core.reducers namespace provides several functions for processing collections using reducers. Let's now examine how reducers are implemented, along with a few examples that demonstrate how they can be used.
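
As a small preview of the API (the collection and transformations here are arbitrary), the reducer-based counterpart of a filter-map-sum pipeline might look like the following. The r/filter and r/map calls do not realize intermediate collections, and r/fold can reduce the underlying vector in parallel:

    (require '[clojure.core.reducers :as r])

    ;; r/filter and r/map only describe the transformations; no
    ;; intermediate collections are created. r/fold performs the
    ;; reduction in parallel over the vector, combining the
    ;; partial sums with +.
    (r/fold + (r/map inc (r/filter even? (vec (range 100000)))))
    ;; => 2500000000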
