The simultaneous execution of several computations is termed parallelism. Parallelism tends to improve the overall performance of a computation, since the work can be partitioned to execute on several cores or processors. Clojure has a few functions that can be used to parallelize a particular computation or task, and we will briefly examine them in this section.
Note
The following examples can be found in src/m_clj/c2/parallel.clj of the book's source code.
Suppose we have a function that pauses the current thread for some time and then returns a computed value, as depicted in Example 2.17:
(defn square-slowly [x]
  (Thread/sleep 2000)
  (* x x))
Example 2.17: A function that pauses the current thread
The function square-slowly in Example 2.17 requires a single argument x. This function pauses the current thread for two seconds and returns the square of its argument x. If the function square-slowly is invoked over a collection of three values using the map function, it takes three times as long to complete, as shown here:
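A minimal sketch of such a call, assuming the definition from Example 2.17; the name squares is introduced here only for illustration, and the elapsed time printed by time will vary slightly from run to run:

```clojure
;; square-slowly as defined in Example 2.17
(defn square-slowly [x]
  (Thread/sleep 2000)
  (* x x))

;; doall forces the lazy sequence returned by map, so time measures
;; all three sequential calls (roughly six seconds in total).
(def squares
  (time (doall (map square-slowly (repeat 3 10)))))

;; squares is now (100 100 100)
```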
The previously shown map form returns a lazy sequence, and hence the doall form is required to realize the value returned by the map form. We could also use the dorun form to perform this realization of a lazy sequence. The entire expression is evaluated in about six seconds, which is thrice the time taken by the square-slowly function to complete. We can parallelize the application of the square-slowly function using the pmap function instead of map, as shown here:
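The parallel variant can be sketched as follows, again assuming the definition from Example 2.17. The name parallel-squares is hypothetical, and the exact elapsed time depends on the machine:

```clojure
;; square-slowly as defined in Example 2.17
(defn square-slowly [x]
  (Thread/sleep 2000)
  (* x x))

;; pmap applies square-slowly to the three elements on separate threads,
;; so the three two-second pauses overlap instead of adding up.
(def parallel-squares
  (time (doall (pmap square-slowly (repeat 3 10)))))

;; parallel-squares is now (100 100 100)
```

Outside a REPL, a script that uses pmap may keep the JVM alive for a while after it finishes unless (shutdown-agents) is called at the end.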
The entire expression now evaluates in the same amount of time required for a single call to the square-slowly function. This is due to the square-slowly function being called in parallel over the supplied collection by the pmap form. Thus, the pmap form has the same semantics as that of the map form, except that it applies the supplied function in parallel.
The pvalues and pcalls forms can also be used to parallelize computations. The pvalues form evaluates the expressions passed to it in parallel, and returns a lazy sequence of the resulting values. Similarly, the pcalls form invokes all functions passed to it, which must take no arguments, in parallel and returns a lazy sequence of the values returned by these functions:
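A short sketch of both forms, assuming the definition from Example 2.17; the names from-pvalues and from-pcalls are introduced here for illustration only:

```clojure
;; square-slowly as defined in Example 2.17
(defn square-slowly [x]
  (Thread/sleep 2000)
  (* x x))

;; pvalues takes expressions and evaluates them in parallel.
(def from-pvalues
  (time (doall (pvalues (square-slowly 10)
                        (square-slowly 10)
                        (square-slowly 10)))))

;; pcalls takes no-argument functions and invokes them in parallel.
(def from-pcalls
  (time (doall (pcalls #(square-slowly 10)
                       #(square-slowly 10)
                       #(square-slowly 10)))))

;; Both from-pvalues and from-pcalls are now (100 100 100)
```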
As shown in the preceding output, both expressions that use the pvalues and pcalls forms take the same amount of time to evaluate as a single call to the square-slowly function.
Note
The pmap, pvalues, and pcalls forms all return lazy sequences that have to be realized using the doall or dorun form.
Controlling parallelism with thread pools
The pmap form schedules parallel execution of the supplied function on the default threadpool. If we wish to configure or tweak the threadpool used by pmap, the claypoole library (https://github.com/TheClimateCorporation/claypoole) is a good option. This library provides an implementation of the pmap form that must be passed a configurable threadpool. We will now demonstrate how we can use this library to parallelize a given function.
Note
The following library dependencies are required for the upcoming examples:
[com.climate/claypoole "1.0.0"]
Also, the following namespaces must be included in your namespace declaration:
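A namespace declaration along these lines should work. The namespace name my-namespace and the cpl alias are assumptions made for this sketch; the cp alias matches the one used in the REPL examples of this section:

```clojure
(ns my-namespace
  (:require [com.climate.claypoole :as cp]        ; threadpool-based pmap and friends
            [com.climate.claypoole.lazy :as cpl])) ; lazy variants of the same functions
```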
The pmap function from the com.climate.claypoole namespace is essentially a variant of the standard pmap function to which we supply a threadpool instance to be used in parallelizing a given function. We can also supply the number of threads to be used by this variant of the pmap function in order to parallelize a given function, as shown here:
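As a rough sketch, passing the number 2 as the first argument asks claypoole to use two threads. This assumes the claypoole dependency is on the classpath and that square-slowly is defined as in Example 2.17; the name squares is hypothetical:

```clojure
(require '[com.climate.claypoole :as cp])

;; square-slowly as defined in Example 2.17
(defn square-slowly [x]
  (Thread/sleep 2000)
  (* x x))

;; A number in place of a threadpool limits the work to that many threads.
(def squares
  (time (doall (cp/pmap 2 square-slowly [10 10 10]))))

;; squares is now (100 100 100)
```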
As previously shown, the pmap function from the claypoole library can be used to parallelize the square-slowly function that we defined earlier in Example 2.17 over a collection of three values. With only two threads available, the three elements are processed in two batches: the first two elements are computed in parallel on separate threads, and the third is computed once a thread becomes free. Since the square-slowly function takes two seconds to complete, the total time taken to process the collection of three elements is around four seconds.
We can create an instance of a pool of threads using the threadpool function from the claypoole library. This threadpool instance can then be passed to the pmap function from the claypoole library. The com.climate.claypoole namespace also provides the ncpus function, which returns the number of processors available to the current JVM process. We can create a threadpool instance and pass it to this variant of the pmap function as shown here:
Assuming that we are running the preceding code on a computer system that reports two available processors, the call to the threadpool function shown previously will create a threadpool of two threads. This threadpool instance can then be passed to the pmap function as shown in the preceding example.
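One way this might look, assuming the claypoole dependency is available; the names cpu-pool and squares are hypothetical, and the elapsed time depends on how many processors ncpus reports:

```clojure
(require '[com.climate.claypoole :as cp])

;; square-slowly as defined in Example 2.17
(defn square-slowly [x]
  (Thread/sleep 2000)
  (* x x))

;; A pool with one thread per processor reported by the JVM.
(def cpu-pool (cp/threadpool (cp/ncpus)))

;; The pool is passed as the first argument of claypoole's pmap.
(def squares
  (time (doall (cp/pmap cpu-pool square-slowly [10 10 10]))))

;; squares is now (100 100 100)
```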
Note
We can fall back to the standard behavior of the pmap function by passing the :builtin keyword as the first argument to the com.climate.claypoole/pmap function. Similarly, if the keyword :serial is passed as the first argument to the claypoole version of the pmap function, the function behaves like the standard map function.
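Both keywords can be tried with a sketch like the following, which assumes square-slowly from Example 2.17; the names builtin-result and serial-result are hypothetical:

```clojure
(require '[com.climate.claypoole :as cp])

;; square-slowly as defined in Example 2.17
(defn square-slowly [x]
  (Thread/sleep 2000)
  (* x x))

;; :builtin delegates to the parallelism of the standard pmap.
(def builtin-result
  (time (doall (cp/pmap :builtin square-slowly [10 10 10]))))

;; :serial runs everything on the calling thread, like map.
(def serial-result
  (time (doall (cp/pmap :serial square-slowly [10 10 10]))))

;; Both results are (100 100 100); only the elapsed times differ.
```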
The threadpool function also supports a few useful key options. Firstly, we can create a pool of non-daemon threads using the :daemon false optional argument. Daemon threads do not prevent the JVM from exiting and are terminated when the process exits; all threadpools created by the threadpool function are pools of daemon threads by default. We can also name a threadpool using the :name key option of the threadpool function. The :thread-priority key option can be used to indicate the priority of the threads in the new threadpool.
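The three options can be combined in a single call. In the following sketch the pool size, the priority value, and the names named-pool and "square-pool" are arbitrary choices made for illustration:

```clojure
(require '[com.climate.claypoole :as cp])

;; A pool of four non-daemon threads with an explicit name and priority.
(def named-pool
  (cp/threadpool 4
                 :daemon false
                 :thread-priority 3
                 :name "square-pool"))

(def doubled
  (doall (cp/pmap named-pool #(* 2 %) [1 2 3])))

;; Non-daemon threads keep the JVM alive, so the pool must be shut down
;; explicitly once it is no longer needed.
(cp/shutdown named-pool)
```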
Tasks can also be prioritized using the pmap, priority-threadpool, and with-priority forms from the claypoole library. A priority threadpool is created using the priority-threadpool function, and this new threadpool can be used along with the with-priority function to assign a priority to a task that must be parallelized using pmap, as shown here:
user> (def pool (cp/priority-threadpool (cp/ncpus)))
#'user/pool
user> (def task-1 (cp/pmap (cp/with-priority pool 1000) square-slowly [10 10 10]))
#'user/task-1
user> (def task-2 (cp/pmap (cp/with-priority pool 0) square-slowly [5 5 5]))
#'user/task-2
Tasks with higher priority are assigned to threads first. Hence, the task represented by task-1 will be assigned to a thread of execution before the task represented by task-2 in the previous output.
To gracefully deallocate a given threadpool, we can call the shutdown function from the com.climate.claypoole namespace, which accepts a threadpool instance as its only argument. The shutdown! function from the same namespace will forcibly shut down the threads in a threadpool. The shutdown! function can also be called through the with-shutdown! macro. We specify the threadpools to be used for a series of computations as a vector of bindings to the with-shutdown! macro. This macro will implicitly call the shutdown! function on all of the threadpools that it has created once all the computations in its body are completed. For example, we can define a function that creates a threadpool, uses it for a computation, and finally shuts down the threadpool, using the with-shutdown! macro as shown in Example 2.18:
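A sketch of such a function, written to match the description of square-slowly-with-pool in this section; the parameter name v and the pool size are assumptions:

```clojure
(require '[com.climate.claypoole :as cp])

;; square-slowly as defined in Example 2.17
(defn square-slowly [x]
  (Thread/sleep 2000)
  (* x x))

(defn square-slowly-with-pool [v]
  ;; The pool exists only for the duration of the body and is shut down
  ;; when the body exits.
  (cp/with-shutdown! [pool (cp/threadpool (cp/ncpus))]
    ;; doall realizes every result before the pool is shut down.
    (doall (cp/pmap pool square-slowly v))))
```

Calling (square-slowly-with-pool [1 2 3]) should return (1 4 9). The doall matters here: without it, the lazy sequence could be consumed after the pool has already been shut down.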
The square-slowly-with-pool function defined in Example 2.18 will create a new threadpool, represented by pool, and then use it to call the pmap function. The shutdown! function is implicitly called once the doall form completely evaluates the lazy sequence returned by the pmap function.
The claypoole library also supports unordered parallelism, in which results of individual threads of computation are used as soon as they are available in order to minimize latency. The com.climate.claypoole/upmap function is an unordered parallel version of the pmap function.
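The difference is easiest to see with tasks of unequal duration. In this sketch, sleep-then-return is a hypothetical helper and the delays are arbitrary; the results are returned in completion order, so their order is not guaranteed to match the input:

```clojure
(require '[com.climate.claypoole :as cp])

;; Hypothetical helper: pauses for ms milliseconds, then returns ms.
(defn sleep-then-return [ms]
  (Thread/sleep ms)
  ms)

;; With three threads, all three tasks start together, so the shortest
;; task will usually appear first in the result.
(def unordered
  (doall (cp/upmap 3 sleep-then-return [300 100 200])))

;; The ordered variant always follows the input order.
(def ordered
  (doall (cp/pmap 3 sleep-then-return [300 100 200])))
```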
The com.climate.claypoole namespace also provides several other functions that use threadpools, as described here:
The com.climate.claypoole/pvalues function is a threadpool-based implementation of the pvalues function. It will evaluate its arguments in parallel using a supplied threadpool and return a lazy sequence.
The com.climate.claypoole/pcalls function is a threadpool-based version of the pcalls function, which invokes several no-argument functions to return a lazy sequence.
A future that uses a given threadpool can be created using the com.climate.claypoole/future function.
We can evaluate an expression in a parallel fashion over the items in a given collection using the com.climate.claypoole/pfor function.
The upvalues, upcalls, and upfor functions in the com.climate.claypoole namespace are unordered parallel versions of the pvalues, pcalls, and pfor functions, respectively, from the same namespace.
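The ordered functions listed above can be exercised together in one sketch. It assumes square-slowly from Example 2.17; the name results and the pool size of four are arbitrary:

```clojure
(require '[com.climate.claypoole :as cp])

;; square-slowly as defined in Example 2.17
(defn square-slowly [x]
  (Thread/sleep 2000)
  (* x x))

(def results
  (cp/with-shutdown! [pool (cp/threadpool 4)]
    {;; A future that runs on the given pool; deref blocks until it is done.
     :future  @(cp/future pool (square-slowly 2))
     ;; Expressions evaluated in parallel on the pool.
     :pvalues (doall (cp/pvalues pool (square-slowly 3) (square-slowly 4)))
     ;; No-argument functions invoked in parallel on the pool.
     :pcalls  (doall (cp/pcalls pool #(square-slowly 5) #(square-slowly 6)))
     ;; A for-like comprehension whose body runs in parallel.
     :pfor    (doall (cp/pfor pool [x [7 8]] (square-slowly x)))}))

;; results is {:future 4, :pvalues (9 16), :pcalls (25 36), :pfor (49 64)}
```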
The pmap function from the com.climate.claypoole namespace eagerly consumes the collection it is supplied, submitting every element to the threadpool as soon as it is called. This is undesirable when we intend to call pmap over an infinite sequence. The com.climate.claypoole.lazy namespace provides versions of pmap and other functions from the com.climate.claypoole namespace that preserve the laziness of a supplied collection. The lazy version of the pmap function can be demonstrated as follows:
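A minimal sketch, assuming square-slowly from Example 2.17; the cpl alias, the pool size, and the names lazy-pool and first-four are choices made for this illustration:

```clojure
(require '[com.climate.claypoole :as cp]
         '[com.climate.claypoole.lazy :as cpl])

;; square-slowly as defined in Example 2.17
(defn square-slowly [x]
  (Thread/sleep 2000)
  (* x x))

(def lazy-pool (cp/threadpool 4))

;; Returns immediately even though (range) is infinite.
(def lazy-pmap (cpl/pmap lazy-pool square-slowly (range)))

;; Only the elements that are requested (plus a small read-ahead)
;; are computed.
(def first-four
  (time (doall (take 4 lazy-pmap))))

;; first-four is now (0 1 4 9)
```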
The previously defined lazy-pmap sequence is a lazy sequence created by mapping the square-slowly function over the infinite sequence (range). As shown previously, the call to the pmap function returns immediately, and the first four elements of the resulting lazy sequence are realized in parallel using the doall and take functions.
To summarize, Clojure has the pmap, pvalues, and pcalls primitives to deal with parallel computations. If we intend to control the amount of parallelism utilized by these functions, we can use the claypoole library's implementations of these primitives. The claypoole library also supports other useful features such as prioritized threadpools and unordered parallelism.