- Mastering Elixir
- André Albuquerque Daniel Caixinha
- 2021-08-05 10:42:48
Lazy processing with the Stream module
We will now look at a different way of processing collections, one that, like functional programming itself, may require a shift in your mindset. Before talking about lazy processing, let's enumerate some of the shortcomings of working with the Enum module. The Enum module is described as eager: each of its functions processes the whole collection as soon as it is called and builds the complete result in memory. Furthermore, if you apply a chain of functions to a collection, the Enum module will traverse the collection once for every function in the chain. Let's examine this further with an example:
iex> [1, 2, 3, 4, 5] \
...> |> Enum.map(&(&1 + 10)) \
...> |> Enum.zip(["a", "b", "c", "d", "e"])
[{11, "a"}, {12, "b"}, {13, "c"}, {14, "d"}, {15, "e"}]
We take our initial collection and iterate it to add 10 to each element inside it. This generates a new list, which is passed to our next function. This function will zip the two lists together, which will produce a new list, which is returned to us. In this simple example, we need to traverse our list twice to build the desired result.
This is where the Stream module, and lazy processing, become advantageous. When working with lazy enumerables, the entire collection never has to be loaded into memory at once, and, contrary to what we're accustomed to, the computations aren't performed right away. The results are produced only as they are needed. Let's see the same example with the Stream module:
iex> [1, 2, 3, 4, 5] \
...> |> Stream.map(&(&1 + 10)) \
...> |> Stream.zip(["a", "b", "c", "d", "e"])
#Function<66.40091930/2 in Stream.zip/1>
As you can see, we're not getting our final list back. When we feed our list to Stream.map, the list is not iterated. Instead, the functions that will be applied to it are saved into a structure (along with the collection we're working on). We can then pass this structure into the next function, which will in turn save a new function to be applied to our list. This is really cool! But how do we make it return the result we're expecting? Just treat it as a regular (eager) enumerable by applying a function from the Enum module, and it will start to produce results.
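As a minimal sketch of this idea (the odd-number filter is an illustrative addition, not part of the example above), the pipeline below only builds a lazy description until Enum.to_list/1 forces it. Matching on %Stream{} relies on how Stream.map/2 and Stream.filter/2 currently represent their result, so treat it as an illustration rather than something to depend on:

```elixir
# Build a lazy pipeline: no element of the list is touched yet.
stream =
  [1, 2, 3, 4, 5]
  |> Stream.map(&(&1 + 10))
  |> Stream.filter(&(rem(&1, 2) == 1))

# The value is a stream structure describing the work, not a list.
%Stream{} = stream

# Passing it to an Enum function runs the saved functions in a single pass.
result = Enum.to_list(stream)
IO.inspect(result)
```

Here both the map and the filter run during one traversal of the original list, when Enum.to_list/1 asks for the elements.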
To exemplify this, we'll use the Enum.take/2 function, which allows us to take a given number of items from an enumerable:
iex> [1, 2, 3, 4, 5] \
...> |> Stream.map(&(&1 + 10)) \
...> |> Stream.zip(["a", "b", "c", "d", "e"]) \
...> |> Enum.take(1)
[{11, "a"}]
As you can see, we're now getting the expected result back. Note that this is not the result of applying our computation to the whole list and then taking just the first element. We've essentially only computed the result for the first element, as that's all that was necessary. If you wanted the full list in the end, you could use the Enum.to_list/1 function instead.
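One way to check this for yourself is to count how many times the mapping function runs. The sketch below uses Erlang's :counters module purely as an illustrative probe (the names calls, calls_before, and calls_after are assumptions made for this example):

```elixir
# A single-slot counter used to record how often the mapping function is invoked.
calls = :counters.new(1, [])

stream =
  [1, 2, 3, 4, 5]
  |> Stream.map(fn x ->
    :counters.add(calls, 1, 1)
    x + 10
  end)
  |> Stream.zip(["a", "b", "c", "d", "e"])

# Building the stream has not invoked the function at all.
calls_before = :counters.get(calls, 1)

# Asking for one element only pulls one element through the pipeline.
first = Enum.take(stream, 1)
calls_after = :counters.get(calls, 1)

IO.inspect({calls_before, first, calls_after})
```

With a list as the source, the counter should go from 0 to 1, which shows that the remaining four elements were never mapped.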
Streams are a really nimble way to process large, or even infinite, collections. Imagine that you're parsing values from a huge CSV file, and then running some functions on them. If you're running your application in the cloud, as most of us are these days, you probably have a limited amount of memory. Using lazy processing, you can avoid having to load the whole file, and process it line by line instead. If you're processing an infinite collection, such as an RSS feed, lazy processing is also a great solution, as you can process each element of the collection incrementally, as it arrives.
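The following sketch illustrates both cases under some assumptions: the file name, its contents, and the naive comma split are hypothetical stand-ins for a real CSV file and a proper CSV parser, and Stream.iterate/2 stands in for a source that never ends.

```elixir
# Write a tiny, made-up CSV file so the example is self-contained.
path = Path.join(System.tmp_dir!(), "lazy_example.csv")
File.write!(path, "name,amount\nalice,10\nbob,20\ncarol,30\n")

total =
  path
  # File.stream!/1 yields the file one line at a time instead of reading it whole.
  |> File.stream!()
  # Skip the header row.
  |> Stream.drop(1)
  |> Stream.map(&String.trim/1)
  # Naive parsing: assumes no quoted fields or embedded commas.
  |> Stream.map(&String.split(&1, ","))
  |> Stream.map(fn [_name, amount] -> String.to_integer(amount) end)
  |> Enum.sum()

File.rm!(path)

# An infinite stream of powers of two; only the elements we take are computed.
first_five =
  Stream.iterate(1, &(&1 * 2))
  |> Stream.map(&(&1 + 1))
  |> Enum.take(5)

IO.inspect({total, first_five})
```

In the file pipeline, only one line at a time has to be held while the sum is accumulated, and in the second pipeline Enum.take/2 stops the otherwise endless iteration after five elements.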
Note that while the Stream module is amazing, it will not replace your usage of the Enum module. It's certainly great for very large collections, or when you have a long chain of functions being applied to a collection and only want to traverse it once. However, for small or even medium collections, the Stream module will usually perform worse, as you're adding overhead, for instance, by having to save the functions you'll apply instead of applying them right away. Always analyze your situation carefully and take this into account when choosing between the Enum and the Stream module for a given task.
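If you want a rough feel for this trade-off on your own machine, a quick comparison with :timer.tc/1 can serve as a starting point. This is only a sketch: the collection size and the pipeline are arbitrary choices, single runs are noisy, and a dedicated benchmarking tool would give more trustworthy numbers.

```elixir
# A small collection, where lazy evaluation is unlikely to pay off.
list = Enum.to_list(1..1_000)

# Eager version: each Enum call traverses a list and builds a new one.
{eager_us, eager} =
  :timer.tc(fn ->
    list
    |> Enum.map(&(&1 + 10))
    |> Enum.filter(&(rem(&1, 2) == 0))
    |> Enum.sum()
  end)

# Lazy version: one traversal, but with the bookkeeping of the stream.
{lazy_us, lazy} =
  :timer.tc(fn ->
    list
    |> Stream.map(&(&1 + 10))
    |> Stream.filter(&(rem(&1, 2) == 0))
    |> Enum.sum()
  end)

# Timings are in microseconds and will vary from run to run.
IO.puts("eager: #{eager_us}us, lazy: #{lazy_us}us, same result: #{eager == lazy}")
```

Both pipelines compute the same value, so the choice between them comes down to memory use and speed for your particular data.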
We'll be using functions from the Stream module in the application we'll build in this book. You'll learn more about the Stream module in Chapter 4, Powered by Erlang/OTP.