官术网_书友最值得收藏!

  • Mastering Elixir
  • André Albuquerque Daniel Caixinha
  • 829字
  • 2021-08-05 10:42:48

Lazy processing with the stream module

We will now talk about a different way of processing collections, which, as functional programming, may require a shift in your mindset. Before talking about lazy processing, let's enumerate some of the shortcomings of working with the Enum module. The Enum module is referred to as being eager. This means that when processing a collection, this module will load the entire collection into memory. Furthermore, if you have a chain of functions you want to apply to a collection, the Enum module will iterate through your collection as many times as the functions are applying to it. Let's examine this further with an example:

iex> [1, 2, 3, 4, 5] \
...> |> Enum.map(&(&1 + 10)) \
...> |> Enum.zip(["a", "b", "c", "d", "e"])
[{11, "a"}, {12, "b"}, {13, "c"}, {14, "d"}, {15, "e"}]
The \ on the end of the first two lines is to stop our Elixir console from evaluating this line right away, and wait for a new line instead. This way, we can write these operations with the pipe operator on multiple lines, which makes them more readable.

We take our initial collection and iterate it to add 10 to each element inside it. This generates a new list, which is passed to our next function. This function will zip the two lists together, which will produce a new list, which is returned to us. In this simple example, we need to traverse our list twice to build the desired result.

This is where the Stream module, and lazy processing, becomes advantageous. When working with lazy enumerables, the entire collection never gets loaded into memory, and contrary to what we're accustomed to, the computations aren't made right away. The results are produced as they are needed. Let's see this same example with the Stream module:

iex> [1, 2, 3, 4, 5] \
...> |> Stream.map(&(&1 + 1)) \
...> |> Stream.zip(["a", "b", "c", "d", "e"])
#Function<66.40091930/2 in Stream.zip/1>

As you can see, we're not getting our final list back. When we feed our list to Stream.map, the list is not iterated. Instead, the functions that will be applied on it are saved into a structure (along with the collection we're working on). We can then pass this structure into the next function, which will further save a new function to be applied to our list. This is really cool! But how do we make it return the result we're expecting? Just treat it as a regular (eager) enumerable, by applying a function from the Enum module, and it will start to produce results.

To exemplify this, we'll use the Enum.take/2 function, which allows us to take a given number of items from an enumerable:

iex> [1, 2, 3, 4, 5] \
...> |> Stream.map(&(&1 + 10)) \
...> |> Stream.zip(["a", "b", "c", "d", "e"]) \
...> |> Enum.take(1)
[{11, "a"}]

As you can see, we're now getting the expected result back. Note that this is not a result of applying our computation to all the list and then just taking the first element. We've essentially only computed results for the first element, as that's all that was necessary. If you wanted to have the full list in the end, you could use the Enum.to_list/1 function.

Streams are a really nimble way to process large, or even infinite, collections. Imagine that you're parsing values from a huge CSV file, and then running some functions on them. If you're running your application on the cloud, as most of us are these days, you probably have a short amount of memory. Using lazy processing, you can avoid having to load the whole file, processing it line by line. If you're processing an infinite collection, such as an RSS feed, lazy processing is also a great solution, as you can process each element of the collection incrementally, as they arrive.

Note that while the Stream module is amazing, it will not replace your usage of the Enum module. It's certainly great for very large collections, or even if you have a big chain of functions being applied to a collection and only want to traverse it once. However, for small or even medium collections, the Stream module will perform worse, as you're adding a lot of overhead, for instance, by having to save the functions you'll apply instead of applying them right away. Always analyze your situation carefully and take this into account when choosing to use the Enum or the Stream module for a given task.

We'll be using functions from the Stream module in the application we'll build in this book. You'll learn more about the Stream module in Chapter 4, Powered by Erlang/OTP.

Elixir provides some functions that wrap most of the complex parts of building streams. If you want to build your own lazy stream, check out these functions from the Stream module: cycle, repeatedly, iterate, unfold, and resource. The full documentation for the Stream can be found at https://hexdocs.pm/elixir/Stream.html.
主站蜘蛛池模板: 白沙| 新河县| 潞城市| 阿合奇县| 疏附县| 西畴县| 陵川县| 图木舒克市| 无锡市| 保定市| 贵阳市| 云阳县| 抚松县| 台南县| 曲靖市| 长岭县| 蓬溪县| 岳普湖县| 县级市| 双峰县| 灵寿县| 邹平县| 南岸区| 嘉鱼县| 麻江县| 隆德县| 开鲁县| 萍乡市| 修武县| 天气| 巩留县| 含山县| 将乐县| 金华市| 海盐县| 通辽市| 那坡县| 建平县| 云霄县| 大同市| 广德县|