官术网_书友最值得收藏!

Comprehensions and generators

In this section, we will explore a few simple strategies to speed up Python loops using comprehension and generators. In Python, comprehension and generator expressions are fairly optimized operations and should be preferred in place of explicit for-loops. Another reason to use this construct is readability; even if the speedup over a standard loop is modest, the comprehension and generator syntax is more compact and (most of the times) more intuitive.

In the following example, we can see that both the list comprehension and generator expressions are faster than an explicit loop when combined with the sum function:

    def loop(): 
res = []
for i in range(100000):
res.append(i * i)
return sum(res)

def comprehension():
return sum([i * i for i in range(100000)])

def generator():
return sum(i * i for i in range(100000))

%timeit loop()
100 loops, best of 3: 16.1 ms per loop
%timeit comprehension()
100 loops, best of 3: 10.1 ms per loop
%timeit generator()
100 loops, best of 3: 12.4 ms per loop

Just like lists, it is possible to use dict comprehension to build dictionaries slightly more efficiently and compactly, as shown in the following code:

    def loop(): 
res = {}
for i in range(100000):
res[i] = i
return res

def comprehension():
return {i: i for i in range(100000)}
%timeit loop()
100 loops, best of 3: 13.2 ms per loop
%timeit comprehension()
100 loops, best of 3: 12.8 ms per loop

Efficient looping (especially in terms of memory) can be implemented using iterators and functions such as filter and map. As an example, consider the problem of applying a series of operations to a list using list comprehension and then taking the maximum value:

    def map_comprehension(numbers):
a = [n * 2 for n in numbers]
b = [n ** 2 for n in a]
c = [n ** 0.33 for n in b]
return max(c)

The problem with this approach is that for every list comprehension, we are allocating a new list, increasing memory usage. Instead of using list comprehension, we can employ generators. Generators are objects that, when iterated upon, compute a value on the fly and return the result.

For example, the map function takes two arguments--a function and an iterator--and returns a generator that applies the function to every element of the collection. The important point is that the operation happens only while we are iterating, and not when map is invoked!

We can rewrite the previous function using map and by creating intermediate generators, rather than lists, thus saving memory by computing the values on the fly:

    def map_normal(numbers):
a = map(lambda n: n * 2, numbers)
b = map(lambda n: n ** 2, a)
c = map(lambda n: n ** 0.33, b)
return max(c)

We can profile the memory of the two solutions using the memory_profiler extension from an IPython session. The extension provides a small utility, %memit, that will help us evaluate the memory usage of a Python statement in a way similar to %timeit, as illustrated in the following snippet:

    %load_ext memory_profiler
numbers = range(1000000)
%memit map_comprehension(numbers)
peak memory: 166.33 MiB, increment: 102.54 MiB
%memit map_normal(numbers)
peak memory: 71.04 MiB, increment: 0.00 MiB

As you can see, the memory used by the first version is 102.54 MiB, while the second version consumes 0.00 MiB! For the interested reader, more functions that return generators can be found in the itertools module, which provides a set of utilities designed to handle common iteration patterns.

主站蜘蛛池模板: 盘锦市| 罗源县| 门头沟区| 东乌珠穆沁旗| 烟台市| 长泰县| 扬中市| 西乌珠穆沁旗| 平遥县| 陇川县| 秭归县| 绥宁县| 辽宁省| 聊城市| 广宗县| 临朐县| 商南县| 南和县| 宁远县| 成都市| 昌平区| 阳信县| 闽清县| 漯河市| 惠州市| 万山特区| 兴城市| 城步| 苏尼特右旗| 湘潭县| 自贡市| 佛山市| 远安县| 尤溪县| 依安县| 农安县| 洱源县| 滕州市| 和硕县| 新乡市| 兰溪市|