
Iterators

You previously learned about several iterable objects, such as lists, sets, and tuples. In Python, an object is considered iterable if it defines an __iter__ method or if its elements can be accessed sequentially by index. Lists, sets, and tuples all let us iterate through their contents simply and efficiently. For this reason, we often use these data types when iterating through the lines in a file or the entries in a directory listing, or when trying to identify a file based on a series of file signatures.
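The difference between an iterable and an iterator can be checked interactively. The following is a minimal sketch, assuming Python 3 and using only built-ins; the variable names are illustrative:

```python
# Lists, sets, and tuples are iterables: each defines __iter__.
containers = [[1, 2, 3], {1, 2, 3}, (1, 2, 3)]
has_iter = [hasattr(c, '__iter__') for c in containers]
print(has_iter)  # [True, True, True]

# None of them is itself an iterator: they lack __next__.
has_next = [hasattr(c, '__next__') for c in containers]
print(has_next)  # [False, False, False]

# iter() asks an iterable for a fresh iterator, which does define __next__.
it = iter(containers[0])
print(hasattr(it, '__next__'))  # True
```

In other words, the container holds the data, while the iterator returned by iter() tracks the current position within it.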

An iterator allows us to step through data one element at a time, and each element is consumed as it is read: the iterator cannot be rewound, although the object it was created from is left unchanged. This may seem undesirable; however, when working with large datasets or on machines with limited resources, it is very useful, because only the element currently being processed needs to be held in memory. When stepping through every line of a 3 GB file, for example, the iterator feeds us one line at a time, which avoids massive memory consumption while still handling each line in order.
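As a rough illustration of the large-file case, a file object can be iterated directly, so the loop receives one line at a time instead of the whole file. This sketch assumes Python 3; the temporary file and its three lines are made-up stand-ins for a much larger input:

```python
import os
import tempfile

# Build a small stand-in for a large file (hypothetical content).
with tempfile.NamedTemporaryFile('w', suffix='.log', delete=False) as tmp:
    tmp.write('alpha\nbravo\ncharlie\n')
    path = tmp.name

line_count = 0
longest = ''
# A file object is its own iterator: each pass of the loop receives one
# line, so the full contents never need to be held in memory at once.
with open(path) as handle:
    for line in handle:
        line_count += 1
        stripped = line.rstrip('\n')
        if len(stripped) > len(longest):
            longest = stripped

os.remove(path)
print(line_count, longest)  # 3 charlie
```

Calling handle.readlines() would instead build a list of every line up front, which is the memory cost the iterator approach avoids.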

The following code block steps through the basic usage of iterators. We call the iter() function on an iterable to obtain an iterator, and then use the next() function on that iterator to retrieve the next element. Once an element has been retrieved with next(), it is no longer available from the iterator, as the cursor has moved past it. If we have reached the end of the iterator, any additional next() call raises a StopIteration exception. This exception allows us to gracefully exit loops that use an iterator and alerts us when there is no more content to read:

>>> y = iter([1, 2, 3])
>>> next(y)
1
>>> next(y)
2
>>> next(y)
3
>>> next(y)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

In Python 2.7, you can call the obj.next() method to get the same output that the next() function produces in the preceding example. For simplicity and uniformity, Python 3 renamed obj.next() to obj.__next__() and encourages the use of the next() function. We therefore recommend using next(y), as shown previously, in place of y.next() or y.__next__().
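The way StopIteration lets a loop exit gracefully can be sketched as follows. This is an illustrative example, assuming Python 3; the variable names are not from the original text:

```python
y = iter([1, 2, 3])
seen = []
while True:
    try:
        seen.append(next(y))
    except StopIteration:
        # The iterator is exhausted; leave the loop cleanly.
        break
print(seen)  # [1, 2, 3]

# next() also accepts a default that is returned instead of raising.
print(next(y, None))  # None

# A for loop performs the same StopIteration handling implicitly.
total = 0
for value in iter([1, 2, 3]):
    total += value
print(total)  # 6
```

In everyday code the for loop is usually preferable; the explicit try/except form is mainly useful when elements must be pulled at irregular points in a program.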

The reversed() built-in function can be used to create a reversed iterator. In the following example, we reverse a list and retrieve each successive object from the iterator using the next() function:

>>> j = reversed([7, 8, 9])
>>> next(j)
9
>>> next(j)
8
>>> next(j)
7
>>> next(j)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
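
Two properties of reversed() are worth checking in practice: it leaves the source list untouched, and it only accepts objects with a defined order. The following is a small sketch, assuming Python 3; the names original and set_ok are illustrative:

```python
original = [7, 8, 9]
j = reversed(original)

# Consuming the reversed iterator does not alter the source list.
print(list(j))    # [9, 8, 7]
print(original)   # [7, 8, 9]

# The iterator is now exhausted, so a second pass yields nothing.
print(list(j))    # []

# reversed() needs a sequence (or an object defining __reversed__);
# an unordered set is rejected with a TypeError.
try:
    reversed({7, 8, 9})
    set_ok = True
except TypeError:
    set_ok = False
print(set_ok)  # False
```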

By implementing generators, we can take further advantage of iterators. A generator is a special type of function that, when called, produces an iterator object. Generators are similar to the functions discussed in Chapter 1, Now for Something Completely Different; however, instead of returning a single object, they yield values one at a time. Generators are best used with large datasets that would otherwise consume vast quantities of memory, similar to the use case for iterators.
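The lazy behavior of a generator can be observed by recording when each value is actually produced. This is a minimal sketch, assuming Python 3; count_up() and the produced list are hypothetical names introduced only for illustration:

```python
produced = []  # records each value at the moment the generator creates it


def count_up(limit):
    """Yield the integers 0 .. limit-1, noting when each is produced."""
    for n in range(limit):
        produced.append(n)
        yield n


gen = count_up(3)
# Calling the function runs none of its body; it only builds a generator.
print(produced)   # []
print(next(gen))  # 0
print(produced)   # [0]

# Draining the generator produces the remaining values on demand.
rest = list(gen)
print(rest)       # [1, 2]
print(produced)   # [0, 1, 2]
```

Because each value is computed only when requested, a generator over a very large input never needs to hold more than the current element.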

The following code block shows the implementation of a generator. In the file_sigs() function, we create a list of tuples stored in the sigs variable. We then loop through each element in sigs and yield it as a tuple. Because the function contains a yield statement, calling it creates a generator, which allows us to use the next() function to retrieve each tuple individually and limits the generator's memory impact. See the following code:

>>> def file_sigs():
...     sigs = [('jpeg', 'FF D8 FF E0'),
...             ('png', '89 50 4E 47 0D 0A 1A 0A'),
...             ('gif', '47 49 46 38 37 61')]
...     for s in sigs:
...         yield s

>>> fs = file_sigs()
>>> next(fs)
('jpeg', 'FF D8 FF E0')
>>> next(fs)
('png', '89 50 4E 47 0D 0A 1A 0A')
>>> next(fs)
('gif', '47 49 46 38 37 61')

You can find additional file signatures at http://www.garykessler.net/library/file_sigs.html.
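
To connect the generator back to the earlier goal of identifying a file from a series of signatures, the following sketch compares each yielded signature against the leading bytes of some data. It assumes Python 3; identify() is a hypothetical helper that is not part of the original example, and the sample headers are hand-built for illustration rather than read from real files:

```python
def file_sigs():
    sigs = [('jpeg', 'FF D8 FF E0'),
            ('png', '89 50 4E 47 0D 0A 1A 0A'),
            ('gif', '47 49 46 38 37 61')]
    for s in sigs:
        yield s


def identify(header):
    """Return the first signature name whose bytes prefix header, else None."""
    for name, hex_sig in file_sigs():
        # bytes.fromhex() ignores the spaces between the hex pairs.
        if header.startswith(bytes.fromhex(hex_sig)):
            return name
    return None


png_header = b'\x89PNG\r\n\x1a\n' + b'\x00' * 8
print(identify(png_header))            # png
print(identify(b'GIF87a' + b'\x00'))   # gif
print(identify(b'plain text'))         # None
```

In practice, the header would typically come from reading the first few bytes of a file opened in binary mode.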