Iterables, Iterators, and Generators

What is “iterable”?

According to the documentation, all sequence types, such as str, list, and tuple, and some non-sequence types, such as set and dict, are iterable.

for n in [1, 3, 5]:
    print(n)
# prints 1, 3, and 5, one per line

d = {'a': 232, 'b': 72}
for k in d:  # iterating over a dict yields its keys
    print(k, d[k])

In other words, for now, we can say that the data types that support the for loop statement are iterable.
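
One rough way to test this working definition, as a sketch: the built-in iter (explained in the next section) raises TypeError for an object that a for loop cannot walk over. The helper name is_iterable is hypothetical, chosen only for illustration.

```python
def is_iterable(obj):
    """Return True if iter() accepts obj, i.e. a for loop could consume it."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


samples = ['abc', [1, 3, 5], (1, 3), {1, 3}, {'a': 232}, 42]

# Map each sample's type name to whether it is iterable.
results = {type(x).__name__: is_iterable(x) for x in samples}
print(results)
```

Under this check, str, list, tuple, set, and dict come back as iterable, while the int does not.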

Inside the for loop statement

But how does a for loop work? In fact, looping over an object with for is equivalent to:

lst = [1, 3, 5, 7, 9]
it = iter(lst)  # 'it' is an iterator built from lst
while True:
    try:
        print(next(it))
    except StopIteration:
        del it
        break

1
3
5
7
9

Wait… what is iter and next?

When we invoke iter(lst), we actually call lst.__iter__(), which returns an iterator. Then, in the while statement, we call next(it), which is equivalent to it.__next__() and fetches the next element from the iterator. Note that next is applied to the iterator it, not to the list itself. When there are no more elements, the iterator raises a StopIteration exception.
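
A small check of the difference between the list and its iterator, as a minimal sketch with illustrative variable names: the list is only iterable, while the object returned by iter is what actually supports next.

```python
lst = [1, 3, 5]

# iter(lst) delegates to lst.__iter__() and returns a separate iterator object.
it = iter(lst)
first = next(it)        # same as it.__next__()
second = it.__next__()  # same as next(it)

# An iterator returns itself from __iter__.
same_object = iter(it) is it

# A list is iterable but is not itself an iterator, so next(lst) fails.
try:
    next(lst)
    list_is_iterator = True
except TypeError:
    list_is_iterator = False

print(first, second, same_object, list_is_iterator)
```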

In summary, the for statement:

  1. converts the iterable object to an iterator
  2. fetches all the elements of the iterator one by one
  3. handles the StopIteration exception silently

Whenever the interpreter needs to iterate over an object x, it automatically calls iter(x). The iter built-in function:

  1. Checks whether the object implements __iter__, and calls that to obtain an iterator.
  2. If __iter__ is not implemented, but __getitem__ is implemented, Python creates an iterator that attempts to fetch items in order, starting from index 0 (zero).
  3. If that fails, Python raises TypeError, usually saying “C object is not iterable,” where C is the class of the target object. — Luciano Ramalho, Fluent Python
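
The fallback described in the quoted steps can be observed directly. The sketch below uses two hypothetical classes, Squares (only __getitem__) and Plain (neither method), invented here for illustration.

```python
class Squares:
    """Hypothetical class that defines __getitem__ but not __iter__."""

    def __init__(self, n):
        self.n = n

    def __getitem__(self, index):
        # Raising IndexError tells the fallback iterator to stop.
        if index >= self.n:
            raise IndexError(index)
        return index * index


class Plain:
    """Hypothetical class with neither __iter__ nor __getitem__."""


# iter() falls back to calling __getitem__ with 0, 1, 2, ...
squares = list(Squares(4))

# With no usable method, iter() raises TypeError.
try:
    iter(Plain())
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(squares)
print(error_message)
```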

Therefore, an iterable should implement __iter__, which builds and returns an iterator. The iterator, in turn, implements __next__ and __iter__ (which returns the iterator itself).
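
As a minimal sketch of this split, here is one possible iterable and iterator pair. The class names Countdown and CountdownIterator are hypothetical and are not taken from the references.

```python
class Countdown:
    """Iterable: __iter__ builds a fresh iterator on every call."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return CountdownIterator(self.start)


class CountdownIterator:
    """Iterator: __next__ produces values, __iter__ returns itself."""

    def __init__(self, current):
        self.current = current

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


countdown = Countdown(3)
first_pass = list(countdown)
second_pass = list(countdown)  # a new iterator each time, so the values repeat
print(first_pass, second_pass)
```

Keeping the iterator separate from the iterable is what lets the same Countdown object be looped over more than once.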

Generators

Any Python function that has the yield keyword in its body is a generator function: a function which, when called, returns a generator object. In other words, a generator function is a generator factory. The only syntax distinguishing a plain function from a generator function is the fact that the latter has a yield keyword somewhere in its body.

Let’s take an example:

>>> def gen():
...     yield 1
...     yield 2
...     yield 'Hello'

>>> gen
<function gen at 0x105ed0730>

>>> gen()
<generator object gen at 0x1062c8db0>

>>> for i in gen():
...     print(i)
1
2
Hello


>>> g = gen()

>>> next(g)
1

>>> next(g)
2

>>> next(g)
'Hello'

>>> next(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

In the example above, the generator function gen builds a generator object g. Each time we invoke next(g), the generator runs until the next yield, returns that value to us, and suspends. Finally, once all the values have been returned, the generator object raises StopIteration.
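
To make the suspension visible, the sketch below records what the generator body has executed at each step. The function gen_with_log and its log parameter are hypothetical additions for illustration.

```python
def gen_with_log(log):
    """Generator function that records each stage of its execution."""
    log.append('started')
    yield 1
    log.append('resumed after 1')
    yield 2
    log.append('finished')


log = []
g = gen_with_log(log)

# Calling the generator function does not run any of its body yet.
log_after_call = list(log)

first = next(g)               # runs up to the first yield, then suspends
log_after_first = list(log)

second = next(g)              # resumes and runs up to the second yield

# The third call runs the rest of the body and raises StopIteration.
try:
    next(g)
    exhausted = False
except StopIteration:
    exhausted = True

print(log_after_call, log_after_first, log, exhausted)
```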

References

  1. Fluent Python, Luciano Ramalho
  2. Generator Tricks for Systems Programmers, David Beazley
