- Source: itsCoder's WeeklyBolg project
- itsCoder homepage: http://itscoder.com/
- Author: Manjusaka
- Reviewers: allenwu, Brucezz
Foreword#
I originally planned to continue writing about Flask this week, but after some thought, I decided to change things up and talk about the very important but often misunderstood generators and coroutines in Python.
Generators Explained#
I guess everyone is somewhat familiar with generators, but to allow me to continue to show off, let's take some time to explain what a generator is.
For example, in Python, if we want to generate a list within the range (1,100000), we might write the following code without much thought:
def generateList(start, stop):
tempList = []
for i in range(start, stop):
tempList.append(i)
return tempList
Note 1: Some might ask why we don't just return
range(start, stop)
. Nice question! This involves a fundamental issue regarding howrange
works. This varies by version; in Python 2.x,range(start, stop)
essentially pre-generates alist
, and thelist
object is an Iterator, which can be used in afor
statement.
In Python 2.x, there is also a statement calledxrange
, which generates a Generator object.
In Python 3, things changed a bit; the community felt that the split betweenrange
andxrange
was too cumbersome, so they merged them. Now in Python 3, the syntax forxrange
has been removed, and the mechanism forrange
has changed to generate a Generator instead of alist
.
But have you ever considered the problem that if we want to generate a very large amount of data, pre-generating it is undoubtedly unwise, as it would consume a lot of memory? Thus, Python provides us with a new approach: Generator.
def generateList1(start, stop):
for i in range(start, stop):
yield i
if __name__ == "__main__":
c = generateList1(1, 100000)
for i in c:
print(i)
Yes, one of the features of a Generator is that it does not generate data all at once; instead, it generates an iterable object that controls its activation timing based on the logic we write.
Deep Dive into Generators#
You might wonder if Python developers would create a Generator mechanism just for this use case. So, do we have other scenarios for using Generators? Of course! Look at the title; indeed, another significant use of Generators is as coroutines. However, before we delve into that, we need to understand Generators more deeply to facilitate our later discussions.
Built-in Methods of Generators#
Background Knowledge on Iterable Objects in Python#
First, let's look at the iteration process in Python.
In Python, there are two concepts in iteration: Iterable and Iterator. Let's examine them separately.
Firstly, Iterable can be understood as a protocol. The way to determine if an Object is Iterable is to check if it implements iter. If it does, then it can be considered an Iterable object. Let's look at some code to understand this better:
class Counter:
def __init__(self, low, high):
self.current = low
self.high = high
def __iter__(self):
return self
def next(self): # Python 3: def __next__(self)
if self.current > self.high:
raise StopIteration
else:
self.current += 1
return self.current - 1
if __name__ == '__main__':
a = Counter(3, 8)
for c in a:
print(c)
Now, let's see what happens in the code above. First, the for
statement checks whether the object being iterated is an Iterable or an Iterator. If it is an object that implements the __iter__
method, then it is an Iterable object. The for
loop first calls the object's __iter__
method to obtain an Iterator object. So what is an Iterator object? It can be roughly understood as one that implements the next() method (Note: in Python 3, it's the next method).
OK, let's return to what we were discussing. In the code above, the for
statement first determines whether it is an Iterable or an Iterator. If it is an Iterable, it calls its iter method to get an Iterator object, and then the for
loop will call the next()
(Note: in Python 3, it's __next__
) method of the Iterator object to iterate until the iteration ends and raises a StopIteration
exception.
Now, Let's Talk About Generators#
Let's revisit the earlier code:
def generateList1(start, stop):
for i in range(start, stop):
yield i
if __name__ == "__main__":
c = generateList1(1, 100000)
for i in generateList1:
print(i)
First, we need to establish that Generators are also Iterator objects. Let's look at the code above. The for
statement confirms that generateList1
is an Iterator object, and then it starts calling the next()
method for further iteration. At this point, you might wonder how the next()
method allows generateList1
to continue iterating. The answer lies in the built-in send()
method of the Generator. Let's look at another piece of code.
def generateList1(start, stop):
for i in range(start, stop):
yield i
if __name__ == "__main__":
a = generateList1(0, 5)
for i in range(0, 5):
print(a.send(None))
What should we expect as output here? The answer is 0, 1, 2, 3, 4
, which is the same result we would get using a for
loop. Now we can conclude that:
The essence of Generator iteration is to call the built-in
send()
method through the built-innext()
or__next__()
method.
Continuing to Discuss Built-in Methods of Generators#
Earlier, we mentioned a conclusion:
The essence of Generator iteration is to call the built-in
send()
method through the built-innext()
or__next__()
method.
Now let's look at an example:
def countdown(n):
print("Counting down from", n)
while n >= 0:
newvalue = (yield n)
# If a new value got sent in, reset n with it
if newvalue is not None:
n = newvalue
else:
n -= 1
if __name__ == '__main__':
c = countdown(5)
for x in c:
print(x)
if x == 5:
c.send(3)
What should the output of this code be?
The answer is [5, 2, 1, 0]
. Isn't that confusing? Don't worry; let's look at the execution flow of this code.
In short, when we call the send()
function, the value we send(x)
is sent to newvalue
, and execution continues until the next yield
is encountered, returning the value as the end of the process. Our Generator quietly sleeps in memory, waiting for the next send
to wake it up.
Note 2: Someone asked, "I don't quite understand here; is
c.send(3)
equivalent toyield n
returning a 3 tonewvalue
?" Good question! We can see from the execution diagram earlier thatc.send(3)
first assigns3
tonewvalue
, then the program runs the remaining code until it encounters the nextyield
. At this point, the value ofn
has already changed to3
before we reachyield n
, soyield n
is essentially equivalent toreturn 3
. Then, the Generator freezes all variable states and quietly stays in memory, waiting for the nextnext
or__next__()
method orsend()
method to wake it up.
Tip: If we call
send()
directly, make sure to usesend(None)
the first time; only then is a Generator truly activated, allowing us to proceed with the next operation.
Let's Talk About Coroutines#
First, regarding the definition of coroutines, let's look at a segment from Wikipedia:
Coroutines are computer program components that generalize subroutines for nonpreemptive multitasking, by allowing multiple entry points for suspending and resuming execution at certain locations. Coroutines are well-suited for implementing more familiar program components such as cooperative tasks, exceptions, event loops, iterators, infinite lists, and pipes.
According to Donald Knuth, the term coroutine was coined by Melvin Conway in 1958, after he applied it to the construction of an assembly program. The first published explanation of the coroutine appeared later, in 1963.
In short, coroutines are a lighter model than threads, allowing us to control the timing of their start and stop. In Python, there isn't a dedicated concept for coroutines; the community generally views Generators as a special type of coroutine. Think about it: we can wake up our Generator using the next
or __next__()
method or the send()
method, and after executing the code we specified, the Generator returns and freezes all its states. Isn't that exciting?!!
A Little Homework on Generators#
Now we want to perform a post-order traversal of a binary tree. I know that those reading this article can easily write it out, so let's look at the code first:
class Node(object):
def __init__(self, val, left, right):
self.val = val
self.left = left
self.right = right
def visit_post(node):
if node.left:
return visit_post(node.left)
if node.right:
return visit_post(node.right)
return node.val
if __name__ == '__main__':
node = Node(-1, None, None)
for val in range(100):
node = Node(val, None, node)
print(list(visit_post(node)))
However, we know that if the recursion depth is too deep, we might either run out of stack space or fail the Python transaction. OK, the Generator method is good for keeping you safe, so let's look at the code directly:
def visit_post(node):
if node.left:
yield node.left
if node.right:
yield node.right
yield node.val
def visit(node, visit_method):
stack = [visit_method(node)]
while stack:
last = stack[-1]
try:
yielded = next(last)
except StopIteration:
stack.pop()
else:
if isinstance(yielded, Node):
stack.append(visit_method(yielded))
elif isinstance(yielded, int):
yield yielded
if __name__ == '__main__':
node = Node(-1, None, None)
for val in range(100):
node = Node(val, None, node)
visit_generator = visit(node, visit_method=visit_post)
print(list(visit_generator))
Looks complex, right? No worries; consider it homework. Everyone can leave comments, and we can discuss Python transactions together!
Reference Links#
- Improve Your Python: Explaining 'yield' and 'Generators'
- The Power of yield
- http://my.oschina.net/1123581321/blog/160560
- Why Must Python Iterators Implement the iter Method (Regarding iterators, I simplified some things for easier understanding; please refer to the highly-rated answers to this question for specifics.)