Python: How to Use a Generator Function
In programming, a generator is a routine that controls the iteration behavior of a loop. Generators are useful when you want to produce a large sequence of values without storing all of them in memory at once. That matters most in Python programs that produce a large amount of data, such as a long sequence of numbers.
Let’s take pi, for example. You might create a Python program that needs pi to a certain number of decimal places, which can be rather large. You wouldn’t want to store all of those digits in memory, because your computer (or the computer the application will run on) could run out of available memory. This is especially true because pi’s decimals never end. So, what do you do? You create a generator and avoid storing thousands (or possibly millions) of numbers in memory.
According to the Python wiki, “Generator functions allow you to declare a function that behaves like an iterator, i.e. it can be used in a for loop.”
Let’s take a look at a very simple generator function. This small app will print out a sequence of numbers.
To do this, we’re going to use the yield keyword. A function that contains yield is a generator function: calling it returns a generator object, and each yield hands one value back to the caller and pauses the function until the next value is requested. Our example will use a generator to print out the Fibonacci sequence. (We’ll stop at 100 numbers so you don’t wind up with too much output.)
For those who don’t know, this sequence is a series of numbers where each number is the sum of the two preceding numbers: 0+1=1, 1+1=2, 1+2=3, 2+3=5, and so on. (Our version starts the sequence at 1, 1.)
This small application will define a generator named fibonacci(n). Our generator looks like this:
def fibonacci(n):
    a = b = 1
    for i in range(n):
        yield a
        a, b = b, a + b
The line “a, b = b, a + b” evaluates the right-hand side first, using the current values of a and b, and then assigns the old value of b to a and the sum a + b to b. It’s a well-known tuple unpacking idiom that is covered in most Python documentation.
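To see why the one-line form works, it can help to compare it with a version that uses a temporary variable. This is only a minimal sketch; the function names step_tuple and step_temp are made up for illustration and are not part of the Fibonacci app.

```python
def step_tuple(a, b):
    # The right-hand side (b, a + b) is built first from the old values,
    # and only then unpacked into a and b.
    a, b = b, a + b
    return a, b


def step_temp(a, b):
    # Equivalent version that spells out the temporary variable.
    old_a = a
    a = b
    b = old_a + b
    return a, b


print(step_tuple(2, 3))  # (3, 5)
print(step_temp(2, 3))   # (3, 5)
```

Both functions produce the same pair, which shows that the tuple form is simply a shorter way to write the swap-and-add step.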
We use yield here because it’s memory efficient: each value is computed only when the caller asks for the next one, so the full sequence never has to exist in memory at once.
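One way to watch this on-demand behavior is to pull values from the generator by hand with the built-in next() function. The sketch below repeats the fibonacci(n) generator so that it runs on its own; the variable name gen is just an illustrative choice.

```python
def fibonacci(n):
    a = b = 1
    for i in range(n):
        yield a
        a, b = b, a + b


gen = fibonacci(5)   # calling the function runs none of its body yet
print(next(gen))     # 1
print(next(gen))     # 1
print(next(gen))     # 2
print(list(gen))     # [3, 5]  (whatever values are left)
```

Each call to next() resumes the function just long enough to reach the following yield, which is the same thing a for loop does behind the scenes.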
Next, we create a for loop to print out the Fibonacci sequence to 100 numbers like so:
for x in fibonacci(100):
    print(x)
The entire app looks like this:
def fibonacci(n):
    a = b = 1
    for i in range(n):
        yield a
        a, b = b, a + b

for x in fibonacci(100):
    print(x)
When you run the application, it will print the first 100 numbers in the Fibonacci sequence.
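Of course, we could do that without a generator, for example by building a list of all 100 numbers and then printing it. For 100 numbers the difference is negligible, but a list holds every value in memory at once, so for very long sequences we run the risk of using up system resources.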
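A rough way to see the difference is to compare the size of a list with the size of a generator object using sys.getsizeof(). This is a sketch under some assumptions: count_up_to is a hypothetical helper written for this comparison, the exact byte counts vary by Python version and platform, and getsizeof() reports only the container itself, not the integers stored in it.

```python
import sys


def count_up_to(n):
    # Hypothetical helper: yields 0, 1, ..., n - 1 one value at a time.
    i = 0
    while i < n:
        yield i
        i += 1


numbers_list = list(count_up_to(100_000))  # every value stored at once
numbers_gen = count_up_to(100_000)         # nothing computed yet

print(sys.getsizeof(numbers_list))  # grows with the number of items
print(sys.getsizeof(numbers_gen))   # small, and does not depend on n
```

The list has to reserve room for all 100,000 entries, while the generator object only keeps track of where it is in the function.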
Let’s go back to our pi example. Say you want to create a function that will calculate the digits of pi one after another, but you don’t want to run the risk of this function consuming all of your system resources. As you might have guessed, this task calls for a generator.
The first thing we must do is import the sys module, which gives access to parts of the Python runtime environment. Here we need it for sys.stdout.write(), which prints each digit without adding a newline after it:
import sys
Next, let’s define our generator, which we’ve called calcPi(), using a while loop:
def calcPi():
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4*q+r-t < n*t:
            yield n
            nr = 10*(r-n*t)
            n = ((10*(3*q+r))//t)-10*n
            q *= 10
            r = nr
        else:
            nr = (2*q+r)*l
            nn = (q*(7*k)+2+(r*l))//(t*l)
            q *= k
            t *= l
            l += 2
            k += 1
            n = nn
            r = nr
Our next section defines a variable pi_digits using the calcPi() generator, then uses a for loop to iterate through the values of pi_digits:
pi_digits = calcPi()
i = 0
for d in pi_digits:
    sys.stdout.write(str(d))
    i += 1
    if i == 50:
        print("")
        i = 0
The entire program looks like this:
import sys

def calcPi():
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4*q+r-t < n*t:
            yield n
            nr = 10*(r-n*t)
            n = ((10*(3*q+r))//t)-10*n
            q *= 10
            r = nr
        else:
            nr = (2*q+r)*l
            nn = (q*(7*k)+2+(r*l))//(t*l)
            q *= k
            t *= l
            l += 2
            k += 1
            n = nn
            r = nr

pi_digits = calcPi()
i = 0
for d in pi_digits:
    sys.stdout.write(str(d))
    i += 1
    if i == 50:
        print("")
        i = 0
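If you run this application, it will keep printing digits of pi until you stop it manually (for example, with Ctrl+C). Because the generator hands over one digit at a time, the program never has to hold the digits it has already printed. A version that tried to collect every digit in a list before printing would never finish and would keep consuming memory until the system ran out. (The integers the algorithm uses internally do keep growing, so memory use still rises slowly over a long run.)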
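If you only need a fixed number of digits, one option is to wrap the endless generator in itertools.islice() from the standard library, which stops after a given number of items. The sketch below repeats calcPi() so that it runs on its own; the name first_digits and the choice of 10 digits are illustrative assumptions.

```python
from itertools import islice


def calcPi():
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4*q+r-t < n*t:
            yield n
            nr = 10*(r-n*t)
            n = ((10*(3*q+r))//t)-10*n
            q *= 10
            r = nr
        else:
            nr = (2*q+r)*l
            nn = (q*(7*k)+2+(r*l))//(t*l)
            q *= k
            t *= l
            l += 2
            k += 1
            n = nn
            r = nr


# islice() pulls only the first 10 values, then stops asking for more,
# so the infinite while loop inside calcPi() is never run to exhaustion.
first_digits = list(islice(calcPi(), 10))
print(first_digits)  # [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
```

This works only because the generator is lazy: the loop inside calcPi() advances no further than the caller requests.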
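Not every generator has to be a function that uses the yield keyword. Python also has generator expressions, which look like list comprehensions wrapped in parentheses instead of square brackets. Take this simple example that produces the squares of the numbers 0 through 4. The app looks like this:
squarenum_generator = (i * i for i in range(5))
for i in squarenum_generator:
    print(i)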
What you see above is a generator object that produces the square numbers from 0 * 0 to 4 * 4, followed by a for loop that iterates over the generator to get generated values. In this example, the generator is the first line.
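For comparison, here is a small sketch that puts a list comprehension next to the equivalent generator expression. The variable names squares_list and squares_gen are illustrative choices, not part of the original example.

```python
squares_list = [i * i for i in range(5)]  # list comprehension: builds the whole list now
squares_gen = (i * i for i in range(5))   # generator expression: produces values on demand

print(squares_list)       # [0, 1, 4, 9, 16]
print(sum(squares_gen))   # 30
print(list(squares_gen))  # []  (a generator can be consumed only once)
```

The last line shows one trade-off worth keeping in mind: once a generator has been iterated to the end, it is exhausted, and you need to create a new one to go through the values again.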
That’s the true beauty of Python generators. Instead of running the risk of exhausting memory, you keep usage in check by producing values only as they are needed rather than storing massive amounts of information.