Testing and Memory

Today we’re going to have the talk about testing we didn’t get to last time. We’ll use that to frame a few useful discussions about what happens when we run a Python program.

More About Testing

Let’s write some tests for count_words. Since my lecture livecode from last time lives in lecture03.py, we’ll put testing code in a file called test_lecture03.py. At the top of that file, we can put: from lecture03 import *.

We’ll see more examples of this later. For now, the only thing to remember is that this lets us refer to all of the functions we have written in lecture03.py. That will be useful if we want to test them.

The convention we’ll use for this class (and which is often used in practice) is that each function in the file we’re testing (lecture03.py) gets a corresponding function in the test file whose name starts with test_. So, we’ll write tests like this:

def test_count_words():
   pass

What would be a good set of test cases for count_words? Remember that it takes a string and returns a dictionary.

Think, then click!

Here’s an example set of tests:

def test_count_words():
   assert count_words("") == {}
   assert count_words("word") == {"word": 1}
   assert count_words("word word") == {"word": 2}
   assert count_words("word other word") == {"word": 2, "other": 1}


Of course, unless we actually call test_count_words, the tests will never be run! We could do that explicitly, like last time, but today we’ll start using a more professional tool called pytest.
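For reference, calling the test explicitly might look like the sketch below. The count_words here is only a hypothetical stand-in so that the example runs on its own; the real function from lecture03.py may treat punctuation or capitalization differently.

```python
# Hypothetical stand-in for count_words from lecture03.py (an assumption:
# it splits on whitespace and does no other cleanup).
def count_words(text):
    counts = {}
    for word in text.split():
        counts[word] = counts.get(word, 0) + 1
    return counts


def test_count_words():
    assert count_words("") == {}
    assert count_words("word") == {"word": 1}
    assert count_words("word word") == {"word": 2}
    assert count_words("word other word") == {"word": 2, "other": 1}


# Calling the test by hand: a failing assert raises AssertionError,
# so reaching the print means every assertion held.
test_count_words()
print("all tests passed")
```

With pytest, the explicit call at the bottom is unnecessary: pytest finds functions whose names start with test_ and runs them for us.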

You might already have pytest installed! You can check by running pytest (instead of python3) on your test file. You can also run it by typing python3 -m pytest; that’s what I’ll be doing in class. If you don’t have pytest installed, there are a few ways to get it. One is to type python3 -m pip install pytest into the terminal; we give more instructions in our setup guide. If you run into issues with this, definitely reach out!

Data Structures and Memory

Let’s talk a bit more about how Python lists, sets, and other collections work. Suppose I create a new list and then immediately update it, like so:

dry_ingredients = ['flour', 'baking powder']
dry_ingredients = dry_ingredients + ['cinnamon']

The contents of dry_ingredients have changed! The name now refers to a list containing ['flour', 'baking powder', 'cinnamon'].
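As a small aside, there are two different ways a list’s contents can "change", and the built-in id function can help tell them apart. The sketch below is illustrative only; 'nutmeg' and the variable names id_before, rebound, and same_list are made up for this example.

```python
dry_ingredients = ['flour', 'baking powder']
id_before = id(dry_ingredients)

# + builds a brand-new list, and the name is re-bound to that new list.
dry_ingredients = dry_ingredients + ['cinnamon']
rebound = id(dry_ingredients) != id_before

# append changes the existing list in place; the name still refers
# to the very same list object.
id_before = id(dry_ingredients)
dry_ingredients.append('nutmeg')
same_list = id(dry_ingredients) == id_before

print(dry_ingredients)     # ['flour', 'baking powder', 'cinnamon', 'nutmeg']
print(rebound, same_list)  # True True
```

Either way the name ends up referring to a longer list, but only append modifies a list that already exists.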

Sets work the same way: they aren’t “atomic” values like integers; they are storage containers for collections. This can sometimes have unintended consequences! Let’s look at this Python program:

recipes = {'pbj': {'peanut butter', 'jelly', 'bread'},
           'smoothie': {'peanut butter', 'banana', 'oat milk'}}
chocolate_smoothie = recipes['smoothie']
chocolate_smoothie.add('cocoa powder')
berry_smoothie = recipes['smoothie'] | {'berries'}

What do all of these collections look like at the end of the program? (Try it!)

Think, then click!

recipes={'pbj': {'bread', 'jelly', 'peanut butter'}, 'smoothie': {'cocoa powder', 'oat milk', 'banana', 'peanut butter'}}
chocolate_smoothie={'cocoa powder', 'oat milk', 'banana', 'peanut butter'}
berry_smoothie={'cocoa powder', 'oat milk', 'peanut butter', 'banana', 'berries'}

Notice that recipes has changed!
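To try this yourself, one option is to add a few print calls to the end of the program, as in the sketch below. The f"{name=}" form needs Python 3.8 or later. Since sets are unordered, the elements may print in a different order on your machine.

```python
recipes = {'pbj': {'peanut butter', 'jelly', 'bread'},
           'smoothie': {'peanut butter', 'banana', 'oat milk'}}
chocolate_smoothie = recipes['smoothie']
chocolate_smoothie.add('cocoa powder')
berry_smoothie = recipes['smoothie'] | {'berries'}

# Print each collection along with its name.
print(f"{recipes=}")
print(f"{chocolate_smoothie=}")
print(f"{berry_smoothie=}")

# 'cocoa powder' shows up inside recipes, but 'berries' does not.
print('cocoa powder' in recipes['smoothie'])  # True
print('berries' in recipes['smoothie'])       # False
```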

To help understand what’s going on, let’s draw a picture. Sets, dictionaries, and lists in Python live someplace in memory. When we give a name to, say, a dictionary (by writing recipes = ...), Python makes that name refer to the dictionary’s place in memory. In this picture, we see recipes pointing to the actual dictionary table, stored in memory. Similarly, since the dictionary’s values are sets, each of those has its own location in memory, and the dictionary stores a reference to each one.
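If you want to poke at this idea in code, the built-in id function returns an integer that identifies an object. In CPython (the standard Python implementation), that integer happens to be the object’s memory address. This is a rough sketch, and the variable name distinct is made up for the example; the exact numbers will differ from run to run.

```python
recipes = {'pbj': {'peanut butter', 'jelly', 'bread'},
           'smoothie': {'peanut butter', 'banana', 'oat milk'}}

# Each of these is a separate object with its own identity.
print(id(recipes))
print(id(recipes['pbj']))
print(id(recipes['smoothie']))

# The dictionary and its two sets are three different objects.
distinct = len({id(recipes), id(recipes['pbj']), id(recipes['smoothie'])}) == 3
print(distinct)  # True
```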

Two Possibilities

The question is, what does chocolate_smoothie refer to, once we assign the name? There are two possibilities: either it refers to a new copy of the set (a new identity entirely!), or to the already-existing set that recipes refers to (an alias for an old identity!). Which do you think it is?