Testing Classes, Inheritance

Class today didn’t exactly reflect these notes; it’s worth skimming them and watching the lecture capture.

Livecode link

Some quick notes:

To tell whether a field is class-specific or instance-specific, just see if it’s defined in the __init__ method (instance-specific) or outside the __init__ methods (class-specific).

Another Example of Polymorphism: str and repr

Python uses polymorphism ubiquitously. For example, recall that dataclass instances printed out in a nice way. How can we get similar behavior with our own classes?

We can implement the __str__ and __repr__ methods:

class Book:
    # other methods...

    # "human readable" string representation
    def __str__(self):
        return self.title + " by " + self.author
    # "unambiguous" string representation
    def __repr__(self):
        return 'Book("' + self.title + '", "' + self.author + '")'

These two methods sound similar, but they get called in different places internally by Python. By convention, __str__ is for end-user focused output, and you want it to be human readable. In contrast, __repr__ should be precise and unambiguous (so I wrote it to produce a string that is, itself, Python code).

If I run print(library[0]), it prints the book’s title, like: A Book Title by A Book Author. But if I’m in the Python console and I just type library[0], it will produce Book("A Book Title", "A Book Author").

Note that this only affects the way Python displays the object; it doesn’t change anything about the object’s internal representation. It’s generally good practice to define at least the __repr__ method in all of your classes. It makes debugging a lot easier when you get an understandable string instead of an object type and ID!

Methods that use the double-underscore notation are usually understood to be special, and interact with some internal Python functionality. By convention we don’t call them directly, but via syntactic sugar they provide. We don’t invoke __init__ to make a new object, we use the ClassName(...) syntax instead. Likewise, we’ll use (e.g.) repr(an_object) and str(an_object).

If you don’t define a __str__, Python will fall back to using __repr__ instead. And some kinds of collections, like lists, may invoke either of these if you call str or repr on them.

Python classes have quite a few of these standard method names. Another is __eq__, which lets you define how an object should decide whether it’s equal to another. We’ll talk about this, and other such methods, later.

Testing Classes

How should we test our library program? We should write tests for each of the methods. Let’s test the Book class:

from library import *

def test_init():
    b = Book("The Nickel Boys", "Colson Whitehead")
    assert b.title == "The Nickel Boys"
    assert b.author == "Colson Whitehead"

def test_matches():
    b = Book("The Nickel Boys", "Colson Whitehead")
    assert b.matches("Nickel")
    assert not b.matches("Parasite")
    assert b.matches("Colson")

Notice the pattern here: we’re creating objects in our test functions in order to test the objects’ methods.

Equality

It’s worth noting, again, that equality is complicated. It’s also used in many different places. For example:

library = [
    Book("The Calculating Stars", "Mary Robinette Kowal"),    
    TVSeries("Guardian", 40, ["Bai Yu", "Zhu Yilong"])
]
print(Book("The Calculating Stars", "Mary Robinette Kowal") in library)

What does this produce? We might expect True (because the two books are identical) or False (because the two books are different objects). When we voted on this question earlier in the semester for lists, we decided that we wanted two different list objects that contained the same elements in the same order to be equal (==) in Python, and that’s what Python does. So what’s going on here?

The problem is that Python has no idea which fields are important to us when deciding equality. Dataclasses automatically define equality to use all fields, but here we need to tell Python exactly what it means for two books to be the same. To tell Python how to interpret equality for Book objects, we’ll define the __eq__ method:

    def __eq__(self, other):
        return self.author == other.author and self.title == other.title

Now the above print statement produces True like we’d expect.

Important note: There’s more to this story that we’ll get to later in the semester. For now, defining equality like this should be OK. Pay special attention to equality if you use your objects as keys in data structures that depend on ideas related to equality. If you use objects as keys in something like a dictionary, which uses the hash table idea, you’ll also need to define the __hash__ method.

Adding checkout

Let’s say we want to add “checkout” behavior to our library: we want to be able to record that certain items have been checked out or returned, and take this into account when searching. Let’s add this behavior to our tests first:

def test_init():
    b = Book("The Nickel Boys", "Colson Whitehead")
    assert b.title == "The Nickel Boys"
    assert b.author == "Colson Whitehead"
    assert not b.checked_out

def test_checkout():
    b = Book("The Nickel Boys", "Colson Whitehead")
    b.library_checkout()
    assert b.checked_out

def test_return():
    b = Book("The Nickel Boys", "Colson Whitehead")
    b.library_checkout()
    b.library_return()
    assert not b.checked_out

def test_matches():
    b = Book("The Nickel Boys", "Colson Whitehead")
    assert b.matches("Nickel")
    assert not b.matches("Parasite")
    assert b.matches("Colson")
    
    b.library_checkout()
    assert b.matches("Nickel")
    assert not b.matches("Parasite")
    assert b.matches("Colson")

Then we can add the behavior to our Book class:

class Book:
    def __init__(self, title: str, author: str):
        self.title = title
        self.author = author
        self.checked_out = False

    def library_checkout(self):
        self.checked_out = True

    def library_return(self):
        self.checked_out = False

   def matches(self, query: str) -> bool:
       return (not self.checked_out) and (query in self.title or query in self.author)

Note that this is a design choice: we could put the check in the class, or we could put the check in the search function. I’d argue that putting the method here, in the class, is actually less preferable: what if a librarian needs to search the library for books, so that they can track down what’s missing? Better to allow the library functions to make that distinction, and not force it in the matches method.

So, if I were writing this again, I’d design it differently.

Inheritance

Right now, we’re considering each class to be totally separate, with no shared code or data. For instance, even though Books and Movies both implement a matches method, the two methods have completely different implementations. Inheritance gives us a way to share code between classes. Classes can inherit from other classes, like this:

class A:
    pass # fill in "parent"

class B(A):  # note the parameter (A)
    pass # fill in "child"

When there’s an inheritance relationship like this between two classes, we say that A is the superclass of B and B is a subclass of A.

Let’s see an example of inheritance in action. Suppose that our library wants to track which items have been checked out, and only return items in a search if they are actually available. First, we’ll implement a LibraryItem class, which we’ll use as the superclass for both books and movies:

class LibraryItem:
    def __init__(self):
        self.checked_out = False

    def library_checkout(self):
        self.checked_out = True
    def library_return(self):
        self.checked_out = False

The LibraryItem class only handles checking items out and back in; it doesn’t know anything about what the items actually are.

In order to use our new parent class, we’ll make some changes to the Book and Movie classes:

class Book(LibraryItem):
    def __init__(self, title: str, author: str):
        self.title = title
        self.author = author
        super().__init__()

class Movie(LibraryItem):
    def __init__(self, title: str, director: str, actors: list):
        self.title = title
        self.director = director
        self.actors = actors
        super().__init__()

(This is one of the few places where the convention above is conventionally violated: we call __init__ directly here, from within a subclass’s __init__.)

Both Book and Movie objects will now automatically have a checked_out field, as well as the checkout method. We can use the field in each class’s matches method:

class Book(LibraryItem):
      # other methods elided...
      def matches(self, query: str) -> bool:
          return (not self.checked_out) and (query in self.title or query in self.author)

class Movie(LibraryItem):
      # other methods elided...
      def matches(self, query: str) -> bool:
          return (not self.checked_out) and
             query in self.title or query in self.director or 
             any([query in actor for actor in self.actors])

…although, again, if I were re-writing these, I’d probably put the check in the search method. Better yet, I’d probably create a Library class to handle the idea of checking books in and out, rather than making the actual item keep track itself.

Inheritance and Exceptions

I’ve released a “bonus content” set of notes that talks about this, filed under October 27th.