Designing Object-Oriented Programs

These notes are in draft form, and are released for those who want to skim them in advance. Tim is likely to make changes before and after class.

Livecode link Old Livecode link

Animal classes

Let’s (pretend to) use Python to model some ecological ideas.

Consider four concepts that we will model using classes:

Types of objects can stand in different kinds of relations to each other. One kind is an “is-a” relationship: a tiger is an animal, and a moose is an animal. In Python we typically represent this relationship using subclasses: we would define a superclass Animal that is extended by the Tiger and Moose subclasses.

class Animal:
  pass 

class Tiger(Animal):
  pass 

class Moose(Animal): 
  pass

Where does a habitat fit into this picture? An animal isn’t a habitat, and a habitat isn’t an animal. There’s functionality that a habitat has that an animal doesn’t, and vice versa.

But there’s still a connection: habitats “contain” animals. We can call this a “has-a” relationship. These tend to be represented by data fields in an object.

class Habitat:
  def __init__(self):
    self.animals = []

Car classes

We can play the same game with another list of concepts:

Every convertible is a car, so this is an is-a relationship. And every car has an engine. (Transitively, every convertible has an engine.)

class Engine:
  pass

class Car:
  def __init__(self):
    self.engine = Engine()

class Convertible(Car):
  def __init__(self):
    super().__init__()

miata = Convertible()
print(miata.engine)

Library classes

For yet another example, consider the Library code we’ve been writing. Books and TVSeries are “library items”, so they are subclasses of LibraryItem. But if we define a Library class that contains a list of library items, the picture gets more complicated. What is the has-a relationship? A Library has library items, but each library item is also associated with some library, so the arrow could go the other direction. How do we represent this in our code? It depends what we want to model! Depending on how our code will be applied, we may want a library to track its items, a library item to track its library, or perhaps both.

Tracking a bi-directional “has-a” relationship can be a bit tricky. Consider this code:

class Library:
    def __init__(self):
        self.items = []

    def add(self, item):
        self.items.append(item)

    def search(self, query: str) -> list:
        return [item for item in self.items if item.matches(query)]

class LibraryItem:
    def __init__(self, library: Library):
        self.checked_out = False
        self.library = library

    def checkout(self):
        self.checked_out = True

class Book(LibraryItem):
    def __init__(self, title: str, author: str, library: Library):
        self.title = title
        self.author = author
        super().__init__(library)

    def matches(self, query: str) -> bool:
        return (not self.checked_out) and (query in self.title or query in self.author)

class Movie(LibraryItem):
    def __init__(self, title: str, director: str, actors: list, library: Library):
        self.title = title
        self.director = director
        self.actors = actors
        super().__init__(library)

    def matches(self, query: str) -> bool:
        return (not self.checked_out) and \
            query in self.title or query in self.director or \
            any([query in actor for actor in self.actors])

prov_pub = Library()
cc = Book("Cat's Cradle", "Kurt Vonnegut", prov_pub)

We’ve created a book that belongs to the Providence Public Library, but the library doesn’t know about it yet! This isn’t exactly broken code, but it could get very confusing – it’s a reasonable assumption for us to make as developers that a book belongs to a library if and only if that book is in that library’s list of items. We might write code that assumes this and will break when it comes across this orphaned copy of Cat’s Cradle. So we need to be careful, either not to break this symmetry, or not to assume that this symmetry holds.

More generally we call properties that we want to assume always hold, and may rely on that in our programming, invariants.

How could we fix this problem in the above code?

Think, then click!

We could add the book to the library in the book’s __init__ method. We’d have to be careful, though: equality is important here. For example, do we like the idea of the library being able to have multiple copies of the same book? Then using a list rather than a set to store the library’s contents is a good idea. But then what should we do if someone manually adds the same book object later—do we want to allow duplicates of the same physical book?

Even more, we should avoid directly setting or modifying fields of another class; ideally the class would provide a method to make that change instead. Why is this important? Suppose that a library needed to keep its catalog sorted. A book or movie etc. wouldn’t necessarily know that, and indeed shouldn’t have to!. Making the addition via a method allows the library to perform any post-processing, repair, access control, etc. it might need to.

There are other concerns, too, such as the fact that we now require every book to belong to a library. Thinking through these use cases is a major part of object-oriented design (and software design in general). ## Space classes For one final inheritance exercise: let's figure out the relationships between the terms * Star * Planet * Gas giant * Galaxy * Spiral Galaxy * Earth * Jupiter
Think, then click! The answer isn't unique. One reasonable way to do it would be: A spiral galaxy is a galaxy. A gas giant is a planet. Earth is a planet. Jupiter is a gas giant. A galaxy has stars and planets. A star has planets (that orbit it). But let's focus on the lines "Earth is a planet. Jupiter is a gas giant." Again, depending on what exactly we're trying to model, this could be interpreted in _different ways_: ```python class Planet: pass class GasGiant(Planet): pass class Earth(Planet): pass class Jupiter(GasGiant): pass ``` or, instead, ```python class Planet: pass class GasGiant(Planet): pass earth = Planet() jupiter = Planet() ``` A class describes a _kind_ of thing. So: * if we wanted to model multiple versions of Earth that had different kinds of pertinent data, or provided different kinds of interaction via methods, we'd create separate classes for them (say, `Earth` and `AlternateEarth`; but * if we wanted to model multiple versions of earth that had the same structure but perhaps very different underlying data -- say, Earth now vs. a billion years ago -- we could create two objects of the `Earth` (or even `Planet`) class.
## Dataclasses One last note on setting up classes. We used the `@dataclass` feature for a while, and then slowly dropped it for more flexible plain `class`es. But using `@dataclass` does some nice things for us. We can display them easily (because `__str__` and `__repr__` are automatically defined), and also compare them (because `__eq__` is defined): two objects of the same dataclass are equal exactly when _all_ of their fields are equal. As a general heuristic, it makes sense to use dataclasses to store "records," pieces of key-value data that you might use a dictionary to store, but want a bit more structure (e.g., no accidentally referencing the wrong field). Frozen dataclasses (`@dataclass(frozen=True)`) are especially useful for representing immutable data, even if you don't plan on placing them in a `set` or use them as keys in a `dict`. In our library example: it might make sense to track authors, directors, etc. using a `Person` dataclass that stores a name, year of birth, etc. This is static information. But it would make less sense for a `Library` to be a `@dataclass` of the sort we've been using, because we need more customization.