Welcome to CSCI 0112

Welcome to CSCI 0112!

Today we’re going to do a little bit of discussion, set the stage for the class, and learn about some historical context.

How to use these notes

Notes in 0112 contain pauses for in-class exercises where appropriate. So if you’re reading these notes without having been in lecture, and you see a collapsible section, don’t immediately expand it. Think about the question first! If you don’t, you’ll be robbing yourself of a chance to participate and learn.

So that you can spot them in the future, exercises will look like the following:

Think, then click!

Answer here!

The answers I give in lecture notes are frequently not the only reasonable answer; in class someone often gives a good answer that wasn’t in my notes! So don’t worry about whether my answer is the same as yours.

All notes in 0112 are written by Tim Nelson (the current instructor). In some cases, they are adapted from material by Doug Woos, Kathi Fisler, Rob Lewis, and others involved with the development of the course.

What is “good” code?

You’ve probably written several programs before—likely in CSCI 0111, although some of you may have had different first courses in programming. So you’re starting to develop a sense of what might make a program “good” or not. What do you think makes a program “good”?

Think, then click!

Frequent answers to this question include:

readability
extensibility
performance
fitness to purpose
positive impact on the world, rather than negative
safety or robustness against error

This semester, we’ll learn how to write programs that are “better” in all of these ways. Since this is a shortened lecture, we won’t dive in very deep today, but we’ll talk about a useful historical example.

Landing Humans on the Moon

(And returning them safely to the earth!)

Margaret Hamilton was the Director of the Software Engineering Division at MIT’s Draper Laboratory in the 1960s. Among other things, she was in charge of the team that developed the in-flight software that ran on the Apollo 11 mission, which took humans to the Moon. She was also one of the inventors of the term “software engineering”, which describes the study of how to effectively build reliable software.

How many lines of code would you guess were in the Apollo in-flight system?

Think, then click!

About 145,000.

In CSCI 0111, you mostly wrote programs of fewer than 100 lines of code or so (usually far fewer). When writing a program of that size, you might be able to hold the whole thing in your head at once; the whole thing might even fit on your screen! When you’re writing 145,000 lines of code, that’s probably not possible.

Is this a problem? Why?

Think, then click!

Because software is written by people.

The Apollo 11 team had to be able to reliably think about particular pieces of code in isolation, without worrying about the details of other parts of the code. That’s a lot harder than it may sound. In a large system, it’s very easy for one component to influence the behavior of another. This means that techniques for developing components in isolation are vital.

If the Apollo 11 team hadn’t managed that, lives would have been put at risk.

An example of this kind of technique: functions! When you write a function that does some particular task, the code you write that calls the function doesn’t need to worry about the function’s body. We’ll be talking a lot about other similar ideas in 0112.
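
Here’s a minimal sketch of that idea in Python. The function names (average and describe_scores) and the scores are made up for illustration; they aren’t from any assignment. Notice that describe_scores only relies on what average promises to return, not on how its body computes it.

```python
def average(numbers):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(numbers) / len(numbers)


def describe_scores(scores):
    """Build a one-line summary of a list of scores.

    This function calls average() without knowing (or caring) how
    average() does its work; it only depends on what average() returns.
    """
    return f"{len(scores)} scores, average {average(scores):.1f}"


print(describe_scores([90, 80, 70]))  # prints: 3 scores, average 80.0
```

If we later rewrote the body of average (say, to use a loop instead of sum), describe_scores wouldn’t need to change at all, so long as average still returned the same results.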

What did their code look like, anyway?

By the way, here’s a chunk of the Apollo 11 code we grabbed at random (the source code is available here):

SETWO           TC      WOZERO          # GO SET WORD ORDER CODE TO ZERO.
        +1      CA      DNECADR         # RELOAD A WITH THE DNADR.
        +2      AD      MINB1314        # IS THIS A REGULAR DNADR?
                EXTEND
                BZMF    FETCH2WD        # YES.  (A MUST NEVER BE ZERO)
                AD      MINB12          # NO.  IS IT A POINTER (DNPTR) OR A
                EXTEND                  #       CHANNEL(DNCHAN)
                BZMF    DODNPTR         # IT'S A POINTER.  (A MUST NEVER BE ZERO)

DODNCHAN        TC      6               # (EXECUTED AS EXTEND)  IT'S A CHANNEL
                INDEX   DNECADR
                INDEX   0       -4000   # (EXECUTED AS READ)
                TS      L
                TC      6               # (EXECUTED AS EXTEND)
                INDEX   DNECADR
                INDEX   0       -4001   # (EXECUTED AS READ)
                TS      DNECADR         # SET DNECADR
                CA      NEGONE          #       TO MINUS
                XCH     DNECADR         #               WHILE PRESERVING A.
                TCF     DNTMEXIT        # GO SEND CHANNELS

I can’t completely understand what this code does, because I don’t know the language. But let’s look at it together anyway. We can learn things from it without understanding it. What do you notice about this code?

Think, then click!

Here are a couple of the things I noticed:

Almost every line of code has a comment explaining what it’s doing! (The comments are the bits after the # on each line.)
Nevertheless, the code is pretty hard to read. It’s written in a very old, very low-level programming language called AGC assembly language.

Luckily, in this class you’ll be working in Python, not AGC assembly language. Python has built-in support for the kinds of software engineering techniques and concepts we’ll learn in the course, like functions. However, clean code with well-isolated components can be written in any programming language! Modern languages just make it easier.

As an aside: Apollo 11 had 145,000 lines of code. Google has about 2 billion lines of code (as of 2020). Part of what makes that possible is that Google’s code is mostly written in languages like Python, not in AGC assembly language, and can take advantage of modern methods of organization and engineering.

Measure twice, cut once

Carpentry, and other physical engineering disciplines, have a saying: “Measure Twice, Cut Once!”. Why is this so important for them? Cost! If you’re building something, and you cut a piece of lumber too short or make the cut in the wrong direction relative to the grain, you may have to pay:

a cost in materials (you likely need an entirely new piece of wood); and
the cost in time (all the effort involved in preparing the wood up to that point is lost).

So if you find yourself building something, measure twice! Measuring three times would be OK too.

Fortunately, we don’t have the same problem when we’re building software, right? Because code is just data, you never really lose any materials if you make a mistake. And because compilers are pretty fast these days, you lose hardly any time at all.

Stop. Think carefully. It’s very comforting to us as engineers to think that our mistakes won’t have as much impact as they might in other disciplines. We might have even heard similar statements from important, experienced, revered people. The above paragraph might even be correct from a certain point of view.

So: stop and think carefully. Is it actually the case, for you, as a working software engineer, that your mistakes are “free”? Look for the costs.

Think, then click!

If you haven’t kept backups of your code from before you broke it, you might have to re-create it. Even if you’ve got backups, finding the right version might take a lot of work.
If you discover your error late, it’s going to be harder to fix. For a physical-engineering example, see the London Millennium Bridge. For a software example, see literally every entry in this list of historical Windows security bugs.
Is recompiling the fix really the biggest time sink when you have a bug? You’ve fixed bugs before. Setting aside typos, did any of them take less time to fix than to recompile? Very few. Focusing on compilation time is outdated and misleading.
At scale, all these factors become even worse! Different pieces of code, and duplicates of the same code, might be running on different machines around the world.
Is time really the main factor here? We’re humans, and we (at time of writing) are limited by the nature of the human hybrid chemical/electric brain. We get tired, and work less efficiently. We get angry or frustrated, and work less efficiently. We all have a limited amount of executive, creative, social, and analytical energy. Debugging chews up mental resources you could have spent in other ways. In class, I’ll call this resource “brain juice”, and it’s why sometimes you can angrily debug for hours, go to sleep, and solve the problem in 2 minutes after waking up.

There are at least three takeaways. First, bugs aren’t cheap. It’s worth learning to think about how we write code and what we’re trying to accomplish when we do, and to approach fixing problems carefully. Second, we might have more in common with physical engineering disciplines than you might at first think. Third, always think critically about glib and comforting generalizations.

We’ll return to this topic from a few angles as the semester continues.

Some Logistics

See the course webpage for due dates for assignments and other work. The first assignment goes out today! It’s meant as a review of Python skills, but for some of you it might introduce a couple of new concepts. So that you can engage with this, we’ll be grading the first Python homework for participation and engagement; focus on your learning more than points.

Accommodations

You can read more about my accommodations philosophy in the course missive, but I want to emphasize that I mean it. And, especially right now, I want to make sure that all my students can engage with my courses in a healthy way. If that is ever not the case, reach out. I’ll listen.

Notes on Instructor Wrangling

From experience, I know that a bit of radical honesty from an instructor can be useful. So:

I am fully vaccinated, and yet will still sometimes wear a mask. I hope you’re wearing one during shopping period, just to help limit exposure since we’ve all recently been traveling. Today I’m wearing a mask not just for that reason, but because I have a cold! One of the things I hope we learn from the pandemic here in the U.S. is being considerate of others when we’re ill. (If you’re sick, please wear a mask – or don’t come to class. Being considerate of others won’t be penalized.)

I can sometimes have hearing problems in certain classrooms. (This is a hidden reason why I tend to bounce around and get close to people who are asking questions; I’m not just excited to hear you, but I also just want to hear you!) Please don’t be shy about speaking up, and be patient if I ask you to repeat yourself.

I have ADHD. To keep myself from getting too off-track or over time in lectures and office hours, I will sometimes need to cut off great questions or fun discussions. Please forgive me if I do this, and never hold back questions because of it. My lectures (even in a large class) run on interaction, and so I try to strike a balance. And finally, I’m not ignoring your email; please ping me if I don’t answer in a reasonable amount of time—if you want to schedule a meeting, suggest a concrete set of times.

Any questions? Worries? Goals?

I like to spend some time learning about your goals, needs, and worries about the class. Since I don’t know what you’re going to say yet, I can’t put anything in the notes! Instead, I’ll be handing out note cards in person, and I have also prepared a Google form. In the future, I’ll use only note cards for this, but today I want to include those who are unable to make it in person.

Any remaining time in class will be devoted to Q&A about the course.