HW 5: BSTs and Sorting
- Due: Tue, Nov 21 2023 at 9:00 PM EST
- Released: Wed, Nov 15 2023
The goal of this assignment is to practice working with the Binary Search Tree data structure and with sorting algorithms. In particular, this assignment will reinforce the intertwined nature of algorithms with data structures by having you implement a sorting algorithm with a tree traversal.
You will also get some practice with random testing, and set up for a discussion after November break.
Setup
Create a new folder in VSCode. Copy the BST starter code below into a file named bst.py. You should also create a file named tree_sort.py, which is outlined later in the handout. Please use this implementation, as it will ease our grading and reduce other potential issues.
Staff Assistance
Your friendly staff are here to help with this assignment! You can find our hours schedule here.
The Assignment
Over the course of his career, Tim has given tens of thousands of talks. He tasks you with the responsibility of looking for a way to sort and search for any one of his talks. Immediately, you think of the insertion sorting algorithm and Binary Search Trees (BSTs) that you just learned about in CS0112!
Binary Search Trees
Here is a lightly cleaned up version of the Binary Search Tree implementation we ended up with in class, which you should copy into a file called bst.py
in your new project.
Note that this implementation is more robust in two ways:
- it handles duplicate keys (e.g., the integer
3
might be stored multiple times in the BST); and - it returns the value found (if any) rather than just a boolean (to better support the extension we talked about in class, where the BST stores records, rather than just raw values).
Tree Sort
Copy the following stencil into a file called tree_sort.py
:
In tree_sort.py
, fill in the tree_sort
function which takes in a list and returns a sorted list. Your function must use the tree sort algorithm in two stages:
convert_to_tree
: Create an empty BST and then iterate through the input list, inserting each element one-by-one into the new BST.traversal
: Perform an in-order traversal of the BST (following the algorithm below), inserting every element of the BST one-by-one into a new list.- Return the new list.
Do not use any pre-defined sorting functions. Do not worry about balancing the tree; it may very well be that our implementation will produce an unbalanced tree; that isn’t the focus of this assignment.
To convert your binary search tree into a sorted list, you should perform an in-order traversal of it. This means that when you traverse a node, you first recursively traverse its left child, then visit the node itself, then recursively traverse its right child. Here is pseudocode to follow, based on what we covered in class:
- Create a new empty list.
- If there is a left child, recursively call
traversal
on the left child, and append its result to the list. - If the current node is non-empty, append its value to the list.
- If there is a right child, recursively call
traversal
on it, and append its result to the list. - Return the list.
Unit testing
You should write tests for your tree_sort
function in test_tree_sort.py
as you have for other assignments in this class.
Random Testing
Inspired by the code from Monday’s lecture, create a random testing function for your treesort implementation. Use the code of any sorting algorithm as a baseline for correctness (in class, we used merge sort, but e.g. insertion sort would also be OK). Invoke this random tester from a separate function called test_tree_sort_randomly
in test_tree_sort.py
.
README
In your README.txt
, please include answers to the following questions. Note that there is a third question below, specific to this assignment!
- Did you discuss this assignment with any students? Please list their cs logins.
- How many late days are you using on this assignment?
- Suppose that you were sorting records (i.e., instances of a dataclass with multiple fields, only one of which is the key to be sorted over), rather than just integer keys. What issues might arise in your random testing approach?
For the third question, think carefully about what new complexities records, rather than just values, add. Suppose you were sorting (say) a list of records about people by their ages in ascending order. Does this affect your random testing in any way?
(Be prepared to follow up with a discussion after break.)
Handin
Hand in the following files to Gradescope when submitting the assignment:
bst.py
tree_sort.py
test_tree_sort.py
README.txt