CS 66: Introduction to Computer Science II
Please fill this out
https://forms.gle/E6FKsKtvDkw1QeQt7
Reminder - your name will be removed before we look at your answers
Problem Solving with Algorithms and Data Structures using Python
Sections 7.11 - 7.14: https://runestone.academy/runestone/books/published/pythonds/Trees/toctree.html
BST Nodes have at most 2 children, called left child and right child
Nodes in left subtree have value less than this node
Nodes in right subtree have value more than this node
This is true for every node!
Here's the code we came up with in class last time
class BSTNode:
def __init__(self,initdata):
self.data = initdata
self.left = None
self.right = None
def __repr__(self):
return repr(self.data)+"("+repr(self.left)+","+repr(self.right)+")"
def __str__(self):
str_rep = ""
if self.left != None:
str_rep += " "+str(self.left)
str_rep += " "+str(self.data)
if self.right != None:
str_rep += " "+str(self.right)
return str_rep
def getData(self):
return self.data
def setData(self,newdata):
self.data = newdata
def getLeft(self):
return self.left
def setLeft(self,newleft):
self.left = newleft
def getRight(self):
return self.right
def setRight(self,newright):
self.right = newright
#put this method into BSTNode class
def add(self,new_val):
#implement the code here
if new_val < self.data:
if self.left == None:
new_node = BSTNode(new_val)
self.setLeft( new_node )
else:
self.getLeft().add(new_val)
elif new_val > self.data:
if self.right == None:
new_node = BSTNode(new_val)
self.setRight( new_node )
else:
self.getRight().add(new_val)
else:
raise Exception("trying to add duplicate "+str(new_val))
class BSTSet:
def __init__(self):
self.__root = None #start with an empty tree!
def add(self,new_val):
if self.__root == None:
self.__root = BSTNode(new_val)
else:
self.__root.add(new_val)
def __repr__(self):
return repr(self.__root)
def __str__(self):
return str(self.__root)
myset = BSTSet()
myset.add(4)
myset.add(7)
myset.add(2)
myset.add(9)
print(repr(myset))
print(str(myset))
4(2(None,None),7(None,9(None,None))) 2 4 7 9
Implement a search method in the BSTNode class - again use recursion.
Write the associated method in BSTSet (suggestion: call it __contains__ in BSTSet so that you can use it with the in operator.
myset = BSTSet()
myset.add(4)
myset.add(7)
myset.add(2)
myset.add(9)
print( 7 in myset ) #should be True
print( 1 in myset ) #should be False
What would the tree look like in a worst-case search? So then what is the computational complexity of your search? What about add?
After you a good guess, check out https://runestone.academy/ns/books/published/pythonds/Trees/SearchTreeAnalysis.html to see if you agree with the book.
Removing nodes is the most difficult part.
Three cases to consider:
In the worst case, a BST could be completely skewed in one direction
All operations: $O(n)$
Inventors: G.M. Adelson-Velskii and E.M. Landis
Idea: balance factor is only allowed to be 0, 1, or -1
Approach: Whenever you add or remove items, if the balance factor is > 1 or < -1, perform rotations to get it back in balance (check all nodes back up to the root)
Result: The tree will bee guaranteed to have a height of at most $1.44*\log_2 N$
add, search, and remove will run in $O(\log N)$ time
In this example we
After left rotation:
Solution
$balanceFactor = -2$ and right child's $balanceFactor=1$, do right rotation on right child then left rotation at this node
$balanceFactor = 2$ and left child's $balanceFactor=-1$, do left rotation on left child then right rotation at this node
Which data structure should you use to implement a Set?
Note: Maps are similar https://runestone.academy/ns/books/published/pythonds/Trees/SummaryofMapADTImplementations.html
Operation |
Sorted List |
Hash Table* |
Binary Search Tree |
AVL Tree |
|---|---|---|---|---|
add |
\(O(n)\) |
\(O(1)\) |
\(O(n)\) |
\(O(\log_2{n})\) |
in (search) |
\(O(\log_2{n})\) |
\(O(1)\) |
\(O(n)\) |
\(O(\log_2{n})\) |
remove |
\(O(n)\) |
\(O(1)\) |
\(O(n)\) |
\(O(\log_2{n})\) |
Is it ever better to use an AVL Tree instead of a Hash Table as the underlying data structure? Are there memory differences? What happens if you are adding and removing items a lot? What happens when the hash table fills up?