Binary Search Trees
CS 66: Introduction to Computer Science II
References for this lecture
Problem Solving with Algorithms and Data Structures using Python
Sections 7.2 - 7.6, 7.11 - 7.14: https://runestone.academy/runestone/books/published/pythonds/Trees/toctree.html
Trees
A tree data structure is a collection of nodes that express a hierarchical relationship.
You can think of it like a linked list, but each node may be linked to more than one other node.
Example: A tree representing an animal taxonomy
Vocabulary
Node: the circles - may contain a key and other information
Edge: arrow representing relationship between two nodes
Parent/Child: the edge goes out from the parent node and into the child node. (Mammal is the parent of Carnivora, Carnivora is the child of Mammal)
Root: node with no parent (Animalia)
Leaf: nodee with no children (Human, Chimpanzee, etc.)
Example: A tree representing the folder structure on a computer
Example: Natural langauge parse tree
Parse tree for the sentence “Homer hit Bart”
Example: Expression Parse Tree
Parse tre for the expression $((7+3)*(5−2))$
Binary Search Tree
A Binary Search Tree is a tree that allows a search operation to be done similar to a binary search.
Like hash tables, they are commonly used to implement the set and map abstract data types.
We will work through the implementation of a set - remember that a set is a collection of values with no duplicates and no implied ordering.
Set operations: add, search, remove
Binary Search Tree Nodes
BST Nodes have at most 2 children, called left child and right child
Nodes in left subtree have value less than this node
Nodes in right subtree have value more than this node
This is true for every node!
Designing the BST Node
Here’s what our Linked List node looked like:
class Node:
def __init__(self,initdata):
self.data = initdata
self.next = None
def getData(self):
return self.data
def getNext(self):
return self.next
def setData(self,newdata):
self.data = newdata
def setNext(self,newnext):
self.next = newnext
Group Activity Problem 1
Design a node for a binary search tree
class BSTNode:
def __init__(self,initdata):
#fill in the rest
#add any methods that you think will be helpful
#we could create a new node like this
mynode = BSTNode(42)
Adding items to the BST
BSTs allow for good use of recursion
We can create an add method right inside the BST Node class
Here’s the pseudocode
- if the new item is less than this node’s value
- if this node doesn’t have a left child, create a new BST node and make it this node’s left child
- otherwise, recursively add the item to the left subtree
- if the new item is greater than this node’s value
- if this node doesn’t have a right child, create a new BST node and make it this node’s right child
- otherwise, recursively add the item to the right subtree
Group Activity Problem 2
Implement the add method in the BST Node class using this algorithm.
Discuss: Is it possible to create an empty tree with the class we’ve written?
#put this method into BSTNode class
def add(self,new_val):
#implement the code here
Group Activity Problem 3
Discuss: Is it possible to create an empty tree with the class we’ve written?
Creating the BSTSet Class
Let’s now create a class to represent a whole tree made up of BST Nodes
- all of the real work is being done by the
BSTNodeclass - we need to keep track of the root node
- we need to write methods that call the
BSTNodemethods
class BSTSet:
def __init__(self):
self.__root = None #start with an empty tree!
def add(self,new_val):
if self.__root == None:
self.__root = BSTNode(new_val)
else:
self.__root.add(new_val)
myset = BSTSet()
myset.add(4)
myset.add(7)
myset.add(2)
myset.add(9)
Group Activity Problem 4
Implement a search method in the BSTNode class - again use recursion.
Write the associated method in BSTSet (suggestion: call it __contains__ in BSTSet so that you can use it with the in operator.
myset = BSTSet()
myset.add(4)
myset.add(7)
myset.add(2)
myset.add(9)
print( 7 in myset ) #should be True
print( 1 in myset ) #should be False
Group Activity Problem 5
What would the tree look like in a worst-case search? So then what is the computational complexity of your search? What about add?
After you a good guess, check out https://runestone.academy/ns/books/published/pythonds/Trees/SearchTreeAnalysis.html to see if you agree with the book.
Removing Items
Removing nodes is the most difficult part.
Three cases to consider:
- The node has no children
- set the reference to it from the parent to None
- The node has 1 child
- set the reference from the parent to the 1 child
- The node has 2 children
- replacee this node’s value with that of its successor
- delete the node that the successor value came from
Other Considerations
- the code will look different depending on whether the node being removed is itself a left child, a right child, or the root
- we haven’t been keeping track of the number of items - how could wee do that
- we might be able to make a BST to implement an ordered list ADT (defind here: https://runestone.academy/ns/books/published/pythonds/BasicDS/TheOrderedListAbstractDataType.html ).
- The main difficulty is implementing an index method - can you find the number of items in the tree that are less than a given item?