Trees
Welcome!
This page is the main page for the Trees Section. This section has chapters which cover:
- Basic Trees
- Recursive Trees
- Tries
- Binary Trees
This page is the main page for the Trees Section. This section has chapters which cover:
This page is the main page for Introduction to Trees
For the next data structure in the course, we will cover trees, which are used to show hierarchical data. Trees can have many shapes and sizes and there is a wide variety of data that can be organized using them. Real world data that is appropriate for trees can include: family trees, management structures, file systems, biological classifications, anatomical structures and much more.
We can look at an example of a tree and the relationships they can show. Consider this file tree; it has folders and files in folders.
If we wanted to access the file elm.txt
, we would have to follow this file path: Foo/B/Q/E/elm.txt
. We can similarly store the file structure as a tree like we below. As before, if we wanted to get to the file elm.txt
we would navigate the tree in the order: Foo -> B -> Q -> E -> elm.txt
. As mentioned before, trees can be used on very diverse data sets; they are not limited to file trees!
In the last module we talked about strings which are a linear data structure. To be explicit, this means that the elements in a string form a line of characters. A tree, by contrast, is a hierarchal structure which is utilized best in multidimensional data. Going back to our file tree example, folders are not limited to just one file, there can be multiple files contained in a single folder- thus making it multidimensional.
Consider the string “abc123”; this is a linear piece of data where there is exactly one character after another. We can use trees to show linear data as well.
While trees can be used for linear data, it would be excessive and inefficient to implement them for single strings. In an upcoming module, we will see how we can use trees to represent any number of strings! For example, this tree below contains 7 words: ‘a’, ‘an’, ‘and’, ‘ant’, ‘any’, ‘all’, and ‘alp’.
In the next sections, we will discuss the properties of a tree data structure and how we would design one ourselves. Once we have a good understanding of trees and the properties of trees, we will implement our own.
To get ourselves comfortable in working with trees, we will outline some standard vocabulary. Throughout this section, we will use the following tree as a guiding example for visualizing the definitions.
Node
- the general term for a structure which contains an item, such as a character or even another data structure.Edge
- the connection between two nodes. In a tree, the edge will be pointing in a downward direction.
This tree has five edges and six nodes. There is no limit to the number of nodes in a tree. The only stipulation is that the tree is fully connected
. This means that there cannot be disjoint portions of the tree. We will look at examples in the next section.
A rule of thumb for discerning trees is this: if you imagine holding the tree up by the root and gravity took effect, then all edges must be pointing downward. If an edge is pointing upward, we will have a cycle
within our structure so it will not be a tree.
Root
- the topmost node of the tree
To be a tree, there must be exactly one root. Having multiple roots, will result in a cycle
or a tree that is not fully connected
. In short, a cycle
is a loop in the tree.
Parent
- a node with an edge that connects to another node further from the root. We can also define the root of a tree with respect to this definition; Root
: a node with no parent.Child
- a node with an edge that connects to another node closer to the root.
cycle
will occur. If there is a node without a parent node, then this is necessarily the root node. There is no limit to the number of child nodes a parent node can have, but to be a parent node, the node must have at least one child node.Leaf
- a node with no children.
This tree has four leaves. There is no limit to how many leaves can be in a tree.Degree
Degree of a node
- the number of children a node has. The degree of a leaf is zero.Degree of a tree
- the number of children the root of the tree has.
The degree of the nodes are shown as the values on the nodes in this example. The degree of the tree is equal to the degree of the root. Thus, the degree for this tree is 2.tree
is defined recursively. This means that each child of the root is the root of another tree and the children of those are roots to trees as well. Again, this is a recursive definition so it will continue to the leaves. The leaves are also trees with just a single node as the root.Trees can come in many shapes and sizes. There are, however some constraints to making a valid tree.
Any combination of the following represents a valid tree:
Below are some examples of invalid trees.
A cycle Again, cycles are essentially loops that occur in our tree. In this example, we see that our leaf has two parents. One way to determine whether your data structure has a cycle is if there is more than one way to get from the root to any node.
A cycle Here we can see another cycle. In this case, the node immediately after the root has two parents, which is a clue that a cycle exists. Another test
Two Roots Trees must have a single root. In this instance, it may look like we have a tree with two roots. Working through this, we also see that the node in the center has two parents.
Two Trees This example would be considered two trees, not a tree with two parts. In this figure, we have two fully connected components. Since they are not connected to each other, this is not a single tree.
Along with understanding how trees work, we want to also be able to implement a tree of our own. We will now outline key components of a tree class.
Recall that trees are defined recursively so we can build them from the leaves up where each leaf is a tree itself. Each tree will have three properties: the item it contains as an object, its parent node of type MyTree, and its children as a list of MyTrees. Upon instantiation of a new MyTree, we will set the item value and initialize the parent node to None and the children to an empty list of type MyTree.
Suppose that we wanted to construct the following tree. We would start by initializing each node as a tree with no parent, no children, and the item in this instance would be the characters. Then we build it up level by level by add the appropriate children to the respective parent.
Disclaimer: This implementation will not prevent all cycles. In the next module, we will introduce steps to prevent cycles and maintain true tree structures.
In this method, we will take a value as input and then check if that value is the item of a child of the current node. If the value is not the item for any of the node’s children then we should return none.
function FINDCHILD(VALUE)
FOR CHILD in CHILDREN
IF CHILD's ITEM is VALUE
return CHILD
return NONE
end function
Each of these will be rather straight forward; children, item, and parent are all attributes of our node, so we can have a getter function that returns the respective values. The slightly more involved task will be getting the degree. Recall that the degree of a node is equal to the number of children. Thus, we can simply count the number of children and return this number for the degree.
We will have two functions to check the node type: one to determine if the node is a leaf and one to determine if it is a root. The definition of a leaf is a node that has no children. Thus, to check if a node is a leaf, we can simply check if the number of children is equal to zero. Similarly, since the definition of a root is a node with no parent, we can check that the parent attribute of the node is None.
When we wish to add a child, we must fisrt make sure we are able to add the child.
MyTree
We will return true if the child was successfully added and false otherwise while raising the appropriate errors.
function ADDCHILD(CHILD)
IF CHILD has PARENT
throw exception
IF CHILD is CHILD of PARENT
return FALSE
ELSE
append CHILD to PARENT's children
set CHILD's parent to PARENT
return TRUE
end function
As an example, lets walk through the process of building the tree above:
a
with item ‘A’b
with item ‘B’c
with item ‘C’d
with item ‘D’e
with item ‘E’f
with item ‘F’g
with item ‘G’h
with item ‘H’i
with item ‘I’g
to tree d
h
to tree d
i
to tree d
e
to tree b
f
to tree b
Once we have completed that, visually, we would have the tree above and in code we would have:
a
with parent_node = None, item = ‘A’, children = {b
,c
,d
}b
with parent_node = a
, item = ‘B’, children = {e
,f
}c
with parent_node = a
, item = ‘C’, children = { }d
with parent_node = a
, item = ‘D’, children = {g
,h
,i
}e
with parent_node = b
, item = ‘E’, children = { }f
with parent_node = b
, item = ‘F’, children = { }g
with parent_node = d
, item = ‘G’, children = { }h
with parent_node = d
, item = ‘H’, children = { }i
with parent_node = d
, item = ‘I’, children = { }Note: When adding a child we must currently be at the node we want to be the parent. Much like when you want to add a file to a folder, you must specify exactly where you want it. If you don’t, this could result in a wayward child.
In the case of removing a child, we first need to check that the child we are attempting to remove is an instance of MyTree
. We will return true if we successfully remove the child and false otherwise.
function REMOVECHILD(CHILD)
IF CHILD in PARENT'S children
REMOVE CHILD from PARENT's children
SET CHILD's PARENT to NONE
return TRUE
ELSE
return FALSE
end function
As with adding a child, we need to ensure that we are in the ‘right place’ when attempting to remove a child. When removing a child, we are not ’erasing’ it, we are just cutting the tie from parent to child and child to parent. Consider removing d
from a
. Visually, we would have two disjoint trees, shown below:
In code, we would have:
a
with parent_node = None, item = ‘A’, children = {b
,c
}b
with parent_node = a
, item = ‘B’, children = {e
,f
}c
with parent_node = a
, item = ‘C’, children = { }d
with parent_node = None, item = ‘D’, children = {g
,h
,i
}e
with parent_node = b
, item = ‘E’, children = { }f
with parent_node = b
, item = ‘F’, children = { }g
with parent_node = d
, item = ‘G’, children = { }h
with parent_node = d
, item = ‘H’, children = { }i
with parent_node = d
, item = ‘I’, children = { }In this module we have introduce vocabulary related to trees and what makes a tree a tree. To recap, we have introduced the following:
Child
- a node with an edge that connects to another node closer to the root.Degree
Degree of a node
- the number of children a node has. The degree of a leaf is zero.Degree of a tree
- the number of children the root of the tree has.Edge
- connection between two nodes. In a tree, the edge will be pointing in a downward direction.Leaf
- a node with no children.Node
- the general term for a structure which contains an item, such as a character or even another data structure.Parent
- a node with an edge that connects to another node further from the root. We can also define the root of a tree with respect to this definition;Root
- the topmost node of the tree; a node with no parent.Now we will work on creating our own implementation of a tree. These definitions will serve as a resource to us when we need refreshing on meanings; feel free to refer back to them as needed.
This page is the main page for Tree Traversal
In the last module, we covered the underlying vocabulary of trees and how we can implement our own tree. To recall, we covered: node, edge, root, leaf, parent, child, and degree.
For this module we will expand on trees and gain a better understanding of how powerful trees can be. As before, we will use the same tree throughout the module for a guiding visual example.
Many of the terms used in trees relate to terms used in family trees. Having this in mind can help us to better understand some of the terminology involved with abstract trees. Here we have a sample family tree.
Ancestor
- The ancestors of a node are those reached from child to parent relationships. We can think of this as our parents and our parent’s parents, and so on.
Descendant
- The descendants of a node are those reached from parent to child relationships. We can think of this as our children and our children’s children and so on.
Siblings
- Nodes which share the same parent
A recursive program is broken into two parts:
In principle, the recursive case breaks the problem down into smaller portions until we reach the base case. Recursion presents itself in many ways when dealing with trees.
Trees are defined recursively with the base case being a single node. Then we recursively build the tree up. With this basis for our trees, we can define many properties using recursion rather effectively.
We can describe the sizes of trees and position of nodes using different terminology, like level, depth, and height.
Level
- The level of a node characterizes the distance between the node and the root. The root of the tree is considered level 1. As you move away from the tree, the level increases by one.
Depth
- The depth of a node is its distance to the root. Thus, the root has depth zero. Level
and depth
are related in that: level = 1 + depth
.
Height of a Node
- The height of a node is the longest path to a leaf descendant. The height of a leaf is zero.
Height of a Tree
- The height of a tree is equal to the height of the root.
When working with multidimensional data structures, we also need to consider how they would be stored in a linear manner. Remember, pieces of data in computers are linear sequences of binary digits. As a result, we need a standard way of storing trees as a linear structure.
Path
- a path is a sequence of nodes and edges, which connect a node with its descendant. We can look at some paths in the tree above:
Q
to O
: QRO
Traversal
is a general term we use to describe going through a tree. The following traversals are defined recursively.
Pre
refers to the root, meaning the root goes before the children.QWYUERIOPTA
Post
refers to the root, meaning the root goes after the children.YUWEIOPRATQ
When we talk about traversals for general trees we have used the phrase ’the traversal could result in’. We would like to expand on why ‘could’ is used here. Each of these general trees are the same but their traversals could be different. The key concept in this is that for a general tree, the children are an unordered set of nodes; they do not have a defined or fixed order. The relationships that are fixed are the parent/child relationships.
Tree | Preorder | Postorder |
---|---|---|
Tree 1 | QWYUERIOPTA |
YUWEIOPRATQ |
Tree 2 | QETARIOPWUY |
EATIOPRUYWQ |
Tree 3 | QROPITAEWUY |
OPIRATEUYWQ |
Again, we want to be able to implement a working version of a tree. From the last module, we had functions to add children, remove children, get attributes, and instantiate MyTree. We will now build upon that implementation to create a true tree.
A recursive program is broken into two parts:
Recall that in the previous module, we were not yet able to enforce the no cycle rule. We will now enforce this and add other tree functionality.
Disclaimer: In the previous module we had a disclaimer that stated our implementation would not prevent cycles. The following functions and properties will implement recursion. Thus, we can maintain legal tree structures!
In the first module, we discussed how we can define trees recursively, meaning a tree consists of trees. We looked at the following example. Each red dashed line represented a distinct tree, thus we had five trees within the largest tree making six trees in total.
We will use our existing implementation from the first module. Now to make our tree recursive, we will include more getter functions as well as functions for traversals and defining node relationships.
We can define each of these recursively.
YouTube VideoDepth
- The depth of a node is its distance to the root. Thus, the root has depth zero.We can define the depth of a node recursively:
function GETDEPTH()
if ROOT
return 0
else
return 1 + PARENT.GETDEPTH()
end function
Height of a Node
- The height of a node is the longest path to a leaf descendant. The height of a leaf is zero.We can define the height of a node recursively:
function GETHEIGHT()
if LEAF
return 0
else
MAX = 0
for CHILD in CHILDREN
CURR_HEIGHT = CHILD.GETHEIGHT()
if CURR_HEIGHT > MAX
MAX = CURR_HEIGHT
return 1 + MAX
end function
Root
- the topmost node of the tree; a node with no parent.We can define returning the root recursively:
function GETROOT()
if ISROOT()
return this tree
else
return PARENT.GETROOT()
end function
We define the size of a tree as the total number of children.
function GETSIZE()
SIZE = 1
for CHILD in CHILDREN
SIZE += CHILD.GETSIZE()
return SIZE
end function
To find a value within our tree, we will traverse down a branch as far as we can until we find the value. This will return the tree that has the value as the root.
function FIND(VALUE)
if ITEM is VALUE
return this node
for CHILD in CHILDREN
FOUND = CHILD.FIND(VALUE)
if FOUND is not NONE
return FOUND
return NONE
end function
We can determine many relationships within the tree. For example, given a node is it an ancestor of another node, a descendant, or a sibling?
YouTube VideoFor this function, we are asking: is this node an ancestor of the current instance? In this implementation, we will start at our instance and work down through the tree trying to find the node in question. With that in mind, we can define this process recursively:
function ISANCESTOR(TREE)
if at TREE
return true
else if at LEAF
return false
else
for CHILD in CHILDREN
FOUND = CHILD.ISANCESTOR(TREE)
if FOUND
return true
return false
end function
For this function, we are asking: is this node a descendant of the current instance? In this implementation, we will start at our instance and work up through the tree trying to find the node in question. With that in mind, we can define this process recursively:
function ISDESCENDANT(TREE)
if at TREE
return true
else if at ROOT
return false
else
return PARENT.ISDESCENDANT(TREE)
end function
For this function, we are asking: is this node a sibling of the current instance? To determine this, we can get the parent of the current instance and then get the parents children. Finally, we check if the node in question is in that set of children.
function ISSIBLING(TREE)
if TREE in PARENT's CHILDREN
return true
else
return false
end function
In any tree, we can say that the root is a common ancestor to all of the nodes. We would like to get more information about the common ancestry of two nodes. For this function, we are asking: which node is the first place where this instance and the input node’s ancestries meet? Similar to our ISDESCENDANT, we will work our way up the tree to find the point where they meet
function LOWESTANCESTOR(TREE)
if at TREE
return TREE
else if ISANCESTOR(TREE)
return instance
else if at ROOT
return NONE
else
return PARENT.LOWESTANCESTOR(TREE)
end function
This function will generate the path which goes from the root to the current instance.
function PATHFROMROOT(PATH)
if NOT ROOT
PARENT.PATHFROMROOT(PATH)
append ITEM to PATH
end function
In this module we have talked about two traversals: preorder and postorder. Both of these are defined recursively and the prefix refers to the order of the root.
In a preorder traversal, first we access the root and then run the preorder traversal on the children.
function PREORDER(RESULT)
append ITEM to RESULT
FOR CHILD in CHILDREN
CHILD.PREORDER(RESULT)
end function
In a postorder traversal, first we run the postorder traversal on the children then we access the root.
function POSTORDER(RESULT)
FOR CHILD in CHILDREN
CHILD.POSTORDER(RESULT)
append ITEM to RESULT
end function
In this section, we discussed more terminology related to trees as well as tree traversals. To recap the new vocabulary:
Ancestor
- The ancestors of a node are those reached from child to parent relationships. We can think of this as our parents and the parents of our parents, and so on.Depth
- The depth of a node is its distance to the root. Thus, the root has depth zero. Level
and depth
are related in that: level = 1 + depth
.Descendant
- The descendants of a node are those reached from parent to child relationships. We can think of this as our children and our children’s children and so on.Height of a Node
- The height of a node is the longest path to a leaf descendant. The height of a leaf is zero.Height of a Tree
- The height of a tree is equal to the height of the root.Level
- The level of a node characterizes the distance the node is from the root. The root of the tree is considered level 1. As you move away from the tree, the level increases by one.Path
- a sequence of nodes and edges which connect a node with its descendant.Siblings
- Nodes which share the same parentTraversal
is a general term we use to describe going through a tree. The following traversals are defined recursively.
This page is the main page for Tries
Recall that in the beginning of our discussions about trees, we looked at a small tree which contained seven strings as motivation for trees. This was a small example of a trie (pronounced ’try’) which is a type of tree that can represent sets of words.
Tries can be used for a variety of tasks ranging from leisurely games to accessibility applications. One example is ‘Boggle’ where players have a set of random letters and try to make as many words as possible. To code this game, we could create a vocabulary with a trie then traverse it to determine if players have played legal words. We can also use tries to provide better typing accessibility. Users could type a few letters of a word and our code could traverse the trie and suggest what letters or words they may be trying to enter.
A trie is a type of tree with some special characteristics. First it must follow the guidelines of being a tree:
The special characteristics for tries are:
In this course, we will display nodes with two circles as a convention to show which nodes are the end of words. Looking at this small trie as an example, we can determine which words are contained in our trie.
We start at the root, which will typically be an empty string, and traverse to a double lined node. "" -> a -> l -> l
. Thus, the word ‘all’ is contained in our trie. Words within our tries do not have to end at leaves. For example, we can traverse "" -> a
for the word ‘a’. We say this trie ‘contains’ seven words: ‘a’, ‘an’, ‘and’, ‘ant’, ‘any’, ‘all’, and ‘alp’.
Let’s look at another example of a trie. Here we have a larger trie. Think about how many words are captured by the tree; click the tree to see how many!
While the ‘a’, ‘at’, and ‘bee’ are words in the English language, they are not recognized by our trie. Depending on what the user intended, this could be by design. When we build our tries, users will input words that are valid for their vocabulary. Tries are not limited to the English language and can be created for any vocabulary.
To implement our own trie, we will build off of MyTree that we built recursively. We will add an attribute to our tree to reinforce which nodes are words and which ones are not.
We have the existing attributes of MyTree: parent, children, and item. For MyTrie, we introduce the boolean attribute is_word
to delineate if our trie is a word.
To add a word to our trie, we traverse through the trie letter by letter. We can define this recursively.
function ADDWORD(WORD)
if WORD length is 0
if already a word
return false
else
set is_word to true
return true
else
FIRST = first character of WORD
REMAIN = remainder of WORD
CHILD = FINDCHILD(FIRST)
if CHILD is NONE
NODE = new MyTrie with item equal FIRST
insert NODE into our existing trie
CHILD = NODE
return CHILD.ADDWORD(REMAIN)
end function
Similar to adding a word, we traverse our trie letter by letter. Once we get to the end of the word set is_word
to false. If the word ends at a leaf, we will remove the leaf (then if the second to last character is a leaf, we remove the leaf and so on). If the word does not end in a leaf, meaning another word uses that node, we will not remove the node.
is_word
is false and it is a leaf, remove the node.function REMOVEWORD(WORD)
if WORD length is 0
if already not a word
return false
else
set is_word to false
return true
else
FIRST = first character of WORD
REMAIN = remainder of WORD
CHILD = FINDCHILD(FIRST)
if CHILD is NONE
return false
else
RET = CHILD.REMOVEWORD(REMAIN)
if CHILD is not a word AND CHILD is a leaf
REMOVECHILD(CHILD)
return RET
end function
Again, we will traverse the trie letter by letter. Once we get to the last letter, we can return that nodes is_word
attribute. There is a chance that somewhere in our word, the letter is not a child of the previous node. If that is the case, then we return false.
function CONTAINSWORD(WORD)
if WORD length is 0
return `is_word`
else
FIRST = first character of WORD
REMAIN = remainder of WORD
CHILD = FINDCHILD(FIRST)
if CHILD is NONE
return false
else
return CHILD.CONTAINSWORD(REMAIN)
end function
For this function, we want to get the total number of words that are contained within our trie. We will fan out through all of the children and count all of the nodes that have their is_word
attribute equal to true.
function WORDCOUNT()
COUNT = 0
if is_word
COUNT = 1
for CHILD in CHILDREN
COUNT += CHILD.WORDCOUNT()
return COUNT
end function
Next, we want to get the longest word contained in our trie. To do this, we will recurse each child and find the maximum length of the child.
function MAXWORD()
if LEAF and is_word
return 0
else
MAX = -1
for CHILD in CHILDREN
COUNT = CHILD.MAXWORD()
if COUNT greater than MAX
MAX = COUNT
return MAX + 1
end function
This function will act as an auto-complete utility of sorts. A user will input a string of characters and we will return all of the possible words that are contained in our trie. This will happen in two phases. First, we traverse the trie to get to the end of the input string (lines 1-12). The second portion then gets all of the words that are contained after that point in our trie (lines 14-21).
function COMPLETIONS(WORD)
1. if WORD length greater than 0
2. FIRST = first character of WORD
3. REMAIN = remainder of WORD
4. CHILD = FINDCHILD(FIRST)
5. if CHILD is none
6. return []
7. else
8. COMPLETES = CHILD.COMPLETIONS(REMAIN)
9. OUTPUT = []
10. for COM in COMPLETES
11. append CHILD.item + COM to OUTPUT
12. return OUTPUT
13. else
14. OUTPUT = []
15. if is_word
16. append ITEM to OUTPUT
17. for CHILD in CHILDREN
18. COMPLETES = CHILD.COMPLETIONS("")
19. for COM in COMPLETES
20. append CHILD.item + COM to OUTPUT
21. reutrn OUTPUT
end function
This page is the main page for Binary Trees
A binary tree is a type of tree with some special conditions. First, it must follow the guidelines of being a tree:
The special conditions that we impose on binary trees are the following:
To reinforce these concepts, we will look at examples of binary trees and examples that are not binary trees.
This is a valid binary tree. We have a single node, the root, with no children. As with general trees, binary trees are built recursively. Thus, each node and its child(ren) are trees themselves.
This is also a valid binary tree. All of the left children are less than their parent. The node with item ‘10’ is also in the correct position as it is less than 12, 13, and 14 but greater than 9.
We have the same nodes but our root is now 12 whereas before it was 14. This is also a valid binary tree.
Here we have an example of a binary tree with alphabetical items. As long as we have items which have a predefined order, we can organize them using a binary tree.
We may be inclined to say that this is a binary tree: each node has 0, 1, or 2 children and amongst children and parent nodes, the left child is smaller than the parent and the right child is greater than the parent. However, in binary trees, all of the nodes in the left tree must be smaller than the root and all of the nodes in the right tree must be larger than the root. In this tree, D
is out of place. Node D
is less than node T
but it is also less than node Q
. Thus, node D
must be on the right of node Q
.
In this case, we do not have a binary tree. This does fit all of the criteria for being a tree but not the criteria for a binary tree. Nodes in binary trees can have at most 2 children. Node 30
has three children.
In the first module we discussed two types of traversals: preorder and postorder. Within that discussion, we noted that for general trees, the preorder and postorder traversal may not be unique. This was due to the fact that children nodes are an unordered set.
We are now working with binary trees which have a defined child order. As a result, the preorder and postorder traversals will be unique! These means that for a binary tree when we do a preorder traversal there is exactly one string that is possible. The same applies for postorder traversals as well.
Recall that these were defined as such:
Now for binary trees, we can modify their definitions to be more explicit:
Let’s practice traversals on the following binary tree.
Since we have fixed order on the children, we can introduce another type of traversal: in-order traversal.
In-order Traversal:
Our implementation of binary trees will inherit from our MyTree implementation as binary trees are types of trees. Thus, MyBinaryTree will have the functionality of MyTree in addition to the following.
The binary tree has two attributes
MyBinaryTree
, the item should be less than the item of the parent.MyBinaryTree
, the item should be greater than the item of the parent.Get Size
MyTree
size function. If the tree is empty then we return zero. If the tree is not empty then call the MyTree
size function.Is Empty
To Sorted List
function TOSORTEDLIST()
LIST = []
if there`s LEFTCHILD
LIST = LIST + LEFTCHILD.TOSORTEDLIST
LIST = LIST + ITEM
if there`s RIGHTCHILD
LIST = LIST + RIGHTCHILD.TOSORTEDLIST
return LIST
When inserting children to a binary tree, we must take some special considerations. All of the node items in the left tree must be less than the parent node item and all of the node items in the right tree must be greater than the parent node item.
The general procedure for adding a child is the following:
Suppose that we have the following tree and we want to add a node with item ‘85’. Click the binary tree to see the resulting tree.
function INSERT(VALUE)
if node is empty:
set nodes item to value
else:
if node.ITEM is VALUE
return false
else if node.ITEM > VALUE
LC = node`s left child
if LC is NONE
CHILD = new BINARYTREE with root.ITEM equal VALUE
add CHILD to nodes children
set node.LEFTCHILD equal to CHILD
return true
else
return LC.INSERT(VALUE)
else
RC = node`s right child
if RC is NONE
CHILD = new BINARYTREE with root.ITEM equal VALUE
add CHILD to nodes children
set node.RIGHTCHILD equal to CHILD
return true
else
return RC.INSERT(VALUE)
end function
Removing children is not as straightforward as inserting them. The general procedure for removing a child is to replace that nodes value with its smallest right descendant. First we will traverse the binary tree until we find the node with the value we are trying to remove (lines 18-32 below). Then we have three separate cases, discussed in detail below.
Removing a leaf is the most straightforward. We remove the value from the node and then sever the connection between parent and child. (lines 5-7 below)
Suppose we have this binary tree and we want to remove value 5
. What do you think the resulting binary tree will look like? Click the binary tree to see the result.
When we remove a value from a node that does not have a right child, we cannot replace the value with the smallest right child. In this instance we will instead replace the value with the smallest left child then prune the tree to clean it up. Once we replace the value, we must switch the node’s left child to be the right child in order to maintain proper binary tree structure. (lines 8-13 below)
Suppose we have this binary tree and we want to remove value 4
. What do you think the resulting binary tree will look like? Click the binary tree to see the result.
When we remove a value from a node that has a right child, we can replace the value with the nodes smallest right child. (Lines 14-17 Below)
Suppose we have this binary tree and we want to remove value 10
. What do you think the resulting binary tree will look like? Click the binary tree to see the result.
1. function REMOVE(VALUE)
2. if node is empty:
3. error
4. if node.ITEM is VALUE
5. if node is a leaf
6. set node.ITEM to none
7. return TRUE
8. else if node has no right child
9. node.ITEM = LEFTCHILD.REMOVESMALLEST()
10. prune left-side
11. store left child in right child
12. set left child to none
13. return TRUE
14. else
15. node.ITEM = RIGHTCHILD.REMOVESMALLEST()
16. prune right-side
17. return TRUE
18. else
19. if node.ITEM > VALUE
20. if node has LEFTCHILD
21. SUCCESS = LEFTCHILD.REMOVE(VALUE)
22. prune left-side
23. return SUCCESS
24. else
25. return FALSE
26. else
27. if node has RIGHTCHILD
28. SUCCESS = RIGHTCHILD.REMOVE(VALUE)
29. prune right-side
30. return SUCCESS
31. else
32. return FALSE
33. end function
We use the pruning functions to severe the tie between parent and child nodes.
function PRUNERIGHT()
if RIGHTCHILD has no value
REMOVECHILD(RIGHTCHILD)
set this nodes RIGHTCHILD former RIGHTCHILDs RIGHTCHILD
if RIGHTCHLID is not none
ADDCHILD(RIGHTCHILD)
end function
function PRUNELEFT()
if LEFTCHILD has no value
REMOVECHILD(LEFTCHILD)
set this nodes LEFTCHILD former LEFTCHILDs RIGHTCHILD
if LEFTCHILD is not none
ADDCHILD(LEFTCHILD)
end function
We use the remove smallest function to retrieve the smallest value in the binary tree which will replace our value.
function REMOVESMALLEST()
if node has left child
REPLACEMENT = LEFTCHILD.REMOVESMALLEST
prune left-side
return REPLACEMENT
else
REPLACEMENT = node.ITEM
if node has right child
node.ITEM = RIGHTCHILD.REMOVESMALLEST()
prune right-side
else
node.ITEM = NONE
return REPLACEMENT
end function
While this is a valid binary tree, it is not balanced. Let’s look at the following tree.
We have the same nodes but our root is now 12 whereas before it was 14. This is a valid binary tree. We call this a balanced
binary tree. A balanced binary tree looks visually even amongst the left and right trees in terms of number of nodes.
Note: Balancing is not necessary for a valid binary tree. It is, however, important in terms of time efficiency to have a balanced tree. For example, the number of actions when inserting an element is about the same as the number of levels in the tree. If we tried to add the value 11 into the unbalanced tree, we would traverse 5 nodes. If we tried to add the value 11 in to the balanced tree, we would traverse just 3 nodes.
We believe that balancing binary trees is out of the scope of this course. If you are interested in how we might balance a tree, feel free to check out these videos by Dr. Joshua Weese.
YouTube Video YouTube Video