EID 151: Programming Languages

Spring 2004

Monday 3:00 - 5:00 Rm 521, Thursday 4:00 - 5:00 Rm 621

Click here for main page of course...

Topic #6: Binary Trees

Binary trees, liked linked lists, combine pointers, structures, and dynamic memory allocation to store data when it is unknown by both the programmer and user how much data there will be ahead of time. Each node in a binary tree has two links, one to a left child and one to a right child. The binary tree has a single root, and somewhere the program must store a pointer to the root; this pointer will likely be the only thing stored in the static area of memory or an activation record (depending on whether it is a global or a local); the rest of the memory will come from the heap (allocated with malloc or some other provided function) as is the case with a linked list. A binary tree can be used instead of a linked list whenever it is OK to order nodes based on their value (e.g. sorting of words or numeric data). They can not be used when there is a specific sequential order of nodes that can not be changed (e.g. the digits of a number as in the multiplication example of the last topic).

A linked list generally takes linear time to insert a node or search for a node, and it takes linear time to walk through all the nodes in a linked list. A binary tree takes, on average, only logarithmic time (with a base of 2) to insert a node or search for a node; it still takes linear time to traverse all the nodes in a binary tree in order. This is a major improvement; for example, a sort implemented by inserting nodes into a linked list in order takes O(N^2) time (the same as bubble sort, insertion sort, or selection sort). A sort implemented by inserting nodes into a binary tree takes only O(N * log N) time (the same as merge sort or quicksort). This is an AVERAGE case analysis, it is likely if the data comes in an arbitrary order. The worst case is when the data arrives in an already sorted (or reverse sorted) order, in which case a binary tree behaves the same as a linked list (and takes extra memory, although not a significant amount). There are improvements to binary trees that guarantees logarithmic time for inserts and searches, and if you go on to take a Data Structures and/or Algorithms course you will probably learn about this.

The typical way to order nodes in a binary is is as follows: The first node becomes the root no matter what. Whenever a new node is created, you start at the root, and if the value is less than the value at the root you move to the left child of the root, and if the data is greater than or equal to the root, you move to the right child. You repeat this process until you hit a NULL, and then, in the position of this NULL, you insert a new leaf (a leaf is a node without any children). It is also possible to store counts with each node so that repeated values do not lead to new nodes; this is the case with the word counting program, and in class we have seen an example of this program with both linked lists and binary trees.

Once a binary tree exists, it is surprisingly simple to traverse all of the nodes of a binary tree in order in linear time. Since you know that all nodes in the left subtree of a node have values less than the current node, and all nodes in the right subtree of a node have values greater than or equal to the current node, to traverse all nodes in order, starting at a given node, you simply traverse the nodes of the left subtree in order, then visit the current node, then traverse the right subtree in order. This leads to a simple, recursive inorder traversal routine that can be implemented with a few lines of code.