University of Phoenix - DAT/305 - Individual: TreesResource: Ch. 6, "Trees", of Data Structures: Abstraction and Design Using Java, Exercises for Section 6.1; Self-Check #3Complete the Self-Check Ques

Chapter 6

Trees

Chapter Objectives

To learn how to use a tree to represent a hierarchical organization of information

To learn how to use recursion to process trees

To understand the different ways of traversing a tree

To understand the difference between binary trees, binary search trees, and heaps

To learn how to implement binary trees, binary search trees, and heaps using linked data structures and arrays

To learn how to use Java 8 Lambda Expressions and Functional Interfaces to simplify coding

To learn how to use a binary search tree to store information so that it can be retrieved in an efficient manner

To learn how to use a Huffman tree to encode characters using fewer bits than ASCII or Unicode, resulting in smaller files and reduced storage requirements

The data organizations you have studied so far are linear in that each element has only one predecessor or successor. Accessing all the elements in sequence is an O(n) process. In this chapter, we begin our discussion of a data organization that is nonlinear or hierarchical: the tree. Instead of having just one successor, a node in a tree can have multiple successors, but it has just one predecessor. A tree in computer science is like a natural tree, which has a single trunk that may split off into two or more main branches. The predecessor of each main branch is the trunk. Each main branch may spawn several secondary branches (successors of the main branches). The predecessor of each secondary branch is a main branch. In computer science, we draw a tree from the top down, so the root of the tree is at the top of the diagram instead of the bottom.

Because trees have a hierarchical structure, we use them to represent hierarchical organizations of information, such as a class hierarchy, a disk directory and its subdirectories (see Figure 6.1), or a family tree. You will see that trees are recursive data structures because they can be defined recursively. For this reason, many of the methods used to process trees are written as recursive methods.

image

FIGURE 6.1 Part of the Programs Directory

This chapter will focus on a restricted tree structure, a binary tree, in which each element has, at most, two successors. You will learn how to use linked data structures and arrays to represent binary trees. You will also learn how to use a special kind of binary tree called a binary search tree to store information (e.g., the words in a dictionary) in an ordered way. Because each element of a binary tree can have two successors, you will see that searching for an item stored in a binary search tree is much more efficient than searching for an item in a linear data structure: (generally O(log n) for a binary tree versus O(n) for a list).

You also will learn about other kinds of binary trees. Expression trees are used to represent arithmetic expressions. The heap is an ordered tree structure that is used as the basis for a very efficient sorting algorithm and for a special kind of queue called the priority queue. The Huffman tree is used for encoding information and compressing files.

Trees

6.1 Tree Terminology and Applications

6.2 Tree Traversals

6.3 Implementing a BinaryTree Class

6.4 Java 8 Lambda Expressions and Functional Interfaces

6.5 Binary Search Trees

Case Study: Writing an Index for a Term Paper

6.6 Heaps and Priority Queues

6.7 Huffman Trees

Case Study: Building a Custom Huffman Tree

6.1 Tree Terminology and Applications

Tree Terminology

We use the same terminology to describe trees in computer science as we do trees in nature. A computer science tree consists of a collection of elements or nodes, with each node linked to its successors. The node at the top of a tree is called its root because computer science trees grow from the top down. The links from a node to its successors are called branches. The successors of a node are called its children. The predecessor of a node is called its parent. Each node in a tree has exactly one parent except for the root node, which has no parent. Nodes that have the same parent are siblings. A node that has no children is a leaf node. Leaf nodes are also known as external nodes, and nonleaf nodes are known as internal nodes.

A generalization of the parent–child relationship is the ancestor–descendant relationship. If node A is the parent of node B, which is the parent of node C, node A is node C's ancestor, and node C is node A's descendant. Sometimes we say that node A and node C are a grandparent and grandchild, respectively. The root node is an ancestor of every other node in a tree, and every other node in a tree is a descendant of the root node.

Figure 6.2 illustrates these features in a tree that stores a collection of words. The branches are the lines connecting a parent to its children. In discussing this tree, we will refer to a node by the string that it stores. For example, we will refer to the node that stores the string "dog" as node dog.

image

FIGURE 6.2 A Tree of Words

A subtree of a node is a tree whose root is a child of that node. For example, the nodes cat and canine and the branch connecting them are a subtree of node dog. The other subtree of node dog is the tree consisting of the single node wolf. The subtree consisting of the single node canine is a subtree of node cat.

The level of a node is a measure of its distance from the root. It is defined recursively as follows:

If node n is the root of tree T, its level is 1.

If node n is not the root of tree T, its level is 1 + the level of its parent.

For the tree in Figure 6.2, node dog is at level 1, nodes cat and wolf are at level 2, and node canine is at level 3. Since nodes are below the root, we sometimes use the term depth as an alternative term for level. The two have the same meaning.

The height of a tree is the number of nodes in the longest path from the root node to a leaf node. The height of the tree in Figure 6.2 is 3 (the longest path goes through the nodes dog, cat, and canine). Another way of saying this is as follows:

If T is empty, its height is 0.

If T is not empty, its height is the maximum depth of its nodes.

An alternate definition of the height of a tree is the number of branches in the longest path from the root node to a leaf node +1.

Binary Trees

The tree in Figure 6.2 is a binary tree. Informally, this is a binary tree because each node has at most two subtrees. A more formal definition for a binary tree follows.

A set of nodes T is a binary tree if either of the following is true:

T is empty.

If T is not empty, its root node has two subtrees, TL and TR, such that TL and TR are binary trees.

We refer to TL as the left subtree and TR as the right subtree. For the tree in Figure 6.2, the right subtree of node cat is empty. The leaf nodes (wolf and canine) have empty left and right subtrees. This is illustrated in Figure 6.3, where the empty subtrees are indicated by the squares. Generally, the empty subtrees are represented by null references, but another value may be chosen. From now on, we will consistently use a null reference and will not draw the squares for the empty subtrees.

image

FIGURE 6.3 A Tree of Words with Null Subtrees Indicated

Some Types of Binary Trees

Next, we discuss three different types of binary trees that are common in computer science.

An Expression Tree

Figure 6.4 shows a binary tree that stores an expression. Each node contains an operator (+, -, *, /, %) or an operand. The expression in Figure 6.4 corresponds to (x + y) * ((a + b) / c). Operands are stored in leaf nodes. Parentheses are not stored in the tree because the tree structure dictates the order of operator evaluation. Operators in nodes at higher levels are evaluated after operators in nodes at lower levels, so the operator * in the root node is evaluated last. If a node contains a binary operator, its left subtree represents the operator's left operand and its right subtree represents the operator's right operand. The left subtree of the root represents the expression x + y, and the right subtree of the root represents the expression (a + b) / c.

image

FIGURE 6.4 Expression Tree

A Huffman Tree

Another use of a binary tree is to represent Huffman codes for characters that might appear in a text file. Unlike ASCII or Unicode encoding, which use the same number of bits to encode each character, a Huffman code uses different numbers of bits to encode the letters. It uses fewer bits for the more common letters (e.g., space, e, a, and t) and more bits for the less common letters (e.g., q, x, and z). On average, using Huffman codes to encode text files should give you files with fewer bits than you would get using other codes. Many programs that compress files use Huffman encoding to generate smaller files in order to save disk space or to reduce the time spent sending the files over the Internet.

Figure 6.5 shows the Huffman encoding tree for an alphabet consisting of the lowercase letters and the space character. All the characters are at leaf nodes. The data stored at nonleaf nodes is not shown. To determine the code for a letter, you form a binary string by tracing the path from the root node to that letter. Each time you go left, append a 0, and each time you go right, append a 1. To reach the space character, you go right three times, so the code is 111. The code for the letter d is 10110 (right, left, right, right, left).

image

FIGURE 6.5 Huffman Code Tree

The two characters shown at level 4 of the tree (space, e) are the most common and, therefore, have the shortest codes (111, 010). The next most common characters (a, o, i, etc.) are at level 5 of the tree.

You can store the code for each letter in an array. For example, the code for the space ' ' would be at position 0, the letter 'a' would be at position 1, and the code for letter 'z' would be at position 26. You can encode each letter in a file by looking up its code in the array.

However, to decode a file of letters and spaces, you walk down the Huffman tree, starting at the root, until you reach a letter and then append that letter to the output text. Once you have reached a letter, go back to the root. Here is an example. The substrings that represent the individual letters are shown in alternate shades of black to help you follow the process. The underscore in the second line represents a space character (code is 111).

10001010011110101010100010101110100011

g o _ e a g l e s

Huffman trees are discussed further in Section 6.7.

A Binary Search Tree

The tree in Figure 6.2 is a binary search tree because, for each node, all words in its left subtree precede the word in that node, and all words in its right subtree follow the word in that node. For example, for the root node dog, all words in its left subtree (cat, canine) precede dog in the dictionary, and all words in its right subtree (wolf) follow dog. Similarly, for the node cat, the word in its left subtree (canine) precedes it. There are no duplicate entries in a binary search tree.

More formally, we define a binary search tree as follows:

A set of nodes T is a binary search tree if either of the following is true:

T is empty.

If T is not empty, its root node has two subtrees, TL and TR, such that TL and TR are binary search trees and the value in the root node of T is greater than all values in TL and is less than all values in TR.

The order relations in a binary search tree expedite searching the tree. A recursive algorithm for searching a binary search tree follows:

1. if the tree is empty

2. Return null (target is not found).

else if the target matches the root node’s data

3. Return the data stored at the root node.

else if the target is less than the root node’s data

4. Return the result of searching the left subtree of the root.

else

5. Return the result of searching the right subtree of the root.

The first two cases are base cases and self-explanatory. In the first recursive case, if the target is less than the root node's data, we search only the left subtree (TL) because all data items in TR are larger than the root node's data and, therefore, larger than the target. Likewise, we execute the second recursive step (search the right subtree) if the target is greater than the root node's data.

Just as with a binary search of an array, each probe into the binary search tree has the potential of eliminating half the elements in the tree. If the binary search tree is relatively balanced (i.e., the depths of the leaves are approximately the same), searching a binary search tree is an O(log n) process, just like a binary search of an ordered array.

What is the advantage of using a binary search tree instead of just storing elements in an array and then sorting it? A binary search tree never has to be sorted because its elements always satisfy the required order relations. When new elements are inserted (or removed), the binary search tree property can be maintained. In contrast, an array must be expanded whenever new elements are added, and it must be compacted whenever elements are removed. Both expanding and contracting involve shifting items and are thus O(n) operations.

Full, Perfect, and Complete Binary Trees

The tree on the left in Figure. 6.6 is called a full binary tree because all nodes have either 2 children or 0 children (the leaf nodes). The tree in the middle is a perfect binary tree, which is defined as a full binary tree of height n(n is 3) with exactly 2n − 1 (7) nodes. The tree on the right is a complete binary tree, which is a perfect binary tree through level n − 1 with some extra leaf nodes at level n (the tree height), all toward the left.

image

FIGURE 6.6 Full, Perfect, and Complete Binary Trees

General Trees

A general tree is a tree that does not have the restriction that each node of a tree has at most two subtrees. So nodes in a general tree can have any number of subtrees. Figure 6.7 shows a general tree that represents a family tree showing the descendants of King William I (the Conqueror) of England.

image

FIGURE 6.7 Family Tree for the Descendants of William I of England

We will not discuss general trees in this chapter. However, it is worth mentioning that a general tree can be represented using a binary tree. Figure 6.8 shows a binary tree representation of the family tree in Figure 6.7. We obtained it by connecting the left branch from a node to the oldest child (if any). Each right branch from a node is connected to the next younger sibling (if any).

image

FIGURE 6.8 Binary Tree Equivalent of King William's Family Tree

The names of the men who became kings are in boldface type. You would expect the eldest son to succeed his father as king; however, this would not be the case if the eldest male died before his father. For example, Robert died before William I, so William II became king instead. Starting with King John (near the bottom of the tree), the eldest son of each king did become the King of England.

EXERCISES FOR SECTION 6.1

SELF-CHECK

1. Draw binary expression trees for the following infix expressions. Your trees should enforce the Java rules for operator evaluation (higher-precedence operators before lower-precedence operators and left associativity).

x / y + a – b * c

(x * a) – y / b * (c + d)

(x + (a * (b – c))/d

2. Using the Huffman tree in Figure 6.5,

Write the binary string for the message “scissors cuts paper”.

Decode the following binary string using the tree in Figure 6.5:

1100010001010001001011101100011111110001101010111101101001

3. For each tree shown below, answer these questions. What is its height? Is it a full tree? Is it a complete tree? Is it a binary search tree? If not, make it a binary search tree.

University of Phoenix - DAT/305 - Individual: TreesResource: Ch. 6, "Trees", of Data Structures: Abstraction and Design Using Java, Exercises for Section 6.1; Self-Check #3Complete the Self-Check Ques 1

4. For the binary trees in Figures 6.2–6.5, indicate whether each tree is full, perfect, complete, or none of the above.

5. Represent the general tree in Figure 6.1 as a binary tree.

6.2 Tree Traversals

Often we want to determine the nodes of a tree and their relationship. We can do this by walking through the tree in a prescribed order and visiting the nodes (processing the information in the nodes) as they are encountered. This process is known as tree traversal. We will discuss three kinds of traversal in this section: inorder, preorder, and postorder. These three methods are characterized by when they visit a node in relation to the nodes in its subtrees (TL and TR).

Preorder: Visit root node, traverse TL, and traverse TR.

Inorder: Traverse TL, visit root node, and traverse TR.

Postorder Traverse TL, traverse TR, and visit root node.

Because trees are recursive data structures, we can write similar recursive algorithms for all three techniques. The difference in the algorithms is whether the root is visited before the children are traversed (pre), in between traversing the left and right children (in), or after the children are traversed (post).

Algorithm for Preorder Traversal

1. if the tree is empty

2. Return.

else

3. Visit the root.

4. Preorder traverse the left subtree.

5. Preorder traverse the right subtree.

Algorithm for Inorder Traversal

1. if the tree is empty

2. Return.

else

3. Inorder traverse the left subtree.

4. Visit the root.

5. Inorder traverse the right subtree.

Algorithm for Postorder Traversal

1. if the tree is empty

2. Return.

else

3. Postorder traverse the left subtree.

4. Postorder traverse the right subtree.

5. Visit the root.

Visualizing Tree Traversals

You can visualize a tree traversal by imagining a mouse that walks along the edge of the tree. If the mouse always keeps the tree to the left (from the mouse's point of view), it will trace the route shown in gray around the tree shown in Figure 6.9. This is known as an Euler tour.

image

FIGURE 6.9 Traversal of a Binary Tree

If we record each node as the mouse first encounters it (indicated by the arrows pointing down in Figure 6.9), we get the following sequence:

a b d g e h c f i j

This is a preorder traversal because the mouse visits each node before traversing its subtrees. The mouse also walks down the left branch (if it exists) of each node before going down the right branch, so the mouse visits a node, traverses its left subtree, and traverses its right subtree.

If we record each node as the mouse returns from traversing its left subtree (indicated by the arrows pointing to the right in Figure 6.9), we get the following sequence:

d g b h e a i f j c

This is an inorder traversal. The mouse traverses the left subtree, visits the root, and then traverses the right subtree. Node d is visited first because it has no left subtree.

If we record each node as the mouse last encounters it (indicated by the arrows pointing up in Figure 6.9), we get the following sequence:

g d h e b i j f c a

This is a postorder traversal because we visit the node after traversing both its subtrees. The mouse traverses the left subtree, traverses the right subtree, and then visits the node.

Traversals of Binary Search Trees and Expression Trees

An inorder traversal of a binary search tree results in the nodes being visited in sequence by increasing data value. For example, for the binary search tree shown earlier in Figure 6.2, the inorder traversal would visit the nodes in the sequence:

canine, cat, dog, wolf

Traversals of expression trees give interesting results. If we perform an inorder traversal of the expression tree first shown in Figure 6.4 and repeated here, we visit the nodes in the sequence x + y* a + b / c. If we insert parentheses where they belong, we get the infix expression

(x + y) * ((a + b) / c)

image

The postorder traversal of this tree would visit the nodes in the sequence

image

which is the postfix form of the expression. To illustrate this, we show the operand–operand–operator groupings under the expression.

The preorder traversal visits the nodes in the sequence

image

which is the prefix form of the expression. To illustrate this, we show the operator–operand–operand groupings under the expression.

EXERCISES FOR SECTION 6.2

SELF-CHECK

For the following trees:

image

If visiting a node displays the integer value stored, show the inorder, preorder, and postorder traversal of each tree.

Repeat Exercise 1 above for the trees in Figure 6.6, redrawn below.

image

Draw an expression tree corresponding to each of the following:

Inorder traversal is x / y + 3 * b / c (Your tree should represent the Java meaning of the expression.)

Postorder traversal is x y z + a b – c * / –

Preorder traversal is * + a – x y / c d

Explain why the statement “Your tree should represent the Java meaning of the expression” was not needed for parts b and c of Exercise 3 above.

6.3 Implementing a BinaryTree Class

In this section, we show how to use linked data structures to represent binary trees and binary tree nodes. We begin by focusing on the structure of a binary tree node.

The Node<E> Class

Just as for a linked list, a node consists of a data part and links (references) to successor nodes. So that we can store any kind of data in a tree node, we will make the data part a reference of type E. Instead of having a single link (reference) to a successor node as in a list, a binary tree node must have links (references) to both its left and right subtrees. Figure 6.10 shows the structure of a binary tree node; Listing 6.1 shows its implementation.

image

FIGURE 6.10 Linked Structure to Represent a Node

LISTING 6.1

Nested Class Node

/** Class to encapsulate a tree node. */

protected static class Node<E> implements Serializable {

// Data Fields

/** The information stored in this node. */

protected E data;

/** Reference to the left child. */

protected Node<E> left;

/** Reference to the right child. */

protected Node<E> right;

// Constructors

/** Construct a node with given data and no children.

@param data The data to store in this node

public Node(E data) {

this.data = data;

left = null;

right = null;

}

// Methods

/** Return a string representation of the node.

@return A string representation of the data fields

public String toString () {

return data.toString();

}

Class Node<E> is nested within class BinaryTree<E>. Note that it is declared protected and its data fields are all protected. Later, we will use the BinaryTree<E> and Node<E> classes as superclasses. By declaring the nested Node<E> class and its data fields protected, we make them accessible in the subclasses of BinaryTree<E> and Node<E>.

The constructor for class Node<E> creates a leaf node (both left and right are null). The toString method for the class just displays the data part of the node.

Both the BinaryTree<E> class and the Node<E> class are declared to implement the Serializable interface. The Serializable interface defines no methods; it is used to provide a marker for classes that can be written to a binary file using the ObjectOutputStream and read using the ObjectInputStream. We clarify what this means later in the section.

The BinaryTree<E> Class

Table 6.1 shows the design of the BinaryTree<E> class. The single data field root references the root node of a BinaryTree<E> object. It has protected visibility because we will need to access it in subclass BinarySearchTree, discussed later in this chapter. In Figure 6.11, we draw the expression tree for ((x + y) * (a / b)) using our Node representation. Each character shown as tree data would be stored in a Character object.

TABLE 6.1 Design of the BinaryTree<E> Class

Data Field Attribute

protected Node<E> root Reference to the root of the tree

Constructor Behavior

public BinaryTree() Constructs an empty binary tree

protected BinaryTree(Node<E> root) Constructs a binary tree with the given node as the root

public BinaryTree(E data, BinaryTree<E> leftTree, BinaryTree<E> rightTree) Constructs a binary tree with the given data at the root and the two given subtrees

Method Behavior

public BinaryTree<E> getLeftSubtree() Returns the left subtree

public BinaryTree<E> getRightSubtree() Returns the right subtree

public E getData() Returns the data in the root

public boolean isLeaf() Returns true if this tree is a leaf, false otherwise

public String toString() Returns a String representation of the tree

public void preOrderTraverse (BiConsumer<E, Integer> consumer) Performs a preorder traversal of the tree. Each node and its depth are passed to the consumer function

public static BinaryTree<E> readBinaryTree(Scanner scan) Constructs a binary tree by reading its data using Scanner scan

image

FIGURE 6.11 Linked Representation of Expression Tree ((x + y) * (a / b))

EXAMPLE 6.1

Assume the tree drawn in Figure 6.11 is referenced by variable bT (type BinaryTree).

bT.root.data references the Character object storing '*'.

bT.root.left references the left subtree of the root (the root node of tree x + y).

bT.root.right references the right subtree of the root (the root node of tree a / b).

bT.root.right.data references the Character object storing '/'.

The class heading and data field declarations follow:

import java.io.*;

/** Class for a binary tree that stores type E objects. */

public class BinaryTree<E> implements Serializable {

// Insert inner class Node<E> here.

// Data Field

/** The root of the binary tree */

protected Node<E> root;

. . .

The Constructors

There are three constructors: a no-parameter constructor, a constructor that creates a tree with a given node as its root, and a constructor that builds a tree from a data value and two trees.

The no-parameter constructor merely sets the data field root to null.

public BinaryTree() {

root = null;

}

The constructor that takes a Node as a parameter is a protected constructor. This is because client classes do not know about the Node class. This constructor can be used only by methods internal to the BinaryTree class and its subclasses.

protected BinaryTree(Node<E> root) {

this.root = root;

}

The third constructor takes three parameters: data to be referenced by the root node and two BinaryTrees that will become its left and right subtrees.

/** Constructs a new binary tree with data in its root leftTree

as its left subtree and rightTree as its right subtree.

public BinaryTree(E data, BinaryTree<E> leftTree,

BinaryTree<E> rightTree) {

root = new Node<>(data);

if (leftTree != null) {

root.left = leftTree.root;

} else {

root.left = null;

}

if (rightTree != null) {

root.right = rightTree.root;

} else {

root.right = null;

}

If leftTree is not null, the statement

root.left = leftTree.root;

executes. After its execution, the root node of the tree referenced by leftTree (leftTree.root) is referenced by root.left, making leftTree the left subtree of the new root node. If lT and rT are type BinaryTree<Character> and lT.root references the root node of binary tree x + y and rT.root references the root node of binary tree a / b, the statement

BinaryTree<Character> bT = new BinaryTree<> ('*', lT, rT);

would cause bT to reference the tree shown in Figure 6.12.

image

FIGURE 6.12 The Expression Tree (x + y) * (a / b)

The getLeftSubtree and getRightSubtree Methods

The getLeftSubtree method returns a binary tree whose root is the left subtree of the object on which the method is called. It uses the protected constructor discussed before to construct a new BinaryTree<E> object whose root references the left subtree of this tree. The getRightSubtree method is symmetric.

/** Return the left subtree.

@return The left subtree or null if either the root or

the left subtree is null

public BinaryTree<E> getLeftSubtree() {

if (root != null && root.left != null) {

return new BinaryTree<>(root.left);

} else {

return null;

}

The isLeaf Method

The isLeaf method tests to see whether this tree has any subtrees. If there are no subtrees, then true is returned.

/** Determine whether this tree is a leaf.

@return true if the root has no children

public boolean isLeaf() {

return (root.left == null && root.right == null);

}

The toString Method

The toString method generates a string representation of the BinaryTree for display purposes. The string representation is a preorder traversal in which each local root is indented a distance proportional to its depth. If a subtree is empty, the string "null" is displayed. The tree in Figure 6.12 would be displayed as follows:

null

The toString method creates a StringBuilder and then calls the recursive toString method (described next) passing the root and 1 (depth of root node) as arguments.

public String toString() {

StringBuilder sb = new StringBuilder();

toString(root, 1, sb);

return sb.toString();

}

The Recursive toString Method

This method follows the preorder traversal algorithm given in Section 6.2. It begins by appending a string of spaces proportional to the level so that all nodes at a particular level will be indented to the same point in the tree display. Then, if the node is null, the string "null\n" is appended to the StringBuilder. Otherwise the string representation of the node is appended to the StringBuilder and the method is recursively called on the left and right subtrees.

/** Converts a sub-tree to a string.

Performs a preorder traversal.

@param node The local root

@param depth The depth

@param sb The StringBuilder to save the output

private void toString(Node<E> node, int depth,

StringBuilder sb) {

for (int i = 1; i < depth; i++) {

sb.append(" ");

}

if (node == null) {

sb.append("null\n");

} else {

sb.append(node.toString());

sb.append("\n");

toString(node.left, depth + 1, sb);

toString(node.right, depth + 1, sb);

}

Reading a Binary Tree

If we use a Scanner to read the individual lines created by the toString method previously discussed, we can reconstruct the binary tree using the algorithm:

1. Read a line that represents information at the root.

2. Remove the leading and trailing spaces using the String.trim method.

3. if it is "null"

4. Return null.

else

5. Recursively read the left child.

6. Recursively read the right child.

7. Return a tree consisting of the root and the two children.

The tree that is constructed will be type BinaryTree<String>. The code for a method that implements this algorithm is shown in Listing 6.2.

LISTING 6.2

Method to Read a Binary Tree

/** Method to read a binary tree.

pre: The input consists of a preorder traversal

of the binary tree. The line "null" indicates a null tree.

@param scan the Scanner attached to the input file.

@return The binary tree

public static BinaryTree<String> readBinaryTree(Scanner scan) {

// Read a line and trim leading and trailing spaces.

String data = scan.nextLine().trim();

if (data.equals("null")) {

return null;

} else {

BinaryTree<String> leftTree = readBinaryTree(scan);

BinaryTree<String> rightTree = readBinaryTree(scan);

return new BinaryTree<>(data, leftTree, rightTree);

}

Using an ObjectOutputStream and ObjectInputStream

The Java API includes the class ObjectOutputStream that will write any object that is declared to be Serializable. You declare that an object is Serializable by adding the declaration

implements Serializable

to the class declaration. The Serializable interface (in java.io) contains no methods, but it serves to mark the class as being Serializable. This gives you control over whether or not you want your class to be written to an external file. Generally, you will want to have this capability.

To write an object of a Serializable class to a file, you do the following:

try {

ObjectOutputStream out =

new ObjectOutputStream(new FileOutputStream(nameOfFile));

out.writeObject(nameOfObject);

} catch (Exception ex) {

ex.printStackTrace();

System.exit(1);

}

The writeObject method performs a traversal of whatever data structure is referenced by the object being written. Thus, if the object is a binary tree, a deep copy of all of the nodes of the binary tree will be written to the file.

To read a Serializable class from a file, you do the following:

try {

ObjectInputStream in =

new ObjectInputStream(new FileInputStream(nameOfFile));

objectName = (objectClass) in.readObject();

} catch (Exception ex) {

ex.printStackTrace();

System.exit(1);

}

This code will reconstruct the object that was saved to the file, including any referenced objects. Thus, if a BinaryTree is written to an ObjectOutputStream, this method will read it back and restore it completely.

image PITFALL

Modifying the Class File of a Serialized Object

When an object is serialized, a unique class signature is recorded with the data. If you recompile the Java source file for the class to re-create the .class file, even though you did not make any changes, the resulting .class file will have a different class signature. When you attempt to read the object, you will get an exception.

EXERCISES FOR SECTION 6.3

SELF-CHECK

Draw the linked representation of the following two trees.

image

Show the tree that would be built by the following input string:

null

What can you say about this tree?

Write the strings that would be displayed for the two binary trees in Figure 6.6.

PROGRAMMING

Write a method for the BinaryTree class that returns the preorder traversal of a binary tree as a sequence of strings each separated by a space.

Write a method to display the postorder traversal of a binary tree in the same form as Programming Exercise 1.

Write a method to display the inorder traversal of a binary tree in the same form as Programming Exercise 1, except place a left parenthesis before each subtree and a right parenthesis after each subtree. Don't display anything for an empty subtree. For example, the expression tree shown in Figure 6.12 would be represented as (((x) + (y)) * ((a) / (b))).

6.4 Java 8 Lambda Expressions and Functional Interfaces

Java 8 introduces new features that enable functional programming. In functional programming, you can assign functions (methods) to variables or pass them as arguments to another function. The behavior of the function called with a function argument will vary, depending on its function argument. Assume you wrote a function called plot that cycled through angles from 0 to 360 ° in 5 ° increments and produced a graph showing a particular function value for each angle. If plot took a function argument, you could pass it a function such as sin or cosin. If you passed it sin, function plot would show a graph of sine values; if you passed it cosin, it would show a graph of cosine values.

Java can’t really pass methods as arguments, but it accomplishes the same thing using lambda expressions and functional interfaces. A lambda expression is a method without a declaration—sometimes called an anonymous method because it doesn’t have a name. A lambda expression is a shorthand notation that allows you to write a method where you are going to use it. This is useful when a method is going to be used only once and it saves you the effort of writing a separate method declaration. A lambda expression consists of a parameter list and either an expression or a statement block separated by the symbol −>.

EXAMPLE 6.2

The following lambda expression has no arguments as indicated by the empty parentheses. It simply displays the message shown in the println.

() -> System.out.println("That's all folks") // displays message

EXAMPLE 6.3

The next lambda expression has the value of m cubed. Because it has a single untyped argument m, the parentheses are omitted.

m -> m * m * m

EXAMPLE 6.4

The next lambda expression represents a method that returns the larger of its two arguments. Because the method body has multiple statements, it is enclosed in braces. The argument types could be omitted but the parentheses are required.

(int x, int y) -> {

if (x >= y)

return x;

return y;

}

We will see how to execute lambda expressions (anonymous methods) in the next section.

image SYNTAX Lambda Expression

FORM:

(parameter list) −> expression

(parameter list) −> {statement list }

EXAMPLE:

x -> x * x

(d, e) -> {

sb.append(d.toString());

sb.append(" : ");

sb.append(e.toString());

}

INTERPRETATION

The parameter list for the anonymous method has the same syntax as the parameter list for a method—a comma separated list of type identifier sequences enclosed in parentheses. If the compiler can infer the types, they can be omitted. If there is only one parameter, then the parentheses may also be omitted. An empty parameter list is denoted by (). The method body (following the ->) may be an expression or a statement block enclosed in curly braces.

Functional Interfaces

To cause a lambda expression to be executed, we must first assign it to a variable or pass it as an argument to another method. The data type of the variable being assigned to, or the corresponding parameter, must be a functional interface. A functional interface is an interface that declares exactly one abstract method. Java provides a set of functional interfaces, but you can also create your own.

EXAMPLE 6.5

The following is an example of a custom functional interface called MyFunction. It has a single abstract method apply that accepts two integer arguments and returns an integer result.

@FunctionalInterface

interface MyFunctionType {

public int apply(int x, int y);

}

Listing 6.3 shows a class that uses the lambda expression from Example 6.4 that returns the larger of its two arguments. The statement

MyFunctionType mFT = (x, y) -> {

if (x > y)

return x;

return y;

};

creates an object of an anonymous class type that implements interface MyFunctionType. The statement block to the right of -> implements method apply. Therefore, the statement

System.out.println("The larger number is : " + mFT.apply(m, n));

causes the statement block above to execute with m and n as its arguments. The larger of the two data values entered will be returned as the method result and displayed.

LISTING 6.3

Using Functional Interface MyFunctionType with a Lambda Expression

import java.util.Scanner;

@FunctionalInterface

interface MyFunctionType {

public int apply(int x, int y);

}

public class TestMyFunctionType {

public static void main(String[] args) {

Scanner sc = new Scanner(System.in);

System.out.println("Enter 2 integers: ");

int m = sc.nextInt();

int n = sc.nextInt();

MyFunctionType mFT = (x, y) -> {

if (x > y)

return x;

return y;

};

System.out.println("The larger number is : " + mFT.apply(m, n));

}

The Java developers created a set of functional interfaces in the java.util.function package. Table 6.2 lists some of these; there are many more. The type parameters T and U represent input types and R the result type. Each functional interface has a single abstract method; you need to check the Java documentation to see what it is.

TABLE 6.2 Selected Java Functional Interfaces

Interface Abstract Method Description

BiConsumer<T, U> void accept(T t, U u) Represents an operation that accepts two input arguments and returns no result

BiFunction<T, U, R> R apply(T t, U u) Represents an operation that accepts two arguments of different types and produces a result

BinaryOperator<T> T apply(T t1, T t2) Represents an operation upon two operands of the same type, producing a result of the operand type

Comparator<T> int compare(T t1, T t2) Represents an operation that accepts two arguments of the same type and returns a positive, zero, or negative result based on their ordering (negative if t1 < t2, zero if t1 equals t2, positive if t1 > t2)

Consumer<T> void accept(T t) Represents an operation that accepts one input argument and returns no result

Function<T, R> R apply(T t) Represents a function that accepts one argument and produces a result

Predicate<T> boolean test(T t) Represents a predicate (boolean-valued function) of one argument

Since our lambda expression in Listing 6.3 takes two arguments and returns a result of the same type, we can insert the statement

import java.util.function.BinaryOperator;

in our program and delete our custom interface. Object mFT should implement the BinaryOperator<T> interface (declared as type BinaryOperator<Integer> instead of MyFuntionType) and the method specified in the lambda expression will still implement abstract method apply. The revised declaration for function object mFT will start with BiConsumer<Integer, Integer> mFT = (x, y) -> {

Passing a Lambda Expression as an Argument

We started our discussion of lambda expressions by stating that we wanted to be able to write a function that could plot the values of another function passed to it. The plot would vary depending on the function argument. To accomplish this in Java, we need to pass a lambda expression as an argument to a method.

EXAMPLE 6.6

The following method displays the values of a function (its last argument) in the range represented by low to high in increments of step. The function represented by f takes an int argument and returns a double result.

/** Displays values associated with function f in the range specified.

* @param low the lower bound

* @param high the upper bound

* @param step the increment

* @param f the function object

public static void show(int low, int high, int step,

Function<Integer, Double> f) {

for (int i = low; i <= high; i += step) {

System.out.println(i + “ : “ + f.apply(i));

}

We can call function show using the statements

Function<Integer, Double> f;

f = angle -> Math.cos(Math.toRadians(angle));

show(0, 360, 30, f);

The first statement declares f to be an object of type Function<Integer, Double>. The second statement assigns f to a lambda expression with an integer argument angle. The method body implements abstract method Function.apply. Therefore, f.apply(angle) in show will calculate and return the value of the cosine after first converting an angle value to radians. The last statement calls show, passing it the range boundaries and increment and function f. The for loop body in method show will display a table of angle and cosine values for 0 ° through 360 °,

0 : 1.0

30 : 0.8660254037844387

60 : 0.5000000000000001

…

A more compact way of doing this would be to replace the three statements above that declare and initialize f and call show with

show(0, 360, 30, angle -> Math.cos(Math.toRadians(angle));

This statement passes the lambda expression directly to method show as an anonymous Function object.

A General Preorder Traversal Method

The recursive toString method in the previous section performs a preorder traversal to generate the string representation of a tree. There are other applications that require a preorder traversal. Thus, we want to separate the traversal from the action performed on each node. We will specify the action in a lambda interface.

At each node, we need to know the current node and its depth to perform an action, so we will use a functional interface that has parameter types Node<E> for the current node and Integer for the depth. Table 6.2 shows functional interface BiConsumer<T, U> that takes two arguments and has an abstract method accept whose return type is void. We will specify the action in a lambda expression that instantiates this interface. The preorder traversal starter method is passed a lambda expression with the action to be performed (referenced by consumer), and it calls the recursive preorder traversal passing it the root node, 1 for its level, and the object referenced by consumer.

/** Starter method for preorder traversal

* @param consumer an object that instantiates

* the BiConsumer interface. Its method implements

* abstract method apply.

public void preOrderTraverse(BiConsumer<E, Integer> consumer) {

preOrderTraverse(root, 1, consumer);

}

The private preOrderTraverse method performs the actual traversal.

/**

* Performs a recursive pre-order traversal of the tree,

* applying the action specified in the consumer object.

* @param consumer object whose accept method specifies

* the action to be performed on each node

private void preOrderTraverse(Node<E> node, int depth,

BiConsumer<E, Integer> consumer) {

if (node == null) {

consumer.accept(null, depth);

} else {

consumer.accept(node.data, depth);

preOrderTraverse(node.left, depth + 1, consumer);

preOrderTraverse(node.right, depth + 1, consumer);

}

Using preOrderTraverse

We can rewrite the toString method to use the preOrderTraverse method. The preOrderTraverse method visits each node in preorder applying the statement block specified in the lambda expression passed as an argument to the preorder traversal methods. The lambda expression passed by toString to PreOrderTraverse has arguments representing the node and its depth. When PreOrderTraverse applies its statement block to a node, it creates a new StringBuilder object consisting of d (the node depth) blanks followed by the string representation of the current node and "\n".

public String toString() {

final StringBuilder sb = new StringBuilder();

preOrderTraverse((e, d) -> {

for (int i = 1; i < d; i++) {

sb.append(" ");

}

sb.append(e);

sb.append("\n");

});

return sb.toString();

EXERCISES FOR SECTION 6.4

SELF-CHECK

Describe the effect of each lambda expression below that is valid. Correct the expression if it is not a valid lambda expression and then describe its effect.

(int m, n) -> 2 * m + 3 * n

(int m, double x) -> m * (int) x

msg -> return "***" + msg + "***"

-> System.out.println("Hello")

(x, y) -> {System.out.println(x);

System.out.println("_________________");

System.out.println(y);

System.out.println("_________________");

};

(x, y) -> Math.sqrt(x * x + y * y)

(x) -> x > 0

Select a Functional Interface appropriate for each of the expressions in Self-Check Exercise 1 from Table 6.2 or the java.util.function library.

Declare a function object of the correct type for each of the lambda expressions in Self-Check Exercise 1 and assign the lambda expression to it.

PROGRAMMING

Write a lambda expression that takes a double argument (x) and an integer argument (n). The method result should be the double value raised to the power n. Do this using a Java API method and also using a loop.

Write and test a Java class that enters two numbers and displays the result of calling each lambda expression in Programming Exercise 1. Also, compare the results to make sure that they are the same.

Modify the program in Example 6.6 to use two function objects that calculate trigonometric values and display the angle and corresponding values for each function object on the same line.

Write a general postOrderTraverse method for the BinaryTree class similar to the preOrderTraverse method.

Write a general inOrderTraverse method for the BinaryTree class similar to the preOrderTraverse method.

6.5 Binary Search Trees

Overview of a Binary Search Tree

In Section 6.1, we provided the following recursive definition of a binary search tree:

A set of nodes T is a binary search tree if either of the following is true:

T is empty.

Figure 6.13 shows a binary search tree that contains the words in lowercase from the nursery rhyme “The House That Jack Built.” We can use the following algorithm to find an object in a binary search tree.

Recursive Algorithm for Searching a Binary Search Tree

1. if the root is null

2. The item is not in the tree; return null.

3. Compare the value of target, the item being sought, with root.data.

4. if they are equal

5. The target has been found, return the data at the root.

else if target is less than root.data

6. Return the result of searching the left subtree.

else

7. Return the result of searching the right subtree.

image

FIGURE 6.13 Binary Search Tree Containing All of the Words from “The House That Jack Built”

EXAMPLE 6.7

Suppose we wish to find jill in Figure 6.13. We first compare jill with lay. Because jill is less than lay, we continue the search with the left subtree and compare jill with house. Because jill is greater than house, we continue with the right subtree and compare jill with jack. Because jill is greater than jack, we continue with killed followed by kept. Now, kept has no left child, and jill is less than kept, so we conclude that jill is not in this binary search tree. (She's in a different nursery rhyme.) Follow the path shown in gray in Figure 6.14.

image

FIGURE 6.14 Looking for Jill

Performance

Searching the tree in Figure 6.14 is O(log n). However, if a tree is not very full, performance will be worse. The tree in the figure at left has only right subtrees, so searching it is O(n).

image

Interface SearchTree

As described, the binary search tree is a data structure that enables efficient insertion, search, and retrieval of information (best case is O(log n)). Table 6.3 shows a SearchTree<E> interface for a class that implements the binary search tree. The interface includes methods for insertion (add), search (boolean contains and E find), and removal (E delete and boolean remove). Next, we discuss a class BinarySearchTree<E> that implements this interface.

TABLE 6.3 The SearchTree<E> Interface

Method Behavior

boolean add(E item) Inserts item where it belongs in the tree. Returns true if item is inserted; false if it isn't (already in tree)

boolean contains(E target) Returns true if target is found in the tree

E find(E target) Returns a reference to the data in the node that is equal to target. If no such node is found, returns null

E delete(E target) Removes target (if found) from tree and returns it; otherwise, returns null

boolean remove(E target) Removes target (if found) from tree and returns true; otherwise, returns false

The BinarySearchTree Class

Next, we implement class BinarySearchTree<E extends Comparable<E>>. The type parameter specified when we create a new BinarySearchTree must implement the Comparable interface.

Table 6.4 shows the data fields declared in the class. These data fields are used to store a second result from the recursive add and delete methods that we will write for this class. Neither result can be returned directly from the recursive add or delete method because they return a reference to a tree node affected by the insertion or deletion operation. The interface for method add in Table 6.3 requires a boolean result (stored in addReturn) to indicate success or failure. Similarly, the interface for delete requires a type E result (stored in deleteReturn) that is either the item deleted or null.

TABLE 6.4 Data Fields of Class BinarySearchTree<E extends Comparable<E>>

Data Field Attribute

protected boolean addReturn Stores a second return value from the recursive add method that indicates whether the item has been inserted

protected E deleteReturn Stores a second return value from the recursive delete method that references the item that was stored in the tree

The class heading and data field declarations follow. Note that class BinarySearchTree extends class BinaryTree and implements the SearchTree interface (see Figure 6.15). Besides the data fields shown, class BinarySearchTree inherits the data field root from class BinaryTree (declared as protected) and the inner class Node<E>.

public class BinarySearchTree<E extends Comparable<E>>

extends BinaryTree<E> implements SearchTree<E> {

// Data Fields

/** Return value from the public add method. */

protected boolean addReturn;

/** Return value from the public delete method. */

protected E deleteReturn;

. . .

}

image

FIGURE 6.15 UML Diagram of BinarySearchTree

Implementing the find Methods

Earlier, we showed a recursive algorithm for searching a binary search tree. Next, we show how to implement this algorithm and a nonrecursive starter method for the algorithm. Our method find will return a reference to the node that contains the information we are seeking.

Listing 6.4 shows the code for method find. The starter method calls the recursive method with the tree root and the object being sought as its parameters. If bST is a reference to a BinarySearchTree, the method call bST.find(target) invokes the starter method.

The recursive method first tests the local root for null. If it is null, the object is not in the tree, so null is returned. If the local root is not null, the statement

int compResult = target.compareTo(localRoot.data);

compares target to the data at the local root. Recall that method compareTo returns an int value that is negative, zero, or positive depending on whether the object (target) is less than, equal to, or greater than the argument (localRoot.data).

If the objects are equal, we return the data at the local root. If target is smaller, we recursively call the method find, passing the left subtree root as the parameter.

return find(localRoot.left, target);

Otherwise, we call find to search the right subtree.

return find(localRoot.right, target);

LISTING 6.4

BinarySearchTree find Method

/** Starter method find.

pre: The target object must implement

the Comparable interface.

@param target The Comparable object being sought

@return The object, if found, otherwise null

public E find(E target) {

return find(root, target);

}

/** Recursive find method.

@param localRoot The local subtree's root

@param target The object being sought

@return The object, if found, otherwise null

private E find(Node<E> localRoot, E target) {

if (localRoot == null)

return null;

// Compare the target with the data field at the root.

int compResult = target.compareTo(localRoot.data);

if (compResult == 0)

return localRoot.data;

else if (compResult < 0)

return find(localRoot.left, target);

else

return find(localRoot.right, target);

Insertion into a Binary Search Tree

Inserting an item into a binary search tree follows a similar algorithm as searching for the item because we are trying to find where in the tree the item would be, if it were there. In searching, a result of null is an indicator of failure; in inserting, we replace this null with a new leaf that contains the new item. If we reach a node that contains the object we are trying to insert, then we can't insert it (duplicates are not allowed), so we return false to indicate that we were unable to perform the insertion. The insertion algorithm follows.

Recursive Algorithm for Insertion in a Binary Search Tree

1. if the root is null

2. Replace empty tree with a new tree with the item at the root and return true.

3. else if the item is equal to root.data

4. The item is already in the tree; return false.

5. else if the item is less than root.data

6. Recursively insert the item in the left subtree.

7. else

8. Recursively insert the item in the right subtree.

The algorithm returns true when the new object is inserted and false if it is a duplicate (the second stopping case). The first stopping case tests for an empty tree. If so, a new BinarySearchTree is created and the new item is stored in its root node (Step 2).

EXAMPLE 6.8

To insert jill into Figure 6.13, we would follow the steps shown in Example 6.7 except that when we reached kept, we would insert jill as the left child of the node that contains kept (see Figure 6.16).

image

FIGURE 6.16 Inserting Jill

Implementing the add Methods

Listing 6.5 shows the code for the starter and recursive add methods. The recursive add follows the algorithm presented earlier, except that the return value is the new (sub)tree that contains the inserted item. The data field addReturn is set to true if the item is inserted and to false if the item already exists. The starter method calls the recursive method with the root as its argument. The root is set to the value returned by the recursive method (the modified tree). The value of addReturn is then returned to the caller.

In the recursive method, the statements

addReturn = true;

return new Node<>(item);

execute when a null branch is reached. The first statement sets the insertion result to true; the second returns a new node containing item as its data.

The statements

addReturn = false;

return localRoot;

execute when item is reached. The first statement sets the insertion result to false; the second returns a reference to the subtree that contains item in its root.

If item is less than the root's data, the statement

localRoot.left = add(localRoot.left, item);

attempts to insert item in the left subtree of the local root. After returning from the call, this left subtree is set to reference the modified subtree, or the original subtree if there is no insertion. The statement

localRoot.right = add(localRoot.right, item);

affects the right subtree of localRoot in a similar way.

LISTING 6.5

BinarySearchTree add Methods

/** Starter method add.

pre: The object to insert must implement the

Comparable interface.

@param item The object being inserted

@return true if the object is inserted, false

if the object already exists in the tree

public boolean add(E item) {

root = add(root, item);

return addReturn;

}

/** Recursive add method.

post: The data field addReturn is set true if the item is added to the tree, false if the item is already in the tree.

@param localRoot The local root of the subtree

@param item The object to be inserted

@return The new local root that now contains the

inserted item

private Node<E> add(Node<E> localRoot, E item) {

if (localRoot == null) {

// item is not in the tree — insert it.

addReturn = true;

return new Node<>(item);

} else if (item.compareTo(localRoot.data) == 0) {

// item is equal to localRoot.data

addReturn = false;

return localRoot;

} else if (item.compareTo(localRoot.data) < 0) {

// item is less than localRoot.data

localRoot.left = add(localRoot.left, item);

return localRoot;

} else {

// item is greater than localRoot.data

localRoot.right = add(localRoot.right, item);

return localRoot;

}

image PROGRAM STYLE

Comment on Insertion Algorithm and add Methods

Note that as we return along the search path, the statement

localRoot.left = add(localRoot.left, item);

localRoot.right = add(localRoot.right, item);

resets each local root to reference the modified tree below it. You may wonder whether this is necessary. The answer is “No.” In fact, it is only necessary to reset the reference from the parent of the new node to the new node; all references above the parent remain the same. We can modify the insertion algorithm to do this by checking for a leaf node before making the recursive call to add:

5.1. else if the item is less than root.data

5.2. if the local root is a leaf node.

5.3. Reset the left subtree to reference a new node with the item as its data.

else

5.4. Recursively insert the item in the left subtree.

A similar change should be made for the case where item is greater than the local root's data. You would also have to modify the starter add method to check for an empty tree and insert the new item in the root node if the tree is empty instead of calling the recursive add method.

One reason we did not write the algorithm this way is that we want to be able to adjust the tree if the insertion makes it unbalanced. This involves resetting one or more branches above the insertion point. We discuss how this is done in Chapter 9.

image PROGRAM STYLE

Multiple Calls to compareTo

Method add has two calls to method compareTo. We wrote it this way so that the code mirrors the algorithm. However, it would be more efficient to call compareTo once and save the result in a local variable as we did for method find. Depending on the number and type of data fields being compared, the extra call to method compareTo could be costly.

Removal from a Binary Search Tree

Removal also follows the search algorithm except that when the item is found, it is removed. If the item is a leaf node, then its parent's reference to it is set to null, thereby removing the leaf node. If the item has only a left or right child, then the grandparent references the remaining child instead of the child's parent (the node we want to remove).

EXAMPLE 6.9

If we remove is from Figure 6.13, we can replace it with in. This is accomplished by changing the left child reference in jack (the grandparent) to reference in (see Figure 6.17).

image

FIGURE 6.17 Removing is

A complication arises when the item we wish to remove has two children. In this case, we need to find a replacement parent for the children. Remember that the parent must be larger than all of the data fields in the left subtree and smaller than all of the data fields in the right subtree. If we take the largest item in the left subtree and promote it to be the parent, then all of the remaining items in the left subtree will be smaller. This item is also less than the items in the right subtree. This item is also known as the inorder predecessor of the item being removed. (We could use the inorder successor instead; this is discussed in the exercises.)

EXAMPLE 6.10

If we remove house from Figure 6.13, we look in the left subtree (root contains cow) for the largest item, horn. We then replace house with horn and remove the node containing horn (see Figure 6.18).

image

FIGURE 6.18 Removing house

EXAMPLE 6.11

If we want to remove rat from the tree in Figure 6.13, we would start the search for the inorder successor at milked and see that it has a right child, priest. If we now look at priest, we see that it does not have a right child, but it does have a left child. We would then replace rat with priest and replace the reference to priest in milked with a reference to morn (the left subtree of the node containing priest). See Figure 6.19.

image

FIGURE 6.19 Removing rat

Recursive Algorithm for Removal from a Binary Search Tree

1. if the root is null

2. The item is not in tree – return null.

3. Compare the item to the data at the local root.

4. if the item is less than the data at the local root

5. Return the result of deleting from the left subtree.

6. else if the item is greater than the local root

7. Return the result of deleting from the right subtree.

8. else // The item is in the local root

9. Store the data in the local root in deleteReturn.

10. if the local root has no children

11. Set the parent of the local root to reference null.

12. else if the local root has one child

13. Set the parent of the local root to reference that child.

14. else // Find the inorder predecessor

15. if the left child has no right child it is the inorder predecessor

16. Set the parent of the local root to reference the left child.

17. else

18. Find the rightmost node in the right subtree of the left child.

19. Copy its data into the local root's data and remove it by setting its parent to reference its left child.

Implementing the delete Methods

Listing 6.6 shows both the starter and the recursive delete methods. As with the add method, the recursive delete method returns a reference to a modified tree that, in this case, no longer contains the item. The public starter method is expected to return the item removed. Thus, the recursive method saves this value in the data field deleteReturn before removing it from the tree. The starter method then returns this value.

LISTING 6.6

BinarySearchTree delete Methods

/** Starter method delete.

post: The object is not in the tree.

@param target The object to be deleted

@return The object deleted from the tree

or null if the object was not in the tree

@throws ClassCastException if target does not implement

Comparable

public E delete(E target) {

root = delete(root, target);

return deleteReturn;

}

/** Recursive delete method.

post: The item is not in the tree;

deleteReturn is equal to the deleted item

as it was stored in the tree or null

if the item was not found.

@param localRoot The root of the current subtree

@param item The item to be deleted

@return The modified local root that does not contain

the item

private Node<E> delete(Node<E> localRoot, E item) {

if (localRoot == null) {

// item is not in the tree.

deleteReturn = null;

return localRoot;

}

// Search for item to delete.

int compResult = item.compareTo(localRoot.data);

if (compResult < 0) {

// item is smaller than localRoot.data.

localRoot.left = delete(localRoot.left, item);

return localRoot;

} else if (compResult > 0) {

// item is larger than localRoot.data.

localRoot.right = delete(localRoot.right, item);

return localRoot;

} else {

// item is at local root.

deleteReturn = localRoot.data;

if (localRoot.left == null) {

// If there is no left child, return right child

// which can also be null.

return localRoot.right;

} else if (localRoot.right == null) {

// If there is no right child, return left child.

return localRoot.left;

} else {

// Node being deleted has 2 children, replace the data

// with inorder predecessor.

if (localRoot.left.right == null) {

// The left child has no right child.

// Replace the data with the data in the

// left child.

localRoot.data = localRoot.left.data;

// Replace the left child with its left child.

localRoot.left = localRoot.left.left;

return localRoot;

} else {

// Search for the inorder predecessor (ip) and

// replace deleted node's data with ip.

localRoot.data = findLargestChild(localRoot.left);

return localRoot;

}

For the recursive method, the two stopping cases are an empty tree and a tree whose root contains the item being removed. We first test to see whether the tree is empty (local root is null). If so, then the item sought is not in the tree. The deleteReturn data field is set to null, and the local root is returned to the caller.

Next, localRoot.data is compared to the item to be deleted. If the item to be deleted is less than localRoot.data, it must be in the left subtree if it is in the tree at all, so we set localRoot.left to the value returned by recursively calling this method.

localRoot.left = delete(localRoot.left, item);

If the item to be deleted is greater than localRoot.data, the statement

localRoot.right = delete(localRoot.right, item);

affects the right subtree of localRoot in a similar way.

If localRoot.data is the item to be deleted, we have reached the second stopping case, which begins with the lines

} else {

// item is at local root.

deleteReturn = localRoot.data;

. . .

The value of localRoot.data is saved in deleteReturn. If the node to be deleted has one child (or zero children), we return a reference to the only child (or null), so the parent of the deleted node will reference its only grandchild (or null).

If the node to be deleted (jack in the figure at left) has two children, we need to find the replacement for this node. If its left child has no right subtree, the left child (is) is the inorder predecessor. The first statement below

localRoot.data = localRoot.left.data;

// Replace the left child with its left child.

localRoot.left = localRoot.left.left;

image

copies the left child's data into the local node's data (is to jack); the second resets the local node's left branch to reference its left child's left subtree (in).

If the left child of the node to be deleted has a right subtree, the statement

localRoot.data = findLargestChild(localRoot.left);

calls findLargestChild to find the largest child and to remove it. The largest child's data is referenced by localRoot.data. This is illustrated in Figure 6.19. The left child milked of the node to be deleted (rat) has a right child priest, which is its largest child. Therefore, priest becomes referenced by localRoot.data (replacing rat) and morn (the left child of priest) becomes the new right child of milked.

Method findLargestChild

Method findLargestChild (see Listing 6.7) takes the parent of a node as its argument. It then follows the chain of rightmost children until it finds a node whose right child does not itself have a right child. This is done via tail recursion.

When a parent node is found whose right child has no right child, the right child is the inorder predecessor of the node being deleted, so the data value from the right child is saved.

E returnValue = parent.right.data;

parent.right = parent.right.left;

The right child is then removed from the tree by replacing it with its left child (if any).

LISTING 6.7

BinarySearchTreefindLargestChild Method

/** Find the node that is the

inorder predecessor and replace it

with its left child (if any).

post: The inorder predecessor is removed from the tree.

@param parent The parent of possible inorder

predecessor (ip)

@return The data in the ip

private E findLargestChild(Node<E> parent) {

// If the right child has no right child, it is

// the inorder predecessor.

if (parent.right.right == null) {

E returnValue = parent.right.data;

parent.right = parent.right.left;

return returnValue;

} else {

return findLargestChild(parent.right);

}

Testing a Binary Search Tree

To test a binary search tree, you need to verify that an inorder traversal will display the tree contents in ascending order after a series of insertions (to build the tree) and deletions are performed. You need to write a toString method for a BinarySearchTree that returns the String built from an inorder traversal (see Programming Exercise 3).

CASE STUDY Writing an Index for a Term Paper

Problem

You would like to write an index for a term paper. The index should show each word in the paper followed by the line number on which it occurred. The words should be displayed in alphabetical order. If a word occurs on multiple lines, the line numbers should be listed in ascending order. For example, the three lines

a, 3

a, 13

are, 3

show that the word a occurred on lines 3 and 13 and the word are occurred on line 3.

Analysis

A binary search tree is an ideal data structure to use for storing the index entries. We can store each word and its line number as a string in a tree node. For example, the two occurrences of the word Java on lines 5 and 10 could be stored as the strings "java, 005" and "java, 010". Each word will be stored in lowercase to ensure that it appears in its proper position in the index. The leading zeros are necessary so that the string "java, 005" is considered less than the string "java, 010". If the leading zeros were removed, this would not be the case ("java, 5" is greater than "java, 10"). After all the strings are stored in the search tree, we can display them in ascending order by performing an inorder traversal. Storing each word in a search tree is an O(log n) process where n is the number of words currently in the tree. Storing each word in an ordered list would be an O(n) process.

Design

We can represent the index as an instance of the BinarySearchTree class just discussed or as an instance of a binary search tree provided in the Java API. The Java API provides a class TreeSet<E> (discussed further in Section 7.1) that uses a binary search tree as its basis. Class TreeSet<E> provides three of the methods in interface SearchTree: insertion (add), search (boolean contains), and removal (boolean remove). It also provides an iterator that enables inorder access to the elements of a tree. Because we are only doing tree insertion and inorder access, we will use class TreeSet<E>.

We will write a class IndexGenerator (see Table 6.5) with a TreeSet<String> data field. Method buildIndex will read each word from a data file and store it in the search tree. Method showIndex will display the index.

TABLE 6.5 Data Fields and Methods of Class IndexGenerator

Data Field Attribute

private TreeSet<String> index The search tree used to store the index

private static final String PATTERN Pattern for extracting words from a line. A word is a string of one or more letters or numbers or characters

Method Behavior

public void buildIndex(Scanner scan) Reads each word from the file scanned by scan and stores it in tree index

public void showIndex() Performs an inorder traversal of tree index

Implementation

Listing 6.8 shows class IndexGenerator. In method buildIndex, the repetition condition for the outer while loop calls method hasNextLine, which scans the next data line into a buffer associated with Scanner scan or returns null (causing loop exit) if all lines were scanned. If the next line is scanned, the repetition condition for the inner while loop below

while ((token = scan.findInLine(PATTERN)) != null) {

token = token.toLowerCase();

index.add(String.format("%s, %3d", token, lineNum));

}

calls Scanner method findInLine to extract a token from the buffer (a sequence of letters, digits, and the apostrophe character). Next, it inserts in index a string consisting of the next token in lowercase followed by a comma, a space, and the current line number formatted with leading spaces so that it occupies a total of three columns. This format is prescribed by the first argument "%s, %3d" passed to method String.format (see Appendix A.5). The inner loop repeats until findInLine returns null, at which point the inner loop is exited, the buffer is emptied by the statement

scan.nextLine(); // Clear the scan buffer

and the outer loop is repeated.

LISTING 6.8

Class IndexGenerator.java

import java.io.*;

import java.util.*;

/** Class to build an index. */

public class IndexGenerator {

// Data Fields

/** Tree for storing the index. */

private final TreeSet<String> index;

/** Pattern for extracting words from a line. A word is a string of

one or more letters or numbers or ' characters */

private static final String PATTERN =

"[\\p{L}\\p{N}']+";

// Methods

public IndexGenerator() {

index = new TreeSet<>();

}

/** Reads each word in a data file and stores it in an index

along with its line number.

post: Lowercase form of each word with its line

number is stored in the index.

@param scan A Scanner object

public void buildIndex(Scanner scan) {

int lineNum = 0; // line number

// Keep reading lines until done.

while (scan.hasNextLine()) {

lineNum++;

// Extract each token and store it in index.

String token;

while ((token = scan.findInLine(PATTERN)) != null) {

token = token.toLowerCase();

index.add(String.format("%s, %3d", token, lineNum));

}

scan.nextLine(); // Clear the scan buffer

}

/** Displays the index, one word per line. */

public void showIndex() {

index.forEach(next -> System.out.println(next));

}

Method showIndex at the end of Listing 6.8 uses the the default method forEach to display each line of the index. We describe the forEach in the next syntax box. Without the forEach, we could use the enhanced for loop below with an iterator.

public void showIndex() {

// Use an iterator to access and display tree data.

for (String next : index) {

System.out.println(next);

}

image SYNTAX Using The Java 8 forEach statement

FORM

iterable.forEach(lambda expression);

EXAMPLE

index.forEach(next -> System.out.println(next));

INTERPRETATION

Java 8 added the default method forEach to the Iterable interface. A default method enables you to add new functionality to an interface while still retaining compatibility with earlier implementations that did not provide this method. The forEach method applies a method (represented by lambda expression) to each item of an Iterable object. Since the Set interface extends the Iterable interface and TreeSet implements Set, we can use the forEach method on the index as shown in the example above.

Testing

To test class IndexGenerator, write a main method that declares new Scanner and IndexGenerator<String> objects. The Scanner can reference any text file stored on your hard drive. Make sure that duplicate words are handled properly (including duplicates on the same line), that words at the end of each line are stored in the index, that empty lines are processed correctly, and that the last line of the document is also part of the index.

EXERCISES FOR SECTION 6.5

SELF-CHECK

Show the tree that would be formed for the following data items. Exchange the first and last items in each list, and rebuild the tree that would be formed if the items were inserted in the new order.

happy, depressed, manic, sad, ecstatic

45, 30, 15, 50, 60, 20, 25, 90

Explain how the tree shown in Figure 6.13 would be changed if you inserted mother. If you inserted jane? Does either of these insertions change the height of the tree?

Show or explain the effect of removing the nodes kept, cow from the tree in Figure 6.13.

In Exercise 3 above, a replacement value must be chosen for the node cow because it has two children. What is the relationship between the replacement word and the word cow? What other word in the tree could also be used as a replacement for cow? What is the relationship between that word and the word cow?

The algorithm for deleting a node does not explicitly test for the situation where the node being deleted has no children. Explain why this is not necessary.

In Step 19 of the algorithm for deleting a node, when we replace the reference to a node that we are removing with a reference to its left child, why is it not a concern that we might lose the right subtree of the node that we are removing?

PROGRAMMING

Write methods contains and remove for the BinarySearchTree class. Use methods find and delete to do the work.

Self-Check Exercise 4 indicates that two items can be used to replace a data item in a binary search tree. Rewrite method delete so that it retrieves the leftmost element in the right subtree instead. You will also need to provide a method findSmallestChild.

Write a main method to test a binary search tree. Write a toString method that returns the tree contents in ascending order (using an inorder traversal) with newline characters separating the tree elements.

Write a main method for the index generator that declares new Scanner and IndexGenerator objects. The Scanner can reference any text file stored on your hard drive.

6.6 Heaps and Priority Queues

In this section, we discuss a binary tree that is ordered but in a different way from a binary search tree. At each level of a heap, the value in a node is less than all values in its two subtrees. Figure 6.20 shows an example of a heap. Observe that 6 is the smallest value. Observe that each parent is smaller than its children and that each parent has two children, with the exception of node 39 at level 3 and the leaves. Furthermore, with the exception of 66, all leaves are at the lowest level. Also, 39 is the next-to-last node at level 3, and 66 is the last (rightmost) node at level 3.

image

FIGURE 6.20 Example of a Heap

More formally, a heap is a complete binary tree with the following properties:

The value in the root is the smallest item in the tree.

Every subtree is a heap.

Inserting an Item into a Heap

We use the following algorithm for inserting an item into a heap. Our approach is to place each item initially in the bottom row of the heap and then move it up until it reaches the position where it belongs.

Algorithm for Inserting in a Heap

1. Insert the new item in the next position at the bottom of the heap.

2. while new item is not at the root and new item is smaller than its parent

3. Swap the new item with its parent, moving the new item up the heap.

New items are added to the last row (level) of a heap. If a new item is larger than or equal to its parent, nothing more need be done. If we insert 89 in the heap in Figure 6.20, 89 would become the right child of 39 and we are done. However, if the new item is smaller than its parent, the new item and its parent are swapped. This is repeated up the tree until the new item is in a position where it is no longer smaller than its parent. For example, let's add 8 to the heap shown in Figure 6.21. Since 8 is smaller than 66, these values are swapped as shown in Figure 6.22. Also, 8 is smaller than 29, so these values are swapped resulting in the updated heap shown in Figure 6.23. But 8 is greater than 6, so we are done.

image

FIGURE 6.21 Inserting 8 into a Heap

image

FIGURE 6.22 Swapping 8 and 66

image

FIGURE 6.23 Swapping 8 and 29

Removing an Item from a Heap

Removal from a heap is always from the top. The top item is first replaced with the last item in the heap (at the lower right-hand position) so that the heap remains a complete tree. If we used any other value, there would be a “hole” in the tree where that value used to be. Then the new item at the top is moved down the heap until it is in its proper position.

Algorithm for Removal from a Heap

1. Remove the item in the root node by replacing it with the last item in the heap (LIH).

2. while item LIH has children, and item LIH is larger than either of its children

3. Swap item LIH with its smaller child, moving LIH down the heap.

As an example, if we remove 6 from the heap shown in Figure 6.23, 66 replaces it as shown in Figure 6.24. Since 66 is larger than both of its children, it is swapped with the smaller of the two, 8, as shown in Figure 6.25. The result is still not a heap because 66 is larger than both its children. Swapping 66 with its smaller child, 29, restores the heap as shown in Figure 6.26.

image

FIGURE 6.24 After Removal of 6

image

FIGURE 6.25 Swapping 66 and 8

image

FIGURE 6.26 Swapping 66 and 29

Implementing a Heap

Because a heap is a complete binary tree, we can implement it efficiently using an array (or ArrayList) instead of a linked data structure. We can use the first element (subscript 0) for storing a reference to the root data. We can use the next two elements (subscripts 1 and 2) for storing the two children of the root. We can use elements with subscripts 3, 4, 5, and 6 for storing the four children of these two nodes, and so on. Therefore, we can view a heap as a sequence of rows; each row is twice as long as the previous row. The first row (the root) has one item, the second row two, the third four, and so on. All of the rows are full except for the last one (see Figure 6.27).

image

FIGURE 6.27 Internal Representation of the Heap

Observe that the root, 6, is at position 0. The root's two children, 18 and 29, are at positions 1 and 2. For a node at position p, the left child is at 2p + 1 and the right child is at 2p + 2. A node at position c can find its parent at (c – 1) / 2. Thus, as shown in Figure 6.27, children of 28 (at position 4) are at positions 9 and 10.

Insertion into a Heap Implemented as an ArrayList

We will use an ArrayList for storing our heap because it is easier to expand and contract than an array. Figure 6.28 shows the heap after inserting 8 into position 13. This corresponds to inserting the new value into the lower right position as shown in the figure, right. Now we need to move 8 up the heap, by comparing it to the values stored in its ancestor nodes. The parent (66) is in position 6 (13 minus 1 is 12, divided by 2 is 6). Since 66 is larger than 8, we need to swap as shown in Figure 6.29.

image

FIGURE 6.28 Internal Representation of Heap after Insertion

image

FIGURE 6.29 Internal Representation of Heap after First Swap

Now the child is at position 6 and the parent is at position 2 (6 minus 1 is 5, divided by 2 is 2). Since the parent, 29, is larger than the child, 8, we must swap again as shown in Figure 6.30.

image

FIGURE 6.30 Internal Representation of Heap after Second Swap

The child is now at position 2 and the parent is at position 0. Since the parent is smaller than the child, the heap property is restored. In the heap insertion and removal algorithms that follow, we will use table to reference the ArrayList that stores the heap. We will use table[index] to represent the element at position index of table. In the actual code, a subscript cannot be used with an ArrayList.

Insertion of an Element into a Heap Implemented as an ArrayList

1. Insert the new element at the end of the ArrayList and set child to table.size() – 1.

2. Set parent to (child – 1) / 2.

3. while (parent >= 0 and table[parent] > table[child])

4. Swap table[parent] and table[child].

5. Set child equal to parent.

6. Set parent equal to (child – 1) / 2.

Removal from a Heap Implemented as an ArrayList

In removing elements from a heap, we must always remove and save the element at the top of the heap, which is the smallest element. We start with an ArrayList that has been organized to form a heap. To remove the first item (6), we begin by replacing the first item with the last item and then removing the last item. This is illustrated in Figure 6.31. The new value of the root (position 0) is larger than both of its children (18 in position 1 and 8 in position 2). The smaller of the two children (8 in position 2) is swapped with the parent as shown in Figure 6.32. Next, 66 is swapped with the smaller of its two new children (29), and the heap is restored (Figure 6.33).

image

FIGURE 6.31 Internal Representation of Heap after 6 Is Removed

image

FIGURE 6.32 Internal Representation of Heap after 8 and 66 Are Swapped

image

FIGURE 6.33 Internal Representation of Heap after Swap of 66 and 29

The algorithm for removal from a heap implemented as an ArrayList follows.

Removing an Element from a Heap Implemented as an ArrayList

1. Remove the last element (i.e., the one at size() – 1) and set the item at 0 to this value.

2. Set parent to 0.

3. while (true)

4. Set leftChild to (2 * parent) + 1 and rightChild to leftChild + 1.

5. if leftChild >= table.size()

6. Break out of loop.

7. Assume minChild (the smaller child) is leftChild.

8. if rightChild < table.size() and table[rightChild] < table[leftChild]

9. Set minChild to rightChild.

10. if table[parent] > table[minChild]

11. Swap table[parent] and table[minChild].

12. Set parent to minChild.

else

13. Break out of loop.

The loop (Step 3) is terminated under one of two circumstances: either the item has moved down the tree so that it has no children (line 5 is true), or it is smaller than both its children (line 10 is false). In these cases, the loop terminates (line 6 or 13). This is shown in Figure 6.33. At this point the heap property is restored, and the next smallest item can be removed from the heap.

Performance of the Heap

Method remove traces a path from the root to a leaf, and method insert traces a path from a leaf to the root. This requires at most h steps, where h is the height of the tree. The largest heap of height h is a full tree of height h. This tree has 2h – 1 nodes. The smallest heap of height h is a complete tree of height h, consisting of a full tree of height h – 1, with a single node as the left child of the leftmost child at height h – 1. Thus, this tree has 2(h – 1) nodes. Therefore, both insert and remove are O(log n) where n is the number of items in the heap.

Priority Queues

In computer science, a heap is used as the basis of a very efficient algorithm for sorting arrays, called heapsort, which you will study in Chapter 8. The heap is also used to implement a special kind of queue called a priority queue. However, the heap is not very useful as an abstract data type (ADT) on its own. Consequently, we will not create a Heap interface or code a class that implements it. Instead we will incorporate its algorithms when we implement a priority queue class and heapsort.

Sometimes a FIFO (first-in-first-out) queue may not be the best way to implement a waiting line. In a print queue, you might want to print a short document before some longer documents that were ahead of the short document in the queue. For example, if you were waiting by the printer for a single page to print, it would be very frustrating to have to wait until several documents of 50 pages or more were printed just because they entered the queue before yours did. Therefore, a better way to implement a print queue would be to use a priority queue. A priority queue is a data structure in which only the highest priority item is accessible. During insertion, the position of an item in the queue is based on its priority relative to the priorities of other items in the queue. If a new item has higher priority than all items currently in the queue, it will be placed at the front of the queue and, therefore, will be removed before any of the other items inserted in the queue at an earlier time. This violates the FIFO property of an ordinary queue.

EXAMPLE 6.12

Figure 6.34 sketches a print queue that at first (top of diagram) contains two documents. We will assume that each document's priority is inversely proportional to its page count (priorityis1page count)

. The middle queue shows the effect of inserting a document three pages long. The bottom queue shows the effect of inserting a second one-page document. It follows the earlier document with that page length.

image

FIGURE 6.34 Insertion into a Priority Queue

The PriorityQueue Class

Java provides a PriorityQueue<E> class that implements the Queue<E> interface given in Chapter 4. The differences are in the specification for the peek, poll, and remove methods. These are defined to return the smallest item in the queue rather than the oldest item in the queue. Table 6.6 summarizes the methods of the PriorityQueue<E> class.

TABLE 6.6 Methods of the PriorityQueue<E> Class

Method Behavior

boolean offer(E item) Inserts an item into the queue. Returns true if successful; returns false if the item could not be inserted

E remove() Removes the smallest entry and returns it if the queue is not empty. If the queue is empty, throws a NoSuchElementException

E poll() Removes the smallest entry and returns it. If the queue is empty, returns null

E peek() Returns the smallest entry without removing it. If the queue is empty, returns null

E element() Returns the smallest entry without removing it. If the queue is empty, throws a NoSuchElementException

Using a Heap as the Basis of a Priority Queue

The smallest item is always removed first from a priority queue (the smallest item has the highest priority) just as it is for a heap. Because insertion into and removal from a heap is O(log n), a heap can be the basis for an efficient implementation of a priority queue. We will call our class KWPriorityQueue to differentiate it from class PriorityQueue in the java.util API, which also uses a heap as the basis of its implementation.

A key difference is that class java.util.PriorityQueue class uses an array of type Object[] for heap storage. We will use an ArrayList for storage in KWPriorityQueue because the size of an ArrayList automatically adjusts as elements are inserted and removed. To insert an item into the priority queue, we first insert the item at the end of the ArrayList. Then, following the algorithm described earlier, we move this item up the heap until it is smaller than its parent.

To remove an item from the priority queue, we take the first item from the ArrayList; this is the smallest item. We then remove the last item from the ArrayList and put it into the first position of the ArrayList, overwriting the value currently there. Then, following the algorithm described earlier, we move this item down until it is smaller than its children or it has no children.

Design of KWPriorityQueue Class

The design of the KWPriorityQueue<E> class is shown in Table 6.7. The data field theData is used to store the heap. We discuss the purpose of data field comparator shortly. We have added methods compare and swap to those shown earlier in Table 6.6. Method compare compares its two arguments and returns a type int value indicating their relative ordering. The class heading and data field declarations follow.

TABLE 6.7 Design of KWPriorityQueue<E> Class

Data Field Attribute

ArrayList<E> theData An ArrayList to hold the data

Comparator<E> comparator An optional object that implements the Comparator<E> interface by providing a compare method

Method Behavior

KWPriorityQueue() Constructs a heap-based priority queue that uses the elements' natural ordering

KWPriorityQueue

(Comparator<E> comp) Constructs a heap-based priority queue that uses the compare method of Comparator comp to determine the ordering of the elements

private int compare(E left,E right) Compares two objects and returns a negative number if object left is less than object right, zero if they are equal, and a positive number if object left is greater than object right

private void swap(int i, int j) Exchanges the object references in theData at indexes i and j

import java.util.*;

/** The KWPriorityQueue implements the Queue interface

by building a heap in an ArrayList. The heap is structured

so that the “smallest” item is at the top.

public class KWPriorityQueue<E> extends AbstractQueue<E>

implements Queue<E> {

// Data Fields

/** The ArrayList to hold the data. */

private ArrayList<E> theData;

/** An optional reference to a Comparator object. */

Comparator<E> comparator = null;

// Methods

// Constructor

public KWPriorityQueue() {

theData = new ArrayList<>();

}

. . .

The offer Method

The offer method appends the new item to the ArrayList theData. It then moves this item up the heap until the ArrayList is restored to a heap.

/** Insert an item into the priority queue.

pre: The ArrayList theData is in heap order.

post: The item is in the priority queue and

theData is in heap order.

@param item The item to be inserted

@throws NullPointerException if the item to be inserted is null.

@Override

public boolean offer(E item) {

// Add the item to the heap.

theData.add(item);

// child is newly inserted item.

int child = theData.size() - 1;

int parent = (child - 1) / 2; // Find child's parent.

// Reheap

while (parent >= 0 && compare(theData.get(parent),

theData.get(child)) > 0) {

swap(parent, child);

child = parent;

parent = (child - 1) / 2;

}

return true;

}

The poll Method

The poll method first saves the item at the top of the heap. If there is more than one item in the heap, the method removes the last item from the heap and places it at the top. Then it moves the item at the top down the heap until the heap property is restored. Next it returns the original top of the heap.

/** Remove an item from the priority queue

pre: The ArrayList theData is in heap order.

post: Removed smallest item, theData is in heap order.

@return The item with the smallest priority value or null if empty.

@Override

public E poll() {

if (isEmpty()) {

return null;

}

// Save the top of the heap.

E result = theData.get(0);

// If only one item then remove it.

if (theData.size() == 1) {

theData.remove(0);

return result;

}

/* Remove the last item from the ArrayList and place it into

the first position. */

theData.set(0, theData.remove(theData.size() - 1));

// The parent starts at the top.

int parent = 0;

while (true) {

int leftChild = 2 * parent + 1;

if (leftChild >= theData.size()) {

break; // Out of heap.

}

int rightChild = leftChild + 1;

int minChild = leftChild; // Assume leftChild is smaller.

// See whether rightChild is smaller.

if (rightChild < theData.size()

&& compare(theData.get(leftChild),

theData.get(rightChild)) > 0) {

minChild = rightChild;

}

// assert: minChild is the index of the smaller child.

// Move smaller child up heap if necessary.

if (compare(theData.get(parent),

theData.get(minChild)) > 0) {

swap(parent, minChild);

parent = minChild;

} else { // Heap property is restored.

break;

}

return result;

}

The Other Methods

The iterator and size methods are implemented via delegation to the corresponding ArrayList methods. Method isEmpty tests whether the result of calling method size is 0 and is inherited from class AbstractCollection (a super interface to AbstractQueue). Methods peek and remove (based on poll) must also be implemented; they are left as exercises. Methods add and element are inherited from AbstractQueue where they are implemented by calling methods offer and peek, respectively.

Using a Comparator

How do we compare elements in a PriorityQueue? In many cases, we will insert objects that implement Comparable<E> and use their natural ordering as specified by method compareTo. However, we may need to insert objects that do not implement Comparable<E>, or we may want to specify a different ordering from that defined by the object's compareTo method. For example, files to be printed may be ordered by their name using the compareTo method, but we may want to assign priority based on their length. The Java API contains the Comparator<E> interface, which allows us to specify alternative ways to compare objects. An implementer of the Comparator<E> interface must define a compare method that is similar to compareTo except that it has two parameters (see Table 6.7).

To indicate that we want to use an ordering that is different from the natural ordering for the objects in our heap, we will provide a constructor that has a Comparator<E> parameter. The constructor will set data field comparator to reference this parameter. Otherwise, comparator will remain null. To match the form of this constructor in the java.util.PriorityQueue class, we provide a first parameter that specifies the initial capacity of ArrayList theData.

/** Creates a heap-based priority queue with the specified initial

capacity that orders its elements according to the specified

comparator.

@param cap The initial capacity for this priority queue

@param comp The comparator used to order this priority queue

@throws IllegalArgumentException if cap is less than 1

public KWPriorityQueue(int cap, Comparator<E> comp) {

if (cap < 1)

throw new IllegalArgumentException();

theData = new ArrayList<>();

comparator = comp;

}

The compare Method

If data field comparator references a Comparator<E> object, method compare will delegate the task of comparing its argument objects to that object's compare method. If comparator is null, the natural ordering of the objects should be used, so method compare will delegate to method compareTo. Note that parameter left is cast to type Comparable<E> in this case. In the next example, we show how to write a Comparator class.

/** Compare two items using either a Comparator object's compare method

or their natural ordering using method compareTo.

@pre: If comparator is null, left and right implement Comparable<E>.

@param left One item

@param right The other item

@return Negative int if left less than right,

0 if left equals right,

positive int if left > right

@throws ClassCastException if items are not Comparable

@SuppressWarnings("unchecked") private int compare(E left, E right) {

if (comparator != null) { // A Comparator is defined.

return comparator.compare(left, right);

} else { // Use left's compareTo method.

return ((Comparable<E>) left).compareTo(right);

}

EXAMPLE 6.13

The class PrintDocument is used to define documents to be printed on a printer. This class implements the Comparable interface, but the result of its compareTo method is based on the name of the file being printed. The class also has a getSize method that gives the number of bytes to be transmitted to the printer and a getTimeStamp method that gets the time that the print job was submitted. Instead of basing the ordering on file names, we want to order the documents by a value that is a function of both size and the waiting time of a document. If we were to use either time or size alone, small documents could be delayed while big ones are printed, or big documents would never be printed. By using a priority value that is a combination, we achieve a balanced usage of the printer.

In Java 8, the Comparator interface is also defined as a functional interface (see Table 6.2). Its abstract method compare takes two arguments of the same type and returns an integer indicating their ordering. We can pass a lambda expression as a function parameter to the constructor that creates a KWPriorityQueue object. This method will implement the abstract compare method and determine the ordering of objects in the print queue.

The compare method for printQueue specified in the following fragment uses the weighted sum of the size and time stamp for documents left and right using the weighting factors P1 and P2. Method Double.compare in this fragment compares two double values and returns a negative value, 0, or a positive value depending on whether leftValue is <, equal to, or > rightValue.

final double P1 = 0.8;

final double P2 = 0.2;

Queue<PrintDocument> printQueue =

new KWPriorityQueue<>(25, (left, right) -> {

double leftValue = P1 * left.getSize() + P2 * left.getTimeStamp();

double rightValue = P1 * right.getSize() + P2 * right.getTimeStamp();

return Double.compare(leftValue, rightValue); });

Earlier versions of Java could not implement the Comparator object by passing a lambda expression to a constructor. The textbook website discusses how to define the Comparator object before Java 8.

EXERCISES FOR SECTION 6.5

SELF-CHECK

Show the heap that would be used to store the words this, is, the, house, that, jack, built, assuming they are inserted in that sequence. Exchange the order of arrival of the first and last words and build the new heap.

Draw the heaps for Exercise 1 above as arrays.

Show the result of removing the number 18 from the heap in Figure 6.26. Show the new heap and its array representation.

The heaps in this chapter are called min heaps because the smallest key is at the top of the heap. A max heap is a heap in which each element has a key that is smaller than its parent, so the largest key is at the top of the heap. Build the max heap that would result from the numbers 15, 25, 10, 33, 55, 47, 82, 90,18 arriving in that order.

Show the printer queue after receipt of the following documents:

time stamp size

1100 256

1101 512

1102 64

1103 96

PROGRAMMING

Complete the implementation of the KWPriorityQueue class. Write method swap. Also write methods peek, remove, isEmpty, and size.

6.7 Huffman Trees

In Section 6.1, we showed the Huffman coding tree and how it can be used to decode a message. We will now implement some of the methods needed to build a tree and decode a message. We will do this using a binary tree and a PriorityQueue (which also uses a binary tree).

A straight binary coding of an alphabet assigns a unique binary number k to each symbol in the alphabet ak. An example of such a coding is Unicode, which is used by Java for the char data type. There are 65,536 possible characters, and they are assigned a number between 0 and 65,535, which is a string of 16 binary digit ones. Therefore, the length of a message would be 16 × n, where n is the total number of characters in the message. For example, the message “go eagles” contains 9 characters and would require 9 × 16 or 144 bits. As shown in the example in Section 6.1, a Huffman coding of this message requires just 38 bits.

Table 6.8, based on data published in Donald Knuth, The Art of Computer Programming, Vol 3: Sorting and Searching (Addison-Wesley, 1973), p. 441, represents the relative frequencies of the letters in English text and is the basis of the tree shown in Figure 6.35. The letter e occurs an average of 103 times every 1000 letters, or 10.3 percent of the letters are es. (This is a useful table to know if you are a fan of Wheel of Fortune.) We can use this Huffman tree to encode and decode a file of English text. However, files may contain other symbols or may contain these symbols in different frequencies from what is found in normal English. For this reason, you may want to build a custom Huffman tree based on the contents of the file you are encoding. You would from attach this tree to the encoded file so that it can be used to decode the file. We discuss how to build a Huffman tree in the next case study.

TABLE 6.8 Frequency of Letters in English Text

Symbol Frequency Symbol Frequency Symbol Frequency

⎵ 186 h 47 g 15

e 103 d 32 p 15

t 80 l 32 b 13

a 64 u 23 v 8

o 63 c 22 k 5

i 57 f 21 j 1

n 57 m 20 q 1

s 51 w 18 x 1

r 48 y 16 z 1

image

FIGURE 6.35 Huffman Tree Based on Frequency of Letters in English Text

CASE STUDY Building a Custom Huffman Tree

Problem

You want to build a custom Huffman tree for a particular file. Your input will consist of an array of objects such that each object contains a reference to a symbol occurring in that file and the frequency of occurrence (weight) for the symbol in that file.

Analysis

Each node of a Huffman tree has storage for two data items: the weight of the node and the symbol associated with that node. All symbols will be stored at leaf nodes. For nodes that are not leaf nodes, the symbol part has no meaning. The weight of a leaf node will be the frequency of the symbol stored at that node. The weight of an interior node will be the sum of frequencies of all nodes in the subtree rooted at the interior node. For example, the interior node with leaf nodes c and u (on the left of Figure 6.35) would have a weight of 45 (22 + 23).

We will use a priority queue as the key data structure in constructing the Huffman tree. We will store individual symbols and subtrees of multiple symbols in order by their priority (frequency of occurrence). We want to remove symbols that occur less frequently first because they should be lower down in the Huffman tree we are constructing. We discuss how this is done next.

To build a Huffman tree, we start by inserting references to trees with just leaf nodes in a priority queue. Each leaf node will store a symbol and its weight. The queue elements will be ordered so that the leaf node with the smallest weight (lowest frequency) is removed first. Figure 6.36 shows a priority queue, containing just the symbols a, b, c, d, and e, that uses the weights shown in Table 6.8. The item at the front of the queue stores a reference to a tree with a root node that is a leaf node containing the symbol b with a weight (frequency) of 13. To represent the tree referenced by a queue element, we list the root node information for that tree. The queue elements are shown in priority order.

image

FIGURE 6.36 Priority Queue with the Symbols a, b, c, d, and e

Now we start to build the Huffman tree. We build it from the bottom up. The first step is to remove the first two tree references from the priority queue and combine them to form a new tree. The weight of the root node for this tree will be the sum of the weights of its left and right subtrees. We insert the new tree back into the priority queue. The priority queue now contains references to four binary trees instead of five. The tree referenced by the second element of the queue has a combined weight of 35 (13 + 22) as shown on the left.

Again we remove the first two tree references and combine them. The new binary tree will have a weight of 67 in its root node. We put this tree back in the queue, and it will be referenced by the second element of the queue.

We repeat this process again. The new queue follows:

Finally, we combine the last two elements into a new tree and put a reference to it in the priority queue. Now there is only one tree in the queue, so we have finished building the Huffman tree (see Figure 6.37). Table 6.9 shows the codes for this tree.

image

FIGURE 6.37 Huffman Tree of a, b, c, d, and e

TABLE 6.9 Huffman Code for a, b, c, d, and e

Symbol Code

a 10

b 1110

c 1111

d 110

e 0

Design

The class HuffData will represent the data to be stored in each node of the Huffman binary tree. For a leaf, a HuffData object will contain the symbol and the weight.

Our class HuffmanTree will have the methods and attributes listed in Table 6.10.

TABLE 6.10 Data Fields and Methods of Class HuffmanTree

Data Field Attribute

BinaryTree<HuffData> huffTree A reference to the Huffman tree

Method Behavior

buildTree(HuffData[] input) Builds the Huffman tree using the given alphabet and weights

String decode(String message) Decodes a message using the generated Huffman tree

printCode(PrintStream out) Outputs the resulting code

Algorithm for Building a Huffman Tree

1. Construct a set of trees with root nodes that contain each of the individual symbols and their weights.

2. Place the set of trees into a priority queue.

3. while the priority queue has more than one item

4. Remove the two trees with the smallest weights.

5. Combine them into a new binary tree in which the weight of the tree root is the sum of the weights of its children.

6. Insert the newly created tree back into the priority queue.

Each time through the while loop, two nodes are removed from the priority queue and one is inserted. Thus, effectively one tree is removed, and the queue gets smaller with each pass through the loop.

Implementation

Listing 6.9 shows the data field declarations for class HuffmanTree. Method buildTree and the comparator are discussed in the next section.

LISTING 6.9

Class HuffmanTree

import java.util.*;

import java.io.*;

/** Class to represent and build a Huffman tree. */

public class HuffmanTree implements Serializable {

// Nested Classes

/** A datum in the Huffman tree. */

public static class HuffData implements Serializable {

// Data Fields

/** The weight or probability assigned to this HuffData. */

private double weight;

/** The alphabet symbol if this is a leaf. */

private char symbol;

public HuffData(double weight, Character symbol) {

this.weight = weight;

this.symbol = symbol;

}

// Data Fields

/** A reference to the completed Huffman tree. */

private BinaryTree<HuffData> huffTree;

The buildTree Method

Method buildTree (see Listing 6.10) takes an array of HuffData objects as its parameter. The statement

Queue<BinaryTree<HuffData>> theQueue

= new PriorityQueue<>(symbols.length,

(lt, rt) -> Double.compare(lt.getData().weight,

rt.getData().weight)

);

creates a new priority queue for storing BinaryTree<HuffData> objects using the PriorityQueue class in the java.util API. The constructor above takes two arguments representing the initial capacity and comparator for theQueue. The lambda expression passed as the constructor's second argument is the comparator:

(lt, rt) -> Double.compare(lt.getData().weight,

rt.getData().weight)

The enhanced for loop loads the priority queue with trees consisting just of leaf nodes. Each leaf node contains a HuffData object with the weight and alphabet symbol.

The while loop builds the tree. Each time through this loop, the trees with the smallest weights are removed and referenced by left and right. The statements

HuffData sum = new HuffData(wl + wr, '\u0000');

BinaryTree<HuffData> newTree

= new BinaryTree<>(sum, left, right);

combine them to form a new BinaryTree with a root node whose weight is the sum of the weights of its children and whose symbol is the null character '\u0000'. This new tree is then inserted into the priority queue. The number of trees in the queue decreases by 1 each time we do this. Eventually there will only be one tree in the queue, and that will be the completed Huffman tree. The last statement sets the variable huffTree to reference this tree.

LISTING 6.10

The buildTree Method (HuffmanTree.java)

/** Builds the Huffman tree using the given alphabet and weights.

post: huffTree contains a reference to the Huffman tree.

@param symbols An array of HuffData objects

public void buildTree(HuffData[] symbols) {

Queue<BinaryTree<HuffData>> theQueue

= new PriorityQueue<>(symbols.length,

(lt, rt) -> Double.compare(lt.getData().weight,

rt.getData().weight));

// Load the queue with the leaves.

for (HuffData nextSymbol : symbols) {

BinaryTree<HuffData> aBinaryTree =

new BinaryTree<>(nextSymbol, null, null);

theQueue.offer(aBinaryTree);

}

// Build the tree.

while (theQueue.size() > 1) {

BinaryTree<HuffData> left = theQueue.poll();

BinaryTree<HuffData> right = theQueue.poll();

double wl = left.getData().weight;

double wr = right.getData().weight;

HuffData sum = new HuffData(wl + wr, '\u0000');

BinaryTree newTree =

new BinaryTree<>(sum, left, right);

theQueue.offer(newTree);

}

// The queue should now contain only one item.

huffTree = theQueue.poll();

}

The textbook Web site shows how to write method buildTree and a comparator without using the new features of Java 8.

Testing

Methods printCode and decode can be used to test the custom Huffman tree. Method printCode displays the tree, so you can examine it and verify that the Huffman tree that was built is correct based on the input data.

Method decode will decode a message that has been encoded using the code stored in the Huffman tree and displayed by printCode, so you can pass it a message string that consists of binary digits only and see whether it can be transformed back to the original symbols.

We will discuss testing the Huffman tree further in the next chapter when we continue the case study.

The printCode Method

To display the code for each alphabet symbol, we perform a preorder traversal of the final tree. The code so far is passed as a parameter along with the current node. If the current node is a leaf, as indicated by the symbol not being null, then the code is output. Otherwise the left and right subtrees are traversed. When we traverse the left subtree, we append a 0 to the code, and when we traverse the right subtree, we append a 1 to the code. Recall that at each level in the recursion, there is a new copy of the parameters and local variables.

/** Outputs the resulting code.

@param out A PrintStream to write the output to

@param code The code up to this node

@param tree The current node in the tree

private void printCode(PrintStream out, String code,

BinaryTree<HuffData> tree) {

HuffData theData = tree.getData();

if (theData.symbol != '\u0000') {

if (theData.symbol.equals(" ")) {

out.println("space: " + code);

} else {

out.println(theData.symbol + ": " + code);

}

} else {

printCode(out, code + "0", tree.getLeftSubtree());

printCode(out, code + "1", tree.getRightSubtree());

}

The decode Method

To illustrate the decode process, we will show a method that takes a String that contains a sequence of the digit characters '0' and '1' and decodes it into a message that is also a String. Method decode starts by setting currentTree to the Huffman tree. It then loops through the coded message one character at a time. If the character is a '1', then currentTree is set to the right subtree; otherwise, it is set to the left subtree. If the currentTree is now a leaf, the symbol is appended to the result and currentTree is reset to the Huffman tree (see Listing 6.11). Note that this method is for testing purposes only. In actual usage, a message would be encoded as a string of bits (not digit characters) and would be decoded one bit at a time.

LISTING 6.11

The decode Method (HuffmanTree.java)

/** Method to decode a message that is input as a string of

digit characters '0' and '1'.

@param codedMessage The input message as a String of

zeros and ones.

@return The decoded message as a String

public String decode(String codedMessage) {

StringBuilder result = new StringBuilder();

BinaryTree<HuffData> currentTree = huffTree;

for (int i = 0; i < codedMessage.length(); i++) {

if (codedMessage.charAt(i) == '1') {

currentTree = currentTree.getRightSubtree();

} else {

currentTree = currentTree.getLeftSubtree();

}

if (currentTree.isLeaf()) {

HuffData theData = currentTree.getData();

result.append(theData.symbol);

currentTree = huffTree;

}

return result.toString();

}

image PROGRAM STYLE

A Generic HuffmanTree Class

We chose to implement a nongeneric HuffmanTree class to simplify the coding. However, it may be desirable to build a Huffman tree for storing Strings (e.g., to encode words in a document instead of the individual letters) or for storing groups of pixels in an image file. A generic HuffmanTree<T> class would define a generic inner class HuffData<T> where the T is the data type of data field symbol. Each parameter type <HuffData> in our class HuffmanTree would be replaced by <HuffData<T>>, which indicates that T is a type parameter for class HuffData.

EXERCISES FOR SECTION 6.6

SELF-CHECK

What is the Huffman code for the letters a, j, k, l, s, t, and v using Figure 6.35?

Trace the execution of method printCode for the Huffman tree in Figure 6.37.

Trace the execution of method decode for the Huffman tree in Figure 6.37 and the encoded message string "11101011011001111".

Create the Huffman code tree for the following frequency table. Show the different states of the priority queue as the tree is built (see Figure 6.36).

Symbol Frequency

* 50

+ 30

- 25

/ 10

% 5

What would the Huffman code look like if all symbols in the alphabet had equal frequency?

PROGRAMMING

Write a method encode for the HuffmanTree class that encodes a String of letters that is passed as its first argument. Assume that a second argument, codes (type String[]), contains the code strings (binary digits) for the symbols (space at position 0, a at position 1, b at position 2, etc.).

Chapter Review

A tree is a recursive, nonlinear data structure that is used to represent data that is organized as a hierarchy.

A binary tree is a collection of nodes with three components: a reference to a data object, a reference to a left subtree, and a reference to a right subtree. A binary tree object has a single data field, which references the root node of the tree.

In a binary tree used to represent arithmetic expressions, the root node should store the operator that is evaluated last. All interior nodes store operators, and the leaf nodes store operands. An inorder traversal (traverse left subtree, visit root, traverse right subtree) of an expression tree yields an infix expression, a preorder traversal (visit root, traverse left subtree, traverse right subtree) yields a prefix expression, and a postorder traversal (traverse left subtree, traverse right subtree, visit root) yields a postfix expression.

Java 8 lambda expressions enable a programmer to practice functional programming in Java. A lambda expression is an anonymous method with a special shorthand notation to specify its arguments and method body. A lambda interface may be assigned to a object that instantiates a functional interface. The method body of the lambda expression will implement the single abstract method of the functional interface and will execute when applied to the functional object. Java 8 provides a set of functional interfaces in library java.util.function. A lambda expression may also be passed to a method with a parameter that is a function object.

A binary search tree is a tree in which the data stored in the left subtree of every node is less than the data stored in the root node, and the data stored in the right subtree of every node is greater than the data stored in the root node. The performance depends on the fullness of the tree and can range from O(n) (for trees that resemble linked lists) to O(log n) if the tree is full. An inorder traversal visits the nodes in increasing order.

A heap is a complete binary tree in which the data in each node is less than the data in both its subtrees. A heap can be implemented very effectively as an array. The children of the node at subscript p are at subscripts 2p + 1 and 2p + 2. The parent of child c is at (c – 1) / 2. The item at the top of a heap is the smallest item.

Insertion and removal in a heap are both O(log n). For this reason, a heap can be used to efficiently implement a priority queue. A priority queue is a data structure in which the item with the highest priority (indicated by the smallest value) is removed next. The item with the highest priority is at the top of a heap and is always removed next.

A Huffman tree is a binary tree used to store a code that facilitates file compression. The length of the bit string corresponding to a symbol in the file is inversely proportional to its frequency, so the symbol with the highest frequency of occurrence has the shortest length. In building a Huffman tree, a priority queue is used to store the symbols and trees formed so far. Each step in building the Huffman tree consists of removing two items and forming a new tree with these two items as the left and right subtrees of the new tree's root node. A reference to each new tree is inserted in the priority queue.

Java API Interfaces and Classes Introduced in This Chapter

java.text.DecimalFormat java.util.PriorityQueue

java.util.Comparator java.util.TreeSet

java.util.function.BiConsumer java.util.function.BinaryOperator

java.util.function.Consumer java.util.function.Function

java.util.function.IntPredicate

User-Defined Interfaces and Classes in This Chapter

BinarySearchTree IndexGenerator

BinaryTree KWPriorityQueue

CompareHuffmanTrees Node

ComparePrintDocuments PrintDocument

HuffData PriorityQueue

HuffmanTree SearchTree

Quick-Check Exercises

For the following expression tree

image

Is the tree full? _____ Is the tree complete? _____

List the order in which the nodes would be visited in a preorder traversal.

List the order in which the nodes would be visited in an inorder traversal.

List the order in which the nodes would be visited in a postorder traversal.

Searching a full binary search tree is O(____).

A heap is a binary tree that is a (full / complete) tree.

Write a lambda expression that can be used as a comparator that compares two objects by weight Assume there is method getWeight() that returns a double value.

Show the binary search tree that would result from inserting the items 35, 20, 30, 50, 45, 60, 18, 25 in this sequence.

Show the binary search tree in Exercise 4 after 35 is removed.

Show the heap that would result from inserting the items from Exercise 4 in the order given.

Draw the heap from Exercise 6 as an array.

Show the heap in Exercise 7 after 18 is removed.

In a Huffman tree, the item with the highest frequency of occurrence will have the _____ code.

List the code for each symbol shown in the following Huffman tree.

image

Review Questions

Draw the tree that would be formed by inserting the words in this question into a binary search tree. Use lowercase letters.

Show all three traversals of this tree.

Show the tree from Question 1 after removing draw, by, and letters in that order.

Answer Question 1, but store the words in a heap instead of a binary search tree.

Write a lambda expression that can be used as a predicate that returns true if object's color is red. Assume there is a method getColor that returns the color as a string.

Given the following frequency table, construct a Huffman code tree. Show the initial priority queue and all changes in its state as the tree is constructed.

Symbol Frequency

x 34

y 28

w 20

a 10

b 8

c 5

Programming Projects

Assume that a class ExpressionTree has a data field that is a BinaryTree. Write an instance method to evaluate an expression stored in a binary tree whose nodes contain either integer values (stored in Integer objects) or operators (stored in Character objects). Your method should implement the following algorithm.

Algorithm to Evaluate an Expression Tree

1. if the root node is an Integer object

2. Return the integer value.

3. else if the root node is a Character object

4. Let leftVal be the value obtained by recursively applying this algorithm to the left subtree.

5. Let rightVal be the value obtained by recursively applying this algorithm to the right subtree.

6. Return the value obtained by applying the operator in the root node to leftVal and rightVal.

Use method readBinaryTree to read the expression tree.

Write an application to test the HuffmanTree class. Your application will need to read a text file and build a frequency table for the characters occurring in that file. Once that table is built, create a Huffman code tree and then a string consisting of '0' and '1' digit characters that represents the code string for that file. Read that string back in and re-create the contents of the original file.

Solve Programming Project 4 in Chapter 4, “Queues,” using the class PriorityQueue.

Build a generic HuffmanTree<T> class such that the symbol type T is specified when the tree is created. Test this class by using it to encode the words in your favorite nursery rhyme.

Write clone, size, and height methods for the BinaryTree class.

In a breadth-first traversal of a binary tree, the nodes are visited in an order prescribed by their level. First visit the node at level 1, the root node. Then visit the nodes at level 2, in left-to-right order, and so on. You can use a queue to implement a breadth-first traversal of a binary tree.

Algorithm for Breadth-First Traversal of a Binary Tree

1. Insert the root node in the queue.

2. while the queue is not empty

3. Remove a node from the queue and visit it.

4. Place references to its left and right subtrees in the queue.

Code this algorithm and test it on several binary trees.

Define an IndexTree class such that each node has data fields to store a word, the count of occurrences of that word in a document file, and the line number for each occurrence. Use an ArrayList to store the line numbers. Use an IndexTree object to store an index of words appearing in a text file, and then display the index by performing an inorder traversal of this tree.

Extend the BinaryTreeClass to implement the Iterable interface by providing an iterator. The iterator should access the tree elements using an inorder traversal. The iterator is implemented as a nested private class. (Note: Unlike Node, this class should not be static.)

Design hints:

You will need a stack to hold the path from the current node back to the root. You will also need a reference to the current node (current) and a variable that stores the last item returned.

To initialize current, the constructor should start at the root and follow the left links until a node is reached that does not have a left child. This node is the initial current node.

The remove method can throw an UnsupportedOperationException. The next method should use the following algorithm:

1. Save the contents of the current node.

2. If the current node has a right child

3. push the current node onto the stack

4. set the current node to the right child

5. while the current node has a left child

6. push the current node onto the stack

7. set the current node to the left child

8. else the current node does not have a right child

9. while the stack is not empty and

the top node of the stack's right child is equal to the current node

10. set the current node to the top of the stack and pop the stack

11. if the stack is empty

12. set the current node to null indicating that iteration is complete

13. else

14. set the current node to the top of the stack and pop the stack

15.. return the saved contents of the initial current node

The Morse code (see Table 6.11) is a common code that is used to encode messages consisting of letters and digits. Each letter consists of a series of dots and dashes; for example, the code for the letter a is •− and the code for the letter b is −•••. Store each letter of the alphabet in a node of a binary tree of level 5. The root node is at level 1 and stores no letter. The left node at level 2 stores the letter e (code is •), and the right node stores the letter t (code is −). The four nodes at level 3 store the letters with codes (••, •−, −•, −−). To build the tree (see Figure 6.38), read a file in which each line consists of a letter followed by its code. The letters should be ordered by tree level. To find the position for a letter in the tree, scan the code and branch left for a dot and branch right for a dash. Encode a message by replacing each letter by its code symbol. Then decode the message using the Morse code tree. Make sure you use a delimiter symbol between coded letters.

TABLE 6.11 Morse Code for Letters

a •– b –••• c −•−• d −•• e • f ••–•

g – –• h •••• i •• j •– – – k –•– l •–••

m – – n –• o – – – p •– –• q – –•– r •–•

s ••• t – u ••– v •••– w •– – x –••–

y –•– – z – –••

image

FIGURE 6.38 Morse Code Tree

Create an abstract class Heap that has two subclasses, MinHeap and MaxHeap. Each subclass should have two constructors, one that takes no parameters and the other that takes a Comparator object. In the abstract class, the compare method should be abstract, and each subclass should define its own compare method to ensure that the ordering of elements corresponds to that required by the heap. For a MinHeap, the key in each node should be greater than the key of its parent; the ordering is reversed for a MaxHeap.

A right-threaded tree is a binary search tree in which each right link that would normally be null is a “thread” that links that node to its inorder successor. The thread enables nonrecursive algorithms to be written for search and inorder traversals that are more efficient than recursive ones. Implement a RightThreadTree class as an extension of a BinarySearchTree. You will also need an RTNode that extends the Node class to include a flag that indicates whether a node's right link is a real link or a thread.

Answers to Quick-Check Exercises

a. Not full, complete

b. + * – a b c / d e

c. a – b * c + d / e

d. a b – c * d e / +

O(log n)

A heap is a binary tree that is a complete tree.

(o1, o2) -> Double.compare(o1.getWeight(), 02.getWeight())

image

8.18, 25, 20, 35, 45, 60, 30, 50, where 18 is at position 0 and 50 is at position 7.

image

In a Huffman tree, the item with the highest frequency of occurrence will have the shortest code.

Symbol Code Symbol Code

Space 01 n 1110

a 000 o 1111

e 101 r 1001

h 1000 s 1100

i 1101 t 001