How to create a binary tree using an array in Swift - part 1
I recently started reading a book on Algorithms and Data Structures by Marcello La Rocca that I find fascinating. I'm going to take a deep dive into chapter 2 and adapt the pseudocode to implement a priority queue in Swift. This article will look at the properties of a priority queue and how this can be implemented using an array to store a binary tree structure..
These are my notes on the book
Advanced Algorithms and Data Structures - by Marcello La Rocca
May 2021, ISBN 9781617295485
Related articles on Advanced Algorithms and Data Structures:
- How to create a binary tree using an array in Swift - part 1
- How to create a binary tree using an array in Swift - part 2
I've just started reading this book in January 2024 and I think it is fantastic. Marcello has a great way of explaining the concepts and is well able to justify mathematically why a particular algorithm is going to be more performant.
The big "Ah Ha!" (or Dah ha!) moment for me was how an array could be used to represent a binary tree. This inspired me to implement this code and try out creating such a priority queue in a binary tree, using an array for the underlying structure.
This is just for fun, I don't have any affiliation with the book and I hope I have not mis-interpreted anything. These notes are for me and I hope others find them useful.
Priority Queue
A priority queue is an abstract data type referring to a group of elements organised in order of priority. A priority queue must support a number of operations and the book lists the following 5 operations.
- Top - Remove and return the top priority element from the queue
- Insert - Add a new element to the queue
- Remove - Remove an element from the queue
- Peek - Return the top priority element from the queue, without removing the element
- Update - Update the priority of an element in the queue
The priority queue needs to remain in a valid state after each of these operations.
Finally, a priority queue needs to set how to manage priorities. The easiest mechanism is to use numbers to represent the priority of an element and to decide whether higher numbers represent a higher priority or a lower priority. The priority queues represented here will use a lower number to represent a higher priority, simply because we often use the term "number one priority" to represent our most importent item.
Top and Peek
The Top and Peek operations on a priority queue are similar in that they return the
highest priority element in the queue. The difference is in that Top
removes the
element from the queue so that the next priority element becomes the top priority
element. The Peek
operation does not alter the queue, so that calling Peek
repeatedly on a priority queue will return the same element. Calling Top
repeatedly
on a priority queue will keep removing and returning the top element until the queue
is empty.
Insert
There has to be a mechanism to add new elements to a priority queue. The priority of the element to be added is unlikely to be the lowest priority and will somehow need to be inserted into the appropriate place in the queue.
Remove
There will be times when an item somewhere in the middle of the priority queue is no longer needed. In this case, it will be necessary to remove the element from the queue. As discussed more in the book, the challenge here can be in finding an element in the priority queue. The priority queue needs to be left in a valid state when the element has been removed.
Update Priority
There may be options to update elements in a priority queue. The update could involve just updating some of the contents on an element or it could involve changing the priority of an element in the queue. Similar to removing, finding the element is first required. If the priority of the element is to be changed, then elements in the queue will need to be moved to get the queue into a valid state.
Array to represent a binary tree
I probably have not thought about storing data for a binary tree much, but if I did, I would consider using some kind of linked list of nodes. Such as a note structure that could contain the node data and three optional values; a pointer to parent node; a pointer to a left child node; and a pointer to a right child node. Then that is probably where I stop thinking about it.
The book shows that a binary tree (or and n-branch tree) can be stored in an array.
In the array representation, if we are starting to count indices from 0, the children of the i-th node are stored in the positions (2 * i) + 1 and 2 * (i + 1), while the parent of node i is at index (i - 1) / 2 (except for the root, which has no parent).
Node paths
It is easier to understand the relationship between nodes and their index in the arrays using an example. Take the node containing E at index 4 in the array and binary tree above.
Parent node of E:
$$ \begin{aligned} Index \ of node \ E &= 4 \\ Parent \ node \ index \ &= (i - 1) / 2 \\ Parent \ node \ index \ &= (4 - 1) / 2 \\ Parent \ node \ index \ &= 3 / 2 \\ Parent \ node \ index \ &= 1 \\ Parent \ node \ &= Array[1] \\ Parent \ node \ &= B \end{aligned} $$
Left child node of E:
$$ \begin{aligned} Index \ of node \ E &= 4 \\ Left \ child \ node \ index \ &= (2 * i) + 1 \\ Left \ child \ node \ index \ &= (2 * 4) + 1 \\ Left \ child \ node \ index \ &= 8 + 1 \\ Left \ child \ node \ index \ &= 9 \\ Left \ child \ node \ &= Array[9] \\ Left \ child \ node \ &= J \end{aligned} $$
Right child node of E:
$$ \begin{aligned} Index \ of node \ E &= 4 \\ Right \ child \ node \ index \ &= 2 * (i + 1) \\ Right \ child \ node \ index \ &= 2 * (4 + 1) \\ Right \ child \ node \ index \ &= 2 * 5 \\ Right \ child \ node \ index \ &= 10 \\ Right \ child \ node \ &= Array[10] \\ Right \ child \ node \ &= K \end{aligned} $$
Bubble Up method
There are two internal functions that need to be implemented on the Priority Queue
that can be used by the public operations on the queue. The first of these is called
bubbleUp
in the book. bubbleUp
takes a node and the priority queue and if the
priority of the specified node is higher than that of its parent, the nodes are
swapped. The focus moves to the new parent node and repeats the comparison.
The example in the diagram starts with the addition of a node at the end of the array with a priority of 15. The Priority Queue is using lower values to represent higher priority. So this node with a priority of 15 is higher priority than its parent (priority = 70), so these values are swapped. Next the node with priority 15 is compared with its new parent (priority = 30) and once again these nodes are swapped. Finally the node with priority 15 is compared with its new parent (priority = 10) and no swap is required as the parent has a higher priority and the Priority Queue is back in a valid state.
Push Down method
The second internal function is called pushDown
and performs more or less the
reverse of the bubbleUp
method. It is used to move nodes down towards the leaves of
the Binary Tree if their priority is lower than their chid nodes.
The example here starts with the priority of a node being set to 150, which is a lower priority than both of its child nodes. This node is swapped with the first child (priority = 60). The node is still lower priority than it s child nodes so it is swapped its child node (priority = 120). The node is now a leaf node and so no further swaps are required and the Priority Queue is back in a valid state.
Conclusion
As I said at the start, these are my notes to help me learn the use of an array to manage a priority queue as a binary tree structure. If we just wanted to manage a list of tasks in a kind of TODO app, then a simple sorted list would be fine. The book is discussing when the list of priority items goes into millions or billions. If you have a TODO list of millions - you have bigger problems than how to store this in a prioritised list!
This article has outlined the required operations of a Priority Queue. I've shown how this can be represented in a binary tree and how an Array can be used to manage the queue. I've outlined two internal functions required to implement a Priority Queue and depicted how these functions work on a binary tree.
In the next article, I'll dive into Swift code to implement the Priority Queue and then I want to do some timing tests to confirm the performance characteristics.