A Look at the Tree Data Structure in C++

Trees can solve a wide variety of computer science problems

Published in

Better Programming

5 min readApr 12, 2021

large tree with many branches in a forest clearing — Photo by veeterzy on Unsplash

Trees are one of the most interesting data structures in computer science. Apart from having common names such as sibling, leaves, parent, child, etc., to represent them, trees can be used to solve some of the most challenging problems in computer science.

Let’s dive into the basic concept of trees, their usage in C++, and their application to some of the most important problems.

The Concept Behind Trees

Trees belong to the class of non-linear data structures. Unlike linear data structures, such as stacks or lists, where elements are stored in linear order, non-linear data structures store elements hierarchically. Thus the traversal of a tree or graph cannot be completed in a single run.

A typical example of a tree is shown below.

diagram of numbered circles connected in a tree structure — Example of a binary tree

The elements inside the circles are called the nodes of the tree. A binary tree is a data structure where every node has at most two children. The topmost node is called the root node. A node with no children is called a leaf. Here the root node is at 27. The node at 14 is the left child of node 27, and the node at 35 is the right child. Similarly, the left and right children of the parent node 14 are 10 and 19 respectively. The leaves in the above example are 10, 19, 31, and 42. The height of the tree is defined as the longest path from the root to a leaf. In the above case, the height of the tree is 2.

The advantage of trees is that memory is utilized in an efficient way. In the case of a stack, deleting a middle element is difficult since we have to shift the elements to the right of the stack. But in the case of a tree, we can delete the node x and modify its parent node to point to children of x.

Types of Trees

Full binary tree: A binary tree is a full binary tree if every node has 0 or 2 children.

Complete binary tree: A binary tree is a complete binary tree if all the levels are completely filled, except possibly the last level, and the last level has all keys as left as possible.

Perfect binary tree: A binary tree is a perfect binary tree if all the internal nodes have two children and all leaf nodes are at the same level.

Balanced binary tree: A binary tree is balanced if the height of the tree is O(Log n) where n is the number of nodes.

Binary search tree: In a binary search tree, a node’s left child must have a value less than its parent’s value and the node’s right child must have a value greater than its parent’s value.

Coding Trees

The code to represent a tree in C++ is what is shown below. It consists of a data part and references to the left and right nodes, respectively.

struct node {
   int data;   
   struct node *left; //Reference to left child
   struct node *right; //Reference to right child
};

To keep things simple, let’s consider a binary search tree (BST). The basic operations that can be performed on this tree data structure are insert, search, and traversal.

Insert an element into BST

In the example below, we insert an element with value val into the tree. If val is less than the root value, the tree traversal continues to the left since val needs to be inserted in the left subtree. If val is greater than the root value, we continue with a right subtree traversal to insert the val to the right. If val is equal to the root key, we simply return the root to avoid duplicates. If the tree is empty or the tree traversal reaches the end with no result, then we call getNewNode function to get a new node.

Tree traversal and search for an element in a BST

To search for a key in a binary search tree, we check if the root value is equal to NULL. This case happens if the tree is empty or if the tree traversal reaches the end with the key not found. If the key is not present, we return NULL.

If the root value is equal to the key value, we return the root address.

In all other cases, we perform a left tree traversal if the key value is less than the root value and a right tree traversal if the key value is greater than the root value.

Applications of Trees

Trees can be used to represent hierarchies and structural relationships in data. Also, they are efficient when it comes to insert and search with a time complexity of O(h), where h is the height of the tree. The subtrees can be moved around with minimum effort, unlike with several other data structures, like lists and arrays.

Operating systems

One of the most popular operating systems, Linux, makes use of a tree structure to represent its hierarchical file system. The file system in Linux is arranged in a tree as shown below.

diagram showing tree structure in Linux — File system in Linux

Compilers

A syntax tree is a tree representation of the abstract syntactic structure of source code written in a programming language. Each node of the tree denotes a construct occurring in the source code.

A compiler constructs a syntax tree during the syntax analysis phase of the compiler. This tree is used to check if the correct syntax is followed when coding a computer program.

Bridges and routers

A minimum weight spanning tree is a subset of the edges of a connected, edge-weighted undirected graph that connects all the vertices together, without any cycles and with the minimum possible total edge weight. This kind of tree is used for packet routing in bridges and routers in computer networks.

Trie

A trie is an efficient information retrieval data structure. Using a trie, search complexities can be brought to optimal limit (key length). If we store keys in a binary search tree, we need time proportional to M * log N, where M is maximum string length and N is the number of keys in the tree. Using a trie, we can search the key in O(M) time.

Conclusion

B-trees, also called self-balancing trees, are used for indexing in databases for efficient information retrieval.

There are several kinds of trees in computer science which can be used for multiple purposes. Trees serve as one of the most important tools that every developer needs to know. This makes problem-solving faster and easier, especially for problems dealing with hierarchy.