AVL Tree Rotations Explained

I wrote these notes while working on an assignment for Advanced Programming 3, while trying to understand how AVL trees work. At the moment, I only explain how inserts work since we’re not expected to handle deletes in the assignment, but the rotations are the same. I’ll edit later to add a clearer explanation of how to handle deletions.

An AVL tree is considered balanced if the balance factor (difference in height between the left subtree and the right subtree) of every node is -1, 0 or 1. An insertion or a deletion can change the height of a subtree, so after each such operation, an AVL tree must be checked for imbalance, and any imbalance must be corrected as part of that operation. Since this check is made after every such operation and any imbalance is corrected immediately, we only need to check for the balance factor of a node changing to 2 or -2.

Insertion

Not all insertions cause imbalance; for example:

- Before -

 B
  \
   C

C's balance factor is 0; an empty tree's height is defined to be -1, and (-1) - (-1) = 0.
B's balance factor is (-1) - 0 = -1, which is considered balanced.

- A Inserted -

   B
  / \
 A   C

A and C both have BF (-1) - (-1) = 0.
B has BF 0 - 0 = 0

But some insertions do; for example:

- Before -

 A
  \
   B

B's balance factor is 0; (-1) - (-1). A's is (-1) - 0 = -1.

- C Inserted -

 A
  \
   B
    \
     C

C's balance factor is (-1) - (-1) = 0
B's is (-1) - 0 = -1, so it's still balanced.
A's is (-1) - 1 = -2, so it's unbalanced, which means we need to rotate the tree.

The aim of the rotation is to rearrange the unbalanced subtree so that it’s balanced, while still being a valid BST. In the unbalanced example above, an arrangement that would work would be:

   B
  / \
 A   C

B’s parent after the rotation is whatever A’s was before it (in the example, it happens to be NULL). A has gone from being the parent of B to being its left child.

When and How to Rotate

One confusing thing about AVL trees is that sometimes you’ll see the balance factor defined one way around (height of right subtree minus height of left subtree) and sometimes you’ll see it defined the other way, which of course means all the rotations are ‘backwards’ if you get the two directions confused. For the avoidance of doubt, I’m defining the balance factor as the height of the right subtree minus the height of the left subtree, so here:

 A
  \
   B
    \
     C

… node A has a balance factor of 2 and is right heavy. A left heavy tree will have a negative balance factor. I find it easier to work this way around because there’s something natural about left heavy trees having a negative balance factor and right heavy trees having a positive balance factor.

A right heavy tree like this one can be corrected using a left rotation.

After an insert, we will always have just added a leaf node. Its balance factor can initially be set to zero since all leaf nodes are balanced since their subtrees are empty and have the same height (-1) by definition. After the node is inserted, we ‘climb’ towards the root through the parent pointers and update the balance factor for each node we encounter.

For the first step up, to the new node’s parent, all we need to do is check whether our new node was a left or a right child to figure out how to adjust the balance factor; if it was a left child, we subtract 1, and if it was a right child, we add one. The possible results are:

-2 Tree is left-heavy below this node
-1 Tree is still balanced below this node
0 Again, still balanced
1 Also still balanced
2 Tree is right-heavy below this node

If the tree is still balanced, we can proceed up to the parent and update the balance factor for the new node’s grandparent, based on whether we’re ascending from the parent’s left or right child, and keep repeating until we find either a node that is now the root of an unbalanced subtree, or until we reach the tree’s root.

If we get to the root of the tree, the tree as a whole is still balanced and there is no need to do any rotating this time. If we get to a node that is unbalanced, we can perform the rotation and stop climbing towards the root because whatever rotation we do, we will reduce the height of the subtree by one, so the balance factors above that point will not need to be updated.

But which rotation will we perform? Where n is the node we just ascended to, and c is the child node we climbed to it from:

    if (n->bf == 2)                     // right-heavy
        if (n->right->bf == -1)         // left-heavy*
            // double left rotation
        else
            // single left rotation
    else if (n->bf == -2)               // left-heavy
        if (n->left->bf == 1)           // right-heavy*
            // double right rotation
        else
            // single right rotation

        // *: still balanced, but erring to that side

So how do these rotations work?

Rotations

There are four types of rotation to consider; left, right, double left and double right (terminology does vary, but I’ll use these terms only to avoid confusion).

Left Rotation

If you identify an unbalanced node whose subtree is right heavy (i.e. the balance factor is 2) and the right subtree is not left heavy (i.e. the root of the unbalanced subtree’s right child has a balance factor of 0 or 1), this is the type of rotation you need. This is how the tree looks under these circumstances:

 A
  \
   B
    \
     C

The steps to follow are:

  • B becomes the root
  • A takes ownership of B’s left child as its right child (in this case, this is NULL)
  • B takes ownership of A as its left child

This leaves the tree like this:

   B
  / \
 A   C

Right Rotation

This is the mirror image of the left rotation, and is done when the unbalanced node you identify has a left heavy subtree, indicated by a balance factor of -2, and the left subtree is not right heavy (i.e. the root of the unbalanced subtree’s left child has a balance factor of 0 or -1).

     C
    /
   B
  /
 A

The steps:

  • B becomes the new root
  • C takes ownership of B’s right child, as its left child.
  • B takes ownership of C, as its right child.

This leaves the tree looking like this:

   B
  / \
 A   C

Double Left Rotation

On a right-heavy unbalanced subtree with a left-heavy right child, the tree looks like this:

 A
  \
   C
  /
 B

After a single left rotation, we have:

   C
  /
 A
  \
   B

This doesn’t actually appear to have improved matters – the tree is the same shape as (well, the mirror image of) the unbalanced subtree we started with. The fix is to do a right rotation on the right subtree (that is, what was a right subtree to start with). Looking at just the right subtree:

   C
  /
 B

After right rotation of this subtree:

 B
  \
   C

So now the whole tree looks like:

 A
  \
   B
    \
     C

This is recogniseable as the left rotation case; a left rotation will leave it balanced.

Double Right Rotation

This is the mirror image of the double left case. Here, we have a left-heavy subtree whose left child has a right-heavy subtree:

   C
  /
 A
  \
   B

The balance factor of the root node is -2. Just like the single left rotation in the previous case, a single right rotation gives us the mirror image, still just as unbalanced and with no progress made. The fix is to do a left rotation on the left subtree first, then a right rotation on the result.

The left subtree is:

 A
  \
   B

After the rotation, this looks like:

   B
  /
 A

In the context of the whole unbalanced subtree, after the left rotation on the left subtree, this looks like:

     C
    /
   B
  /
 A

Which we know we can fix with a right rotation based on one of the earlier examples:

   B
  / \
 A   C

Example Code

Here’s some basic C example code, which currently comes with a health warning since it’s untested and incomplete. I’m planning to revisit it as time allows, but this is the best place for it in the meantime:

    struct avl_node {
        avl_node *left, *right;
        char *data;
        int height;
    };

    avl_node *search(char *value, avl_node *node) {
        int r;

        return !leaf ? NULL :
            r = strcmp(value, node->data)
                ? search(
                    value, r < 0
                        ? node->left
                        : node->right
                ) : node;
    }

    int height(avl_node *node) {
        return node ? node->height : -1;
    }

    avl_node *rotate_l(avl_node *n) {
        avl_node *on = n->left;
        n->left = on->right;
        on->right = n;
        n->height = MAX(height(n->left), height(n->right)) + 1;
        on->height = MAX(height(on->left), n->height) + 1;
        return on;
    }

    avl_node *rotate_r(avl_node *n) {
        avl_node *on = n->right;
        n->right = on->left;
        on->left = n;
        n->height = MAX(height(n->left), height(n->right)) + 1;
        on->height = MAX(height(on->right), n->height) + 1;
        return on;
    }

    avl_node *rotate_ll(avl_node *n) {
        n->left = rotate_r(n->left);
        return rotate_l(n);
    }

    avl_node *rotate_rr(avl_node *n) {
        n->right = rotate_l(n->right);
        return rotate_r(n);
    }

    avl_node *insert(char *value, avl_node *n) {
        if (!n) {
            (n = malloc(sizeof(avl_node))) || return NULL;
            memset(n, 0, sizeof(avl_node));
            (n->data = strdup(value)) && return n;
            free(n->data);
            return NULL;
        } else if (strcmp(value, n->data) < 0) {
            n->left = insert(value, n->left);
            if (height(n->left) - height(n->right) == 2)
                n = strcmp(value, n->left->data) < 0
                    ? rotate_l(n)
                    : rotate_ll(n);
        } else {
            n->right = insert(value, n->right);
            if (height(n->right) - height(n->left) == 2)
                n = strcmp(value, n->right->data) > 0
                    ? rotate_r(n)
                    : rotate_rr(n);
        }
        n->height = MAX(height(n->left), height(n->right)) + 1;
        return n;
    }
 

Let me know if you see anything stupid in there! Thanks.