Redblack tree  

Type  tree  
Invented  1972  
Invented by  Rudolf Bayer  
Time complexity in big O notation  

A redblack tree is a kind of selfbalancing binary search tree in computer science. Each node of the binary tree has an extra bit, and that bit is often interpreted as the color (red or black) of the node. These color bits are used to ensure the tree remains approximately balanced during insertions and deletions.^{[2]}
Balance is preserved by painting each node of the tree with one of two colors in a way that satisfies certain properties, which collectively constrain how unbalanced the tree can become in the worst case. When the tree is modified, the new tree is subsequently rearranged and repainted to restore the coloring properties. The properties are designed in such a way that this rearranging and recoloring can be performed efficiently.
The balancing of the tree is not perfect, but it is good enough to allow it to guarantee searching in O(log n) time, where n is the total number of elements in the tree. The insertion and deletion operations, along with the tree rearrangement and recoloring, are also performed in O(log n) time.^{[3]}
Tracking the color of each node requires only 1 bit of information per node because there are only two colors. The tree does not contain any other data specific to its being a redblack tree so its memory footprint is almost identical to a classic (uncolored) binary search tree. In many cases, the additional bit of information can be stored at no additional memory cost.
In 1972, Rudolf Bayer^{[4]} invented a data structure that was a special order4 case of a Btree. These trees maintained all paths from root to leaf with the same number of nodes, creating perfectly balanced trees. However, they were not binary search trees. Bayer called them a "symmetric binary Btree" in his paper and later they became popular as 234 trees or just 24 trees.^{[5]}
In a 1978 paper, "A Dichromatic Framework for Balanced Trees",^{[6]}Leonidas J. Guibas and Robert Sedgewick derived the redblack tree from the symmetric binary Btree.^{[7]} The color "red" was chosen because it was the bestlooking color produced by the color laser printer available to the authors while working at Xerox PARC.^{[8]} Another response from Guibas states that it was because of the red and black pens available to them to draw the trees.^{[9]}
In 1993, Arne Andersson introduced the idea of right leaning tree to simplify insert and delete operations.^{[10]}
In 1999, Chris Okasaki showed how to make the insert operation purely functional. Its balance function needed to take care of only 4 unbalanced cases and one default balanced case.^{[11]}
The original algorithm used 8 unbalanced cases, but Cormen et al. (2001) reduced that to 6 unbalanced cases.^{[2]} Sedgewick showed that the insert operation can be implemented in just 46 lines of Java code.^{[12]}^{[13]} In 2008, Sedgewick proposed the leftleaning redblack tree, leveraging Andersson's idea that simplified algorithms. Sedgewick originally allowed nodes whose two children are red making his trees more like 234 trees but later this restriction was added making new trees more like 23 trees. Sedgewick implemented the insert algorithm in just 33 lines, significantly shortening his original 46 lines of code.^{[14]}^{[15]}
A redblack tree is a special type of binary tree, used in computer science to organize pieces of comparable data, such as text fragments or numbers.
The leaf nodes of redblack trees do not contain data. These leaves need not be explicit in computer memorya null child pointer can encode the fact that this child is a leafbut it simplifies some algorithms for operating on redblack trees if the leaves really are explicit nodes. To save execution time, sometimes a pointer to a single sentinel node (instead of a null pointer) performs the role of all leaf nodes; all references from internal nodes to leaf nodes then point to the sentinel node.
Redblack trees, like all binary search trees, allow efficient inorder traversal (that is: in the order LeftRootRight) of their elements. The searchtime results from the traversal from root to leaf, and therefore a balanced tree of n nodes, having the least possible tree height, results in O(log n) search time.
In addition to the requirements imposed on a binary search tree the following must be satisfied by a
Some definitions: the number of black nodes from the root to a node is the node's black depth; the uniform number of black nodes in all paths from root to the leaves is called the blackheight of the redblack tree.^{[17]}
These constraints enforce a critical property of redblack trees: the path from the root to the farthest leaf is no more than twice as long as the path from the root to the nearest leaf. The result is that the tree is roughly heightbalanced. Since operations such as inserting, deleting, and finding values require worstcase time proportional to the height of the tree, this theoretical upper bound on the height allows redblack trees to be efficient in the worst case, unlike ordinary binary search trees.
To see why this is guaranteed, it suffices to consider the effect of properties 4 and 5 together. For a redblack tree T, let B be the number of black nodes in property 5. Let the shortest possible path from the root of T to any leaf consist of B black nodes. Longer possible paths may be constructed by inserting red nodes. However, property 4 makes it impossible to insert more than one consecutive red node. Therefore, ignoring any black NIL leaves, the longest possible path consists of 2*B nodes, alternating black and red (this is the worst case). Counting the black NIL leaves, the longest possible path consists of 2*B1 nodes.
The shortest possible path has all black nodes, and the longest possible path alternates between red and black nodes. Since all maximal paths have the same number of black nodes, by property 5, this shows that no path is more than twice as long as any other path.
A redblack tree is similar in structure to a Btree of order^{[note 1]} 4, where each node can contain between 1 and 3 values and (accordingly) between 2 and 4 child pointers. In such a Btree, each node will contain only one value matching the value in a black node of the redblack tree, with an optional value before and/or after it in the same node, both matching an equivalent red node of the redblack tree.
One way to see this equivalence is to "move up" the red nodes in a graphical representation of the redblack tree, so that they align horizontally with their parent black node, by creating together a horizontal cluster. In the Btree, or in the modified graphical representation of the redblack tree, all leaf nodes are at the same depth.
The redblack tree is then structurally equivalent to a Btree of order 4, with a minimum fill factor of 33% of values per cluster with a maximum capacity of 3 values.
This Btree type is still more general than a redblack tree though, as it allows ambiguity in a redblack tree conversionmultiple redblack trees can be produced from an equivalent Btree of order 4. If a Btree cluster contains only 1 value, it is the minimum, black, and has two child pointers. If a cluster contains 3 values, then the central value will be black and each value stored on its sides will be red. If the cluster contains two values, however, either one can become the black node in the redblack tree (and the other one will be red).
So the order4 Btree does not maintain which of the values contained in each cluster is the root black tree for the whole cluster and the parent of the other values in the same cluster. Despite this, the operations on redblack trees are more economical in time because you don't have to maintain the vector of values.^{[18]} It may be costly if values are stored directly in each node rather than being stored by reference. Btree nodes, however, are more economical in space because you don't need to store the color attribute for each node. Instead, you have to know which slot in the cluster vector is used. If values are stored by reference, e.g. objects, null references can be used and so the cluster can be represented by a vector containing 3 slots for value pointers plus 4 slots for child references in the tree. In that case, the Btree can be more compact in memory, improving data locality.
The same analogy can be made with Btrees with larger orders that can be structurally equivalent to a colored binary tree: you just need more colors. Suppose that you add blue, then the blueredblack tree defined like redblack trees but with the additional constraint that no two successive nodes in the hierarchy will be blue and all blue nodes will be children of a red node, then it becomes equivalent to a Btree whose clusters will have at most 7 values in the following colors: blue, red, blue, black, blue, red, blue (For each cluster, there will be at most 1 black node, 2 red nodes, and 4 blue nodes).
For moderate volumes of values, insertions and deletions in a colored binary tree are faster compared to Btrees because colored trees don't attempt to maximize the fill factor of each horizontal cluster of nodes (only the minimum fill factor is guaranteed in colored binary trees, limiting the number of splits or junctions of clusters). Btrees will be faster for performing rotations (because rotations will frequently occur within the same cluster rather than with multiple separate nodes in a colored binary tree). For storing large volumes, however, Btrees will be much faster as they will be more compact by grouping several children in the same cluster where they can be accessed locally.
All optimizations possible in Btrees to increase the average fill factors of clusters are possible in the equivalent multicolored binary tree. Notably, maximizing the average fill factor in a structurally equivalent Btree is the same as reducing the total height of the multicolored tree, by increasing the number of nonblack nodes. The worst case occurs when all nodes in a colored binary tree are black, the best case occurs when only a third of them are black (and the other two thirds are red nodes).
Redblack trees offer worstcase guarantees for insertion time, deletion time, and search time. Not only does this make them valuable in timesensitive applications such as realtime applications, but it makes them valuable building blocks in other data structures which provide worstcase guarantees; for example, many data structures used in computational geometry can be based on redblack trees, and the Completely Fair Scheduler used in current Linux kernels and epoll system call implementation^{[19]} uses redblack trees.
The AVL tree is another structure supporting O(log n) search, insertion, and removal. AVL trees can be colored redblack, thus are a subset of RB trees. Worstcase height is 0.720 times the worstcase height of RB trees, so AVL trees are more rigidly balanced. The performance measurements of Ben Pfaff with realistic test cases in 79 runs find AVL to RB ratios between 0.677 and 1.077, median at 0.947, and geometric mean 0.910.^{[20]} Kind of in between are the WAVL trees.
Redblack trees are also particularly valuable in functional programming, where they are one of the most common persistent data structures, used to construct associative arrays and sets which can retain previous versions after mutations. The persistent version of redblack trees requires O(log n) space for each insertion or deletion, in addition to time.
For every 24 tree, there are corresponding redblack trees with data elements in the same order. The insertion and deletion operations on 24 trees are also equivalent to colorflipping and rotations in redblack trees. This makes 24 trees an important tool for understanding the logic behind redblack trees, and this is why many introductory algorithm texts introduce 24 trees just before redblack trees, even though 24 trees are not often used in practice.
In 2008, Sedgewick introduced a simpler version of the redblack tree called the leftleaning redblack tree^{[21]} by eliminating a previously unspecified degree of freedom in the implementation. The LLRB maintains an additional invariant that all red links must lean left except during inserts and deletes. Redblack trees can be made isometric to either 23 trees,^{[22]} or 24 trees,^{[21]} for any sequence of operations. The 24 tree isometry was described in 1978 by Sedgewick.^{[This quote needs a citation]} With 24 trees, the isometry is resolved by a "color flip," corresponding to a split, in which the red color of two children nodes leaves the children and moves to the parent node.
The original description of the tango tree, a type of tree optimized for fast searches, specifically uses redblack trees as part of its data structure.^{[23]}
In the version 8 of Java, the Collection HashMap has been modified such that instead of using a LinkedList to store different elements with colliding hashcodes, a RedBlack tree is used. This results in the improvement of time complexity of searching such an element from O(n) to O(log n).^{[24]}
Readonly operations on a redblack tree require no modification from those used for binary search trees, because every redblack tree is a special case of a simple binary search tree. However, the immediate result of an insertion or removal may violate the properties of a redblack tree. Restoring the redblack properties requires a small number (O(log n) or amortized O(1)) of color changes (which are very quick in practice) and no more than three tree rotations (two for insertion). Although insert and delete operations are complicated, their times remain O(log n).
If the example implementation below is not suitable, there are a couple other implementations with explanations found in Ben Pfaff's annotated C library GNU libavl (currently v2.0.2) and Eternally Confuzzled's tutorial on redblack trees.
The details of the insert and removal operations will be demonstrated with example C code. The example code may call upon the helper functions below to find the parent, sibling, uncle and grandparent nodes and to rotate a node left or right:
struct node* parent(struct node* n) {
return n>parent; // NULL for root node
}
struct node* grandparent(struct node* n) {
struct node* p = parent(n);
if (p == NULL)
return NULL; // No parent means no grandparent
return parent(p); // NULL if parent is root
}
struct node* sibling(struct node* n) {
struct node* p = parent(n);
if (p == NULL)
return NULL; // No parent means no sibling
if (n == p>left)
return p>right;
else
return p>left;
}
struct node* uncle(struct node* n) {
struct node* p = parent(n);
struct node* g = grandparent(n);
if (g == NULL)
return NULL; // No grandparent means no uncle
return sibling(p);
}
void rotate_left(struct node* n) {
struct node* nnew = n>right;
struct node* p = parent(n);
assert(nnew != LEAF); // since the leaves of a redblack tree are empty, they cannot become internal nodes
n>right = nnew>left;
nnew>left = n;
n>parent = nnew;
// handle other child/parent pointers
if (n>right != NULL)
n>right>parent = n;
if (p != NULL) // initially n could be the root
{
if (n == p>left)
p>left = nnew;
else if (n == p>right) // if (...) is excessive
p>right = nnew;
}
nnew>parent = p;
}
void rotate_right(struct node* n) {
struct node* nnew = n>left;
struct node* p = parent(n);
assert(nnew != LEAF); // since the leaves of a redblack tree are empty, they cannot become internal nodes
n>left = nnew>right;
nnew>right = n;
n>parent = nnew;
// handle other child/parent pointers
if (n>left != NULL)
n>left>parent = n;
if (p != NULL) // initially n could be the root
{
if (n == p>left)
p>left = nnew;
else if (n == p>right) // if (...) is excessive
p>right = nnew;
}
nnew>parent = p;
}
Insertion begins by adding the node in a very similar manner as a standard binary search tree insertion and by coloring it red. The big difference is that in the binary search tree a new node is added as a leaf, whereas leaves contain no information in the redblack tree, so instead the new node replaces an existing leaf and then has two black leaves of its own added.
struct node *insert(struct node* root, struct node* n) {
// insert new node into the current tree
insert_recurse(root, n);
// repair the tree in case any of the redblack properties have been violated
insert_repair_tree(n);
// find the new root to return
root = n;
while (parent(root) != NULL)
root = parent(root);
return root;
}
void insert_recurse(struct node* root, struct node* n) {
// recursively descend the tree until a leaf is found
if (root != NULL && n>key < root>key) {
if (root>left != LEAF) {
insert_recurse(root>left, n);
return;
}
else
root>left = n;
} else if (root != NULL) {
if (root>right != LEAF){
insert_recurse(root>right, n);
return;
}
else
root>right = n;
}
// insert new node n
n>parent = root;
n>left = LEAF;
n>right = LEAF;
n>color = RED;
}
What happens next depends on the color of other nearby nodes. There are several cases of redblack tree insertion to handle:
void insert_repair_tree(struct node* n) {
if (parent(n) == NULL) {
insert_case1(n);
} else if (parent(n)>color == BLACK) {
insert_case2(n);
} else if (uncle(n)>color == RED) {
insert_case3(n);
} else {
insert_case4(n);
}
}
Note that:
Case 1: The current node N is at the root of the tree. In this case, it is repainted black to satisfy property 2 (the root is black). Since this adds one black node to every path at once, property 5 (all paths from any given node to its leaf nodes contain the same number of black nodes) is not violated.
void insert_case1(struct node* n)
{
if (parent(n) == NULL)
n>color = BLACK;
}
Case 2: The current node's parent P is black, so property 4 (both children of every red node are black) is not invalidated. In this case, the tree is still valid. Property 5 (all paths from any given node to its leaf nodes contain the same number of black nodes) is not threatened, because the current node N has two black leaf children, but because N is red, the paths through each of its children have the same number of black nodes as the path through the leaf it replaced, which was black, and so this property remains satisfied.
void insert_case2(struct node* n)
{
return; /* Do nothing since tree is still valid */
}
Case 3: If both the parent P and the uncle U are red, then both of them can be repainted black and the grandparent G becomes red to maintain property 5 (all paths from any given node to its leaf nodes contain the same number of black nodes). Since any path through the parent or uncle must pass through the grandparent, the number of black nodes on these paths has not changed. However, the grandparent G may now violate Property 2 (The root is black) if it is the root or Property 4 (Both children of every red node are black) if it has a red parent. To fix this, the tree's redblack repair procedure is rerun on G. Note that this is a tailrecursive call, so it could be rewritten as a loop. Since this is the only loop, and any rotations occur after this loop, this proves that a constant number of rotations occur. 
void insert_case3(struct node* n)
{
parent(n)>color = BLACK;
uncle(n)>color = BLACK;
grandparent(n)>color = RED;
insert_repair_tree(grandparent(n));
}
Case 4, step 1: The parent P is red but the uncle U is black. The ultimate goal will be to rotate the current node into the grandparent position, but this will not work if the current node is on the "inside" of the subtree under G (i.e., if N is the left child of the right child of the grandparent or the right child of the left child of the grandparent). In this case, a left rotation on P that switches the roles of the current node N and its parent P can be performed. The rotation causes some paths (those in the subtree labelled "1") to pass through the node N where they did not before. It also causes some paths (those in the subtree labelled "3") not to pass through the node P where they did before. However, both of these nodes are red, so property 5 (all paths from any given node to its leaf nodes contain the same number of black nodes) is not violated by the rotation. After this step has been completed, property 4 (both children of every red node are black) is still violated, but now we can resolve this by continuing to step 2. 
void insert_case4(struct node* n)
{
struct node* p = parent(n);
struct node* g = grandparent(n);
if (n == g>left>right) {
rotate_left(p);
n = n>left;
} else if (n == g>right>left) {
rotate_right(p);
n = n>right;
}
insert_case4step2(n);
}
Case 4, step 2: The current node N is now certain to be on the "outside" of the subtree under G (left of left child or right of right child). In this case, a right rotation on G is performed; the result is a tree where the former parent P is now the parent of both the current node N and the former grandparent G. G is known to be black, since its former child P could not have been red without violating property 4. Once the colors of P and G are switched, the resulting tree satisfies property 4 (both children of every red node are black). Property 5 (all paths from any given node to its leaf nodes contain the same number of black nodes) also remains satisfied, since all paths that went through any of these three nodes went through G before, and now they all go through P. 
void insert_case4step2(struct node* n)
{
struct node* p = parent(n);
struct node* g = grandparent(n);
if (n == p>left)
rotate_right(g);
else
rotate_left(g);
p>color = BLACK;
g>color = RED;
}
Note that inserting is actually inplace, since all the calls above use tail recursion.
In the algorithm above, all cases are called only once, except in Case 3 where it can recurse back to Case 1 with the grandparent node, which is the only case where an iterative implementation will effectively loop. Because the problem of repair in that case is escalated two levels higher each time, it takes maximally iterations to repair the tree (where h is the height of the tree). Because the probability for escalation decreases exponentially with each iteration the average insertion cost is practically constant.
In a regular binary search tree when deleting a node with two nonleaf children, we find either the maximum element in its left subtree (which is the inorder predecessor) or the minimum element in its right subtree (which is the inorder successor) and move its value into the node being deleted (as shown here). We then delete the node we copied the value from, which must have fewer than two nonleaf children. (Nonleaf children, rather than all children, are specified here because unlike normal binary search trees, redblack trees can have leaf nodes anywhere,which are actually the sentinel Nil, so that all nodes are either internal nodes with two children or leaf nodes with, by definition, zero children. In effect, internal nodes having two leaf children in a redblack tree are like the leaf nodes in a regular binary search tree.) Because merely copying a value does not violate any redblack properties, this reduces to the problem of deleting a node with at most one nonleaf child. Once we have solved that problem, the solution applies equally to the case where the node we originally want to delete has at most one nonleaf child as to the case just considered where it has two nonleaf children.
Therefore, for the remainder of this discussion we address the deletion of a node with at most one nonleaf child. We use the label M to denote the node to be deleted; C will denote a selected child of M, which we will also call "its child". If M does have a nonleaf child, call that its child, C; otherwise, choose either leaf as its child, C.
If M is a red node, we simply replace it with its child C, which must be black by property 4. (This can only occur when M has two leaf children, because if the red node M had a black nonleaf child on one side but just a leaf child on the other side, then the count of black nodes on both sides would be different, thus the tree would violate property 5.) All paths through the deleted node will simply pass through one fewer red node, and both the deleted node's parent and child must be black, so property 3 (all leaves are black) and property 4 (both children of every red node are black) still hold.
Another simple case is when M is black and C is red. Simply removing a black node could break Properties 4 ("Both children of every red node are black") and 5 ("All paths from any given node to its leaf nodes contain the same number of black nodes"), but if we repaint C black, both of these properties are preserved.
The complex case is when both M and C are black. (This can only occur when deleting a black node which has two leaf children, because if the black node M had a black nonleaf child on one side but just a leaf child on the other side, then the count of black nodes on both sides would be different, thus the tree would have been an invalid redblack tree by violation of property 5.) We begin by replacing M with its child C  we recall that in this case "its child C" is either child of M, both being leaves. We will relabel this child C (in its new position) N, and its sibling (its new parent's other child) S. (S was previously the sibling of M.) In the diagrams below, we will also use P for N's new parent (M's old parent), S_{L} for S's left child, and S_{R} for S's right child (S cannot be a leaf because if M and C were black, then P's one subtree which included M counted two blackheight and thus P's other subtree which includes S must also count two blackheight, which cannot be the case if S is a leaf node).
We can perform the steps outlined above with the following code, where the function replace_node
substitutes child
into n
's place in the tree. For convenience, code in this section will assume that null leaves are represented by actual node objects rather than NULL (the code in the Insertion section works with either representation).
void replace_node(struct node* n, struct node* child){
child>parent = n>parent;
if (n == n>parent>left)
n>parent>left = child;
else
n>parent>right = child;
}
void delete_one_child(struct node* n)
{
/*
* Precondition: n has at most one nonleaf child.
*/
struct node* child = is_leaf(n>right) ? n>left : n>right;
replace_node(n, child);
if (n>color == BLACK) {
if (child>color == RED)
child>color = BLACK;
else
delete_case1(child);
}
free(n);
}
n
in the code above) and deleting it afterwards. We do this if the parent is black (red is trivial), so it behaves in the same way as a null leaf (and is sometimes called a 'phantom' leaf). And we can safely delete it at the end as n
will remain a leaf after all operations, as shown above. In addition, the sibling tests in cases 2 and 3 require updating as it is no longer true that the sibling will have children represented as objects.If both N and its original parent are black, then deleting this original parent causes paths which proceed through N to have one fewer black node than paths that do not. As this violates property 5 (all paths from any given node to its leaf nodes contain the same number of black nodes), the tree must be rebalanced. There are several cases to consider:
Case 1: N is the new root. In this case, we are done. We removed one black node from every path, and the new root is black, so the properties are preserved.
void delete_case1(struct node* n)
{
if (n>parent != NULL)
delete_case2(n);
}
Case 2: S is red. In this case we reverse the colors of P and S, and then rotate left at P, turning S into N's grandparent. Note that P has to be black as it had a red child. The resulting subtree has a path short one black node so we are not done. Now N has a black sibling and a red parent, so we can proceed to step 4, 5, or 6. (Its new sibling is black because it was once the child of the red S.) In later cases, we will relabel N's new sibling as S. 
void delete_case2(struct node* n)
{
struct node* s = sibling(n);
if (s>color == RED) {
n>parent>color = RED;
s>color = BLACK;
if (n == n>parent>left)
rotate_left(n>parent);
else
rotate_right(n>parent);
}
delete_case3(n);
}
Case 3: P, S, and S's children are black. In this case, we simply repaint S red. The result is that all paths passing through S, which are precisely those paths not passing through N, have one less black node. Because deleting N's original parent made all paths passing through N have one less black node, this evens things up. However, all paths through P now have one fewer black node than paths that do not pass through P, so property 5 (all paths from any given node to its leaf nodes contain the same number of black nodes) is still violated. To correct this, we perform the rebalancing procedure on P, starting at case 1. 
void delete_case3(struct node* n)
{
struct node* s = sibling(n);
if ((n>parent>color == BLACK) &&
(s>color == BLACK) &&
(s>left>color == BLACK) &&
(s>right>color == BLACK)) {
s>color = RED;
delete_case1(n>parent);
} else
delete_case4(n);
}
Case 4: S and S's children are black, but P is red. In this case, we simply exchange the colors of S and P. This does not affect the number of black nodes on paths going through S, but it does add one to the number of black nodes on paths going through N, making up for the deleted black node on those paths. 
void delete_case4(struct node* n)
{
struct node* s = sibling(n);
if ((n>parent>color == RED) &&
(s>color == BLACK) &&
(s>left>color == BLACK) &&
(s>right>color == BLACK)) {
s>color = RED;
n>parent>color = BLACK;
} else
delete_case5(n);
}
Case 5: S is black, S's left child is red, S's right child is black, and N is the left child of its parent. In this case we rotate right at S, so that S's left child becomes S's parent and N's new sibling. We then exchange the colors of S and its new parent. All paths still have the same number of black nodes, but now N has a black sibling whose right child is red, so we fall into case 6. Neither N nor its parent are affected by this transformation. (Again, for case 6, we relabel N's new sibling as S.) 
void delete_case5(struct node* n)
{
struct node* s = sibling(n);
if (s>color == BLACK) { /* this if statement is trivial,
due to case 2 (even though case 2 changed the sibling to a sibling's child,
the sibling's child can't be red, since no red parent can have a red child). */
/* the following statements just force the red to be on the left of the left of the parent,
or right of the right, so case six will rotate correctly. */
if ((n == n>parent>left) &&
(s>right>color == BLACK) &&
(s>left>color == RED)) { /* this last test is trivial too due to cases 24. */
s>color = RED;
s>left>color = BLACK;
rotate_right(s);
} else if ((n == n>parent>right) &&
(s>left>color == BLACK) &&
(s>right>color == RED)) {/* this last test is trivial too due to cases 24. */
s>color = RED;
s>right>color = BLACK;
rotate_left(s);
}
}
delete_case6(n);
}
Case 6: S is black, S's right child is red, and N is the left child of its parent P. In this case we rotate left at P, so that S becomes the parent of P and S's right child. We then exchange the colors of P and S, and make S's right child black. The subtree still has the same color at its root, so Properties 4 (Both children of every red node are black) and 5 (All paths from any given node to its leaf nodes contain the same number of black nodes) are not violated. However, N now has one additional black ancestor: either P has become black, or it was black and S was added as a black grandparent. Thus, the paths passing through N pass through one additional black node. Meanwhile, if a path does not go through N, then there are two possibilities:
Either way, the number of black nodes on these paths does not change. Thus, we have restored Properties 4 (Both children of every red node are black) and 5 (All paths from any given node to its leaf nodes contain the same number of black nodes). The white node in the diagram can be either red or black, but must refer to the same color both before and after the transformation. 
void delete_case6(struct node* n)
{
struct node* s = sibling(n);
s>color = n>parent>color;
n>parent>color = BLACK;
if (n == n>parent>left) {
s>right>color = BLACK;
rotate_left(n>parent);
} else {
s>left>color = BLACK;
rotate_right(n>parent);
}
}
Again, the function calls all use tail recursion, so the algorithm is inplace.
In the algorithm above, all cases are chained in order, except in delete case 3 where it can recurse to case 1 back to the parent node: this is the only case where an iterative implementation will effectively loop. No more than h loops back to case 1 will occur (where h is the height of the tree). And because the probability for escalation decreases exponentially with each iteration the average removal cost is constant.
Additionally, no tail recursion ever occurs on a child node, so the tail recursion loop can only move from a child back to its successive ancestors. If a rotation occurs in case 2 (which is the only possibility of rotation within the loop of cases 13), then the parent of the node N becomes red after the rotation and we will exit the loop. Therefore, at most one rotation will occur within this loop. Since no more than two additional rotations will occur after exiting the loop, at most three rotations occur in total.
Mehlhorn & Sanders (2008) point out: "AVL trees do not support constant amortized deletion costs", but redblack trees do.^{[25]}
A red black tree which contains n internal nodes has a height of O(log n).
Definitions:
Lemma: A subtree rooted at node v has at least internal nodes.
Proof of Lemma (by induction height):
Basis: h(v) = 0
If v has a height of zero then it must be null, therefore bh(v) = 0. So:
Inductive Step: v such that h(v) = k, has at least internal nodes implies that such that h() = k+1 has at least internal nodes.
Since has h() > 0 it is an internal node. As such it has two children each of which have a blackheight of either bh() or bh()1 (depending on whether the child is red or black, respectively). By the inductive hypothesis each child has at least internal nodes, so has at least:
internal nodes.
Using this lemma we can now show that the height of the tree is logarithmic. Since at least half of the nodes on any path from the root to a leaf are black (property 4 of a redblack tree), the blackheight of the root is at least h(root)/2. By the lemma we get:
Therefore, the height of the root is O(log n).
In addition to the singleelement insert, delete and lookup operations, several set operations have been defined on redblack trees: union, intersection and set difference. Then fast bulk operations on insertions or deletions can be implemented based on these set functions. These set operations rely on two helper operations, Split and Join. With the new operations, the implementation of redblack trees can be more efficient and highlyparallelizable.^{[26]} This implementation allows a red root.
The join algorithm is as follows:
function joinRightRB(T_{L}, k, T_{R}) if r(T_{L})=?r(T_{L})/2?×2: return Node(T_{L},?k,red?,T_{R}) else (L',?k',c'?,R')=expose(T_{L}) T'=Node(L',?k',c'?,joinRightRB(R',k,T_{R}) if (c'=black) and (T'.right.color=T'.right.right.color=red): T'.right.right.color=black; return rotateLeft(T') else return T' function joinLeftRB(T_{L}, k, T_{R}) /* symmetric to joinRightRB */ function join(T_{L}, k, T_{R}) if ?r(T_{L})/2?>?r(T_{R})/2?×2: T'=joinRightRB(T_{L},k,T_{R}) if (T'.color=red) and (T'.right.color=red): T'.color=black return T' else if ?r(T_{L})/2?>?r(T_{L})/2?×2 /* symmetric */ else if (T_{L}.color=black) and (T_{R}=black) Node(T_{L},?k,red?,T_{R}) else Node(T_{L},?k,black?,T_{R})
Here of a node means twice the black height of a black node, and the twice the black height of a red node. expose(v)=(l,?k,c?,r) means to extract a tree node 's left child , the key of the node , the color of the node and the right child . Node(l,?k,c?,r) means to create a node of left child , key , color and right child .
The split algorithm is as follows:
function split(T,k) if (T=nil) return (nil,false,nil) (L,(m,c),R)=expose(T) if (k=m) return (L,true,R) if (k<m) (L',b,R')=split(L,k) return (L',b,join(R',m,R)) if (k>m) (L',b,R')=split(R,k) return (join(L,m,L'),b,R))
The union of two redblack trees t_{1} and t_{2} representing sets A and B, is a redblack tree t that represents A ? B. The following recursive function computes this union:
function union(t_{1}, t_{2}): if t_{1} = nil: return t_{2}if t_{2} = nil: return t_{1} t_{<}, t_{>}2 on t_{1}.root return join(t_{1}.root, union(left(t_{1}), t_{<}), union(right(t_{1}), t_{>}))
Here, Split is presumed to return two trees: one holding the keys less its input key, one holding the greater keys. (The algorithm is nondestructive, but an inplace destructive version exists as well.)
The algorithm for intersection or difference is similar, but requires the Join2 helper routine that is the same as Join but without the middle key. Based on the new functions for union, intersection or difference, either one key or multiple keys can be inserted to or deleted from the redblack tree. Since Split calls Join but does not deal with the balancing criteria of redblack trees directly, such an implementation is usually called the "joinbased" implementation.
The complexity of each of union, intersection and difference is for two redblack trees of sizes and . This complexity is optimal in terms of the number of comparisons. More importantly, since the recursive calls to union, intersection or difference are independent of each other, they can be executed in parallel with a parallel depth .^{[26]} When , the joinbased implementation has the same computational directed acyclic graph (DAG) as singleelement insertion and deletion if the root of the larger tree is used to split the smaller tree.
Parallel algorithms for constructing redblack trees from sorted lists of items can run in constant time or O(log log n) time, depending on the computer model, if the number of processors available is asymptotically proportional to the number n of items where n>?. Fast search, insertion, and deletion parallel algorithms are also known.^{[27]}
The joinbased algorithms for redblack trees are parallel for bulk operations, including union, intersection, construction, filter, mapreduce, and so on.
A redblacktree was referenced correctly in an episode of Missing (Canadian TV series)^{[28]} as noted by Robert Sedgewick in one of his lectures:^{[29]}
Jess: "It was the red door again."
Pollock: "I thought the red door was the storage container."
Jess: "But it wasn't red anymore, it was black."
Antonio: "So red turning to black means what?"
Pollock: "Budget deficits, red ink, black ink."
Antonio: "It could be from a binary search tree. The redblack tree tracks every simple path from a node to a descendant leaf that has the same number of black nodes."
Jess: "Does that help you with the ladies?"
A lot of people ask why did we use the name redblack. Well, we invented this data structure, this way of looking at balanced trees, at Xerox PARC which was the home of the personal computer and many other innovations that we live with today entering[sic] graphic user interfaces, ethernet and objectoriented programmings[sic] and many other things. But one of the things that was invented there was laser printing and we were very excited to have nearby color laser printer that could print things out in color and out of the colors the red looked the best. So, that's why we picked the color red to distinguish red links, the types of links, in three nodes. So, that's an answer to the question for people that have been asking.
Our parallel algorithm for constructing a redblack tree from a sorted list of n items runs in O(1) time with n processors on the CRCW PRAM and runs in O(log log n) time with n / log log n processors on the EREW PRAM.
So not only is there some excitement in that dialogue but it's also technically correct which you don't often find with math in popular culture of computer science. A red black tree tracks every simple path from a node to a descendant leaf with the same number of black nodes they got that right.
Manage research, learning and skills at defaultlogic.com. Create an account using LinkedIn to manage and organize your omnichannel knowledge. defaultlogic.com is like a shopping cart for information  helping you to save, discuss and share.