Part 2: The Binary Search Data Structure with XPath 3.0, Deleting a node

In the first part of this post I presented a Binary Search Tree implemented in pure XPath 3.0. Some new nice features of XPath 3.0 were shown in action and other, still missing but very important features were identified and discussed.

The only operation that was missing was the deletion of a node from the tree. I show how to do this now.

This is the function that consumers will use:

(:
Delete from $pTree the node containing $pItem.
:)

declare function tree:deleteNode
         (
$pTree as (function() as item()*)*
,
         
$pItem as item
()
         )
        
as (function() as item()*)*

{
 
tree:deleteNode($pTree, $pItem, 1)
};

It simply delegates to a similar internal function that also requires a third argument – the depth of the (sub)tree on which the delete is performed. Why is the depth necessary? The answer will come later.

(:
Delete from $pTree the node containing $pItem.
$pTree has a depth of $pDepth (1 if not a subtree)
:)

declare function tree:deleteNode
         (
$pTree as (function() as item()*)*
,
         
$pItem as item
(),
         
$
pDepth as xs:integer
         )
        
as (function() as item()*)*

{
if(tree:empty($pTree))
  
then $
pTree
  
else

     let $top   := tree:top($pTree),
        
$left  := tree:left($
pTree),
        
$right := tree:right($
pTree)
     
return

         if($pItem eq $top)
          
then tree:deleteTopNode($pTree, $
pDepth)
          
else

            if($pItem lt $top)
             
then

                (
                 
function() {tree:top($pTree)},
                 
function
()
                  {
tree:deleteNode(tree:left($
pTree),
                                  
$
pItem,
                                  
$pDepth+1
)
                  },
                 
function() {tree:right($
pTree)}
                 )
             
else

                (
                 
function() {tree:top($pTree)},
                 
function() {tree:left($
pTree)},
                 
function
()
                  {
tree:deleteNode(tree:right($
pTree),
                                             
$
pItem,
                                             
$pDepth+1
)
                  }
                 )
};

We see that the deleteNode() function simply dispatches the operation to the left or right subtree and calls another function – deleteTopNode() – whenever the item to be deleted happens to be the value of the top node.

declare function tree:deleteTopNode
         (
$pTree as (function() as item()*)*
,
         
$
pDepth as xs:integer
         )
        
as (function() as item()*)*

{
 
let $left := tree:left($pTree),
     
$right := tree:right($
pTree)
   
return

      if(tree:empty($left) and tree:empty($right))
       
then tree:tree
()
       
else

         if(tree:empty($left) and not(tree:empty($right)))
          
then

             (
              
function() {tree:top($right)},
              
function() {tree:left($
right)},
              
function() {tree:right($
right)},
              
function() {tree:size($right) 1
}
              )
          
else

            if(not(tree:empty($left)) and tree:empty($right))
            
then

             (
              
function() {tree:top($left)},
              
function() {tree:left($
left)},
              
function() {tree:right($
left)},
              
function() {tree:size($left) 1
}
              )
            
else  (: both subtrees are non-empty :)

               if($pDepth mod 2 eq 0)
               
then

                  let $subtree := tree:right($pTree),
                     
$vItem := tree:top(tree:leftmost($
subtree)),
                     
$newSubtree :=

                         tree:deleteNode($subtree, $vItem, $pDepth+1)
                   
return

                      (function() {$vItem},
                      
function() {$
left},
                      
function() {$
newSubtree},
                      
function() {tree:size($pTree) 1
}
                       )
               
else

                  let $subtree := tree:left($pTree),
                     
$vItem := tree:top(tree:rightmost($
subtree)),
                     
$newSubtree:=

                         tree:deleteNode($subtree, $vItem, $pDepth+1)
                   
return

                      (function() {$vItem},
                      
function() {$
newSubtree},
                      
function() {$
right},
                      
function() {tree:size($pTree) 1
}
                       )
};

deleteTopNode() does the real deletion work. The operation is straightforward in the case when the tree is empty or either the left subtree or the right subtree is empty.

It becomes interesting when both subtrees are non-empty. In order to understand what the last 20 lines of this code mean, it is best to read a good source on the binary search tree delete operation.

In a few words:

  • If the depth is even, then we move to the top the leftmost node (without its right subtree) of the right subtree.
  • If the depth is odd, then we move to the top the rightmost node (without its left subtree) of the left subtree.

Why is this different processing done, depending on the oddness of the depth of the subtree? If we always used the leftmost node of the right subtree, then a series of deletions would create a severely imbalanced (to the right) tree. Imbalanced trees lose the advantages of the binary search tree with search and insertions becoming of linear complexity (instead of O(log(N))  ) and sorting becoming O(N^2) instead of O(N*log(N)).

Thus, in order to avoid this deteriorating effect, we need to use with a pseudo-random frequency both the left and the right subtrees. In most cases this can be achieved simply by using the oddness of the depth of the tree, whose top node is to be replaced.

Below is the complete code of the Binary Search tree. Do note that I have added a new “property” to the tree datatype – “size”. Also, note the functions tree:leftmost() and tree:rightmost() – used by tree:deleteTopNode()

Another change is that now the second argument of tree:contains()   is of type  item()?   — that is, it can be empty. By definition, any tree contains the “empty item”.

Finally, there is a new pair of functions: tree:serialize() and   tree:deserialize() . The former is a synonym of  tree:print() and the latter loads the XML representation created by  tree:serialize() into a tree.

module namespace tree = http://fxsl.sf.net/data/bintree”;

declare function tree:top
         (
$pTree as (function() as item()*)*
)
         as
item()?

{
if(empty($pTree))
  
then
()
  
else

     $pTree[1]()
};
 
 

 

declare function tree:left
         (
$pTree as (function() as item()*)*
)
        
as (function() as item()*)*

{
if(empty($pTree[2]))
then tree:tree
()
else

   $pTree[2]()
 
  
 

 

declare function tree:right
         (
$pTree as (function() as item()*)*
)
        
as (function() as item()*)*

{
if(empty($pTree[3]))
  
then tree:tree
()
  
else

     $pTree[3]()
};
  
declare function tree:size
         (
$pTree as (function() as item()*)*
)
         as xs:integer
{
if(empty($pTree[4
]))
  
then 0

   else
     $pTree[4]()
};
  
declare function tree:tree()
        
as (function() as item()*)+
{};
   (
    function() {()},   (: top() :)
    function() {()},   (: left() :)
    function() {()},   (: right() :)
    function() {0}     (: size() :)
   )
};
declare function tree:empty
         (
$pTree as (function() as item()*)*
)
         as xs:boolean
{
empty($pTree) or empty($pTree[1
]())
};

 declare function tree:contains
         (
$pTree as (function() as item()*)*,
         
$pItem as item()?
         )
         as xs:boolean
{
 
if(empty($pItem))
  
then true()
  
else
   
if(tree:empty($pTree))
     
then false()
     
else
       
let $top := tree:top($pTree)
        
return
            (
$pItem eq $top)
          
or
            (
$pItem lt $top
           
and
            
tree:contains(tree:left($pTree),$pItem)
             )
          
or
            (
$pItem gt $top
           
and
            
tree:contains(tree:right($pTree),$pItem)
             )
};

(:
Delete from $pTree the node containing $pItem.
:)
declare function tree:deleteNode
         (
$pTree as (function() as item()*)*
,
         
$pItem as item
()
         )
        
as (function() as item()*)*

{
 
tree:deleteNode($pTree, $pItem, 1)
};
declare function tree:deleteNode
         (
$pTree as (function() as item()*)*
,
         
$pItem as item
(),
         
$
pDepth as xs:integer
         )
        
as (function() as item()*)*

{
if(tree:empty($pTree))
  
then $
pTree
  
else

     let $top   := tree:top($pTree),
        
$left  := tree:left($
pTree),
        
$right := tree:right($
pTree)
     
return

         if($pItem eq $top)
          
then tree:deleteTopNode($pTree, $
pDepth)
          
else

            if($pItem lt $top)
             
then

                (
                 
function() {tree:top($pTree)},
                 
function
()
                  {
tree:deleteNode(tree:left($
pTree),
                                  
$
pItem,
                                  
$pDepth+1
)
                  },
                 
function() {tree:right($
pTree)}
                 )
             
else

                (
                 
function() {tree:top($pTree)},
                 
function() {tree:left($
pTree)},
                 
function
()
                  {
tree:deleteNode(tree:right($
pTree),
                                             
$
pItem,
                                            
$pDepth+1
)
                  }
                 )
};
 

declare function tree:deleteTopNode
         (
$pTree as (function() as item()*)*,
         
$pDepth as xs:integer
         )
        
as (function() as item()*)*
{
 
let $left := tree:left($pTree),
     
$right := tree:right($pTree)
   
return
     
if(tree:empty($left) and tree:empty($right))
       
then tree:tree()
       
else
        
if(tree:empty($left) and not(tree:empty($right)))
          
then
           
$right
          
else
           
if(not(tree:empty($left)) and tree:empty($right))
            
then
              
$left
            
else  (: both subtrees are non-empty :)
              
if($pDepth mod 2 eq 0)
               
then
                 
let $subtree := tree:right($pTree),
                     
$vItem := tree:top(tree:leftmost($subtree)),
                     
$newSubtree :=
                        
tree:deleteNode($subtree, $vItem, $pDepth+1)
                   
return
                      (
function() {$vItem},
                      
function() {$left},
                      
function() {$newSubtree},
                      
function() {tree:size($pTree) 1}
                       )
               
else
                 
let $subtree := tree:left($pTree),
                     
$vItem := tree:top(tree:rightmost($subtree)),
                     
$newSubtree:=
                        
tree:deleteNode($subtree, $vItem, $pDepth+1)
                   
return
                      (
function() {$vItem},
                      
function() {$newSubtree},
                      
function() {$right},
                      
function() {tree:size($pTree) 1}
                       )
};

 
(:
  Find/Return the leftmost node of the
  non-empty tree $pTree
:)
declare function tree:leftmost
         (
$pTree as (function() as item()*)+
)
        
as (function() as item()*)*

{
 
let $left := tree:left($pTree)
   
return

      if(tree:empty($left))
       
then $
pTree
       
else tree:leftmost($
left)
};
 

 

 
declare function tree:rightmost
         (
$pTree as (function() as item()*)+
)
        
as (function() as item()*)*

{
 
let $right := tree:right($pTree)
   
return

      if(tree:empty($right))
       
then $
pTree
       
else tree:rightmost($
right)
};
 declare function tree:insert
         (
$pTree as (function() as item()*)*
,
         
$pItem as item
()
         )
        
as (function() as item()*)+

{
  
if(tree:empty($pTree))
     
then

      (
      
function() {$pItem},        (: top()   :)
       function() {tree:tree()}, (: left()  :)
       function() {tree:tree()}, (: right() :)
       function() {1}                     (: size()  :)
       )
     
else
       if($pItem lt tree:top($pTree))
        
then

          (
          
function() {tree:top($pTree)},
          
function() {tree:insert(tree:left($pTree), $
pItem)},
          
function() {tree:right($
pTree)},
          
function() {tree:size($pTree)+1
}
           )
        
else

          if($pItem gt tree:top($pTree))
          
then

           (
           
function() {tree:top($pTree)},
           
function() {tree:left($
pTree)},
           
function() {tree:insert(tree:right($pTree), $
pItem)},
           
function() {tree:size($pTree)+1
}
           )
          
else

            $pTree
};

declare function tree:print
        (
$pTree as (function() as item()*)*
)
        as
element()?

{
  
if(not(tree:empty($pTree)))
   
then

     <treeNode>
      <value>{tree:top($pTree)}</value>
       {
       
tree:print(tree:left($pTree)),
       
tree:print(tree:right($
pTree))
       }
    
</treeNode>

    else ()
};
  

declare function tree:serialize
        (
$pTree as (function() as item()*)*)
        as
element()?
{
  
tree:print($pTree)
};

declare function tree:deserialize($pXml as element()?)
             
as (function() as item()*)*
{
 
if(empty($pXml))
 
then tree:tree()
 
else
   
let $left := tree:deserialize($pXml/treeNode[1]),
       
$right := tree:deserialize($pXml/treeNode[2])
    
return
      (
      
function() {$pXml/value/node()[1]},
      
function() {$left},
      
function() {$right},
      
function() {tree:size($left)+tree:size($right)+1}
      )
};

 

(: tree:data()
  Prints only the data — depth first.
   In effect this is sort() — quite good
   for random data.
:)
declare function tree:data
        (
$pTree as (function() as item()*)*
)
        as
item()*

{
if(not(tree:empty($pTree)))
  
then

    (
    
tree:data(tree:left($pTree)),
    
tree:top($
pTree),
    
tree:data(tree:right($
pTree))
     )
  
else
()
};

Now, let’s test just the deleteNode operation:

let $vBigTree :=
       tree:insert(tree:insert(tree:insert(tree:tree(),10),5),17),
   
$vBigTree2 :=

       tree:insert(tree:insert(tree:insert($vBigTree,3),8),14),
   
$vBigTree3 :=

       tree:insert(tree:insert(tree:insert($vBigTree2,20),6),9),
   
$vBigTree4 :=

       tree:insert(tree:insert(tree:insert($vBigTree3,12),15),7)
   
return

      (‘ tree:print($vBigTree4): ‘,
      
tree:print($
vBigTree4),
      
‘ tree:data($vBigTree4): ‘
,
      
tree:data($
vBigTree4),
      
‘ ====================================’
,
      
‘ tree:print(tree:deleteNode($vBigTree4, 7)): ‘
,
      
tree:print(tree:deleteNode($vBigTree4, 7
)),
      
‘ tree:data(tree:deleteNode($vBigTree4, 7)): ‘
,
      
tree:data(tree:deleteNode($vBigTree4, 7
)),
      
‘ ====================================’
,
      
‘ tree:print(tree:deleteNode($vBigTree4, 12)): ‘
,
      
tree:print(tree:deleteNode($vBigTree4, 12
)),
      
‘ tree:data(tree:deleteNode($vBigTree4, 12)): ‘
,
      
tree:data(tree:deleteNode($vBigTree4, 12
)),
      
‘ ====================================’
,
      
‘ tree:print(tree:deleteNode($vBigTree4, 6)): ‘
,
      
tree:print(tree:deleteNode($vBigTree4, 6
)),
      
‘ tree:data(tree:deleteNode($vBigTree4, 6)): ‘
,
      
tree:data(tree:deleteNode($vBigTree4, 6
)),
      
‘ ====================================’
,
      
‘ tree:print(tree:deleteNode($vBigTree4, 5)): ‘
,
      
tree:print(tree:deleteNode($vBigTree4, 5
)),
      
‘ tree:data(tree:deleteNode($vBigTree4, 5)): ‘
,
      
tree:data(tree:deleteNode($vBigTree4, 5
)),
      
‘ ====================================’
,
      
‘ tree:print(tree:deleteNode($vBigTree4, 10)): ‘
,
      
tree:print(tree:deleteNode($vBigTree4, 10
)),
      
‘ tree:data(tree:deleteNode($vBigTree4, 10)): ‘
,
      
tree:data(tree:deleteNode($vBigTree4, 10
))
      )

And here are the results:

tree:print($vBigTree4):

<treeNode>

      <value>10</value>

      <treeNode>

            <value>5</value>

            <treeNode>

                  <value>3</value>

            </treeNode>

            <treeNode>

                  <value>8</value>

                  <treeNode>

                        <value>6</value>

                        <treeNode>

                              <value>7</value>

                        </treeNode>

                  </treeNode>

                  <treeNode>

                        <value>9</value>

                  </treeNode>

            </treeNode>

      </treeNode>

      <treeNode>

            <value>17</value>

            <treeNode>

                  <value>14</value>

                  <treeNode>

                        <value>12</value>

                  </treeNode>

                  <treeNode>

                        <value>15</value>

                  </treeNode>

            </treeNode>

            <treeNode>

                  <value>20</value>

            </treeNode>

      </treeNode>

</treeNode>

tree:data($vBigTree4):

3 5 6 7 8 9 10 12 14 15 17 20

====================================

tree:print(tree:deleteNode($vBigTree4, 7)):

 

<treeNode>
  
<value>10</value>
  
<treeNode>
     
<value>5</value>
     
<treeNode>
        
<value>3</value>
     
</treeNode>
     
<treeNode>
        
<value>8</value>
        
<treeNode>
           
<value>6</value>
        
</treeNode>
        
<treeNode>
           
<value>9</value>
        
</treeNode>
     
</treeNode>
  
</treeNode>
  
<treeNode>
     
<value>17</value>
     
<treeNode>
        
<value>14</value>
        
<treeNode>
           
<value>12</value>
        
</treeNode>
        
<treeNode>
           
<value>15</value>
        
</treeNode>
     
</treeNode>
     
<treeNode>
        
<value>20</value>
     
</treeNode>
  
</treeNode>
</treeNode>

 

tree:data(tree:deleteNode($vBigTree4, 7)):

3 5 6 8 9 10 12 14 15 17 20

====================================

tree:print(tree:deleteNode($vBigTree4, 12)):

<treeNode>

      <value>10</value>

      <treeNode>

            <value>5</value>

            <treeNode>

                  <value>3</value>

            </treeNode>

            <treeNode>

                  <value>8</value>

                  <treeNode>

                        <value>6</value>

                        <treeNode>

                              <value>7</value>

                        </treeNode>

                  </treeNode>

                  <treeNode>

                        <value>9</value>

                  </treeNode>

            </treeNode>

      </treeNode>

      <treeNode>

            <value>17</value>

            <treeNode>

                  <value>14</value>

                  <treeNode>

                        <value>15</value>

                  </treeNode>

            </treeNode>

            <treeNode>

                  <value>20</value>

            </treeNode>

      </treeNode>

</treeNode>

tree:data(tree:deleteNode($vBigTree4, 12)):

3 5 6 7 8 9 10 14 15 17 20

====================================

tree:print(tree:deleteNode($vBigTree4, 6)):

<treeNode>

      <value>10</value>

      <treeNode>

            <value>5</value>

            <treeNode>

                  <value>3</value>

            </treeNode>

            <treeNode>

                  <value>8</value>

                  <treeNode>

                        <value>7</value>

                  </treeNode>

                  <treeNode>

                        <value>9</value>

                  </treeNode>

            </treeNode>

      </treeNode>

      <treeNode>

            <value>17</value>

            <treeNode>

                  <value>14</value>

                  <treeNode>

                        <value>12</value>

                  </treeNode>

                  <treeNode>

                        <value>15</value>

                  </treeNode>

            </treeNode>

            <treeNode>

                  <value>20</value>

            </treeNode>

      </treeNode>

</treeNode>

tree:data(tree:deleteNode($vBigTree4, 6)):

3 5 7 8 9 10 12 14 15 17 20

====================================

tree:print(tree:deleteNode($vBigTree4, 5)):

<treeNode>

      <value>10</value>

      <treeNode>

            <value>6</value>

            <treeNode>

                  <value>3</value>

            </treeNode>

            <treeNode>

                  <value>8</value>

                  <treeNode>

                        <value>7</value>

                  </treeNode>

                  <treeNode>

                        <value>9</value>

                  </treeNode>

            </treeNode>

      </treeNode>

      <treeNode>

            <value>17</value>

            <treeNode>

                  <value>14</value>

                  <treeNode>

                        <value>12</value>

                  </treeNode>

                  <treeNode>

                        <value>15</value>

                  </treeNode>

            </treeNode>

            <treeNode>

                  <value>20</value>

            </treeNode>

      </treeNode>

</treeNode>

tree:data(tree:deleteNode($vBigTree4, 5)):

3 6 7 8 9 10 12 14 15 17 20

====================================

tree:print(tree:deleteNode($vBigTree4, 10)):

<treeNode>

      <value>9</value>

      <treeNode>

            <value>5</value>

            <treeNode>

                  <value>3</value>

            </treeNode>

            <treeNode>

                  <value>8</value>

                  <treeNode>

                        <value>6</value>

                        <treeNode>

                              <value>7</value>

                        </treeNode>

                  </treeNode>

            </treeNode>

      </treeNode>

      <treeNode>

            <value>17</value>

            <treeNode>

                  <value>14</value>

                  <treeNode>

                        <value>12</value>

                  </treeNode>

                  <treeNode>

                        <value>15</value>

                  </treeNode>

            </treeNode>

            <treeNode>

                  <value>20</value>

            </treeNode>

      </treeNode>

</treeNode>

tree:data(tree:deleteNode($vBigTree4, 10)):

3 5 6 7 8 9 12 14 15 17 20

 

In my next posts I’ll show some of the most important applications of the Binary Search tree.


Advertisement
This entry was posted in functional data structures, XPath, XPath 3.0. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s