The table at the end of this page compares the performance of the original reiser4 key comparison functions (and znode_contains_key_strict) against the new versions. The numbers were compiled from five runs; each run's number is the result of exercising the function 100,000 times, and the five run results are averaged.

Here is the code: keytest.zip.

What Was Changed

Key Comparison Functions
All reiser4 key comparisons are built upon the keycmp function, which returns -1 (less than), 0 (equal to) or 1 (greater than).
Here is the function as currently written:

/*
   compare `k1' and `k2'. This function is a heart of "key allocation policy".
   All you need to implement new policy is to add yet another clause here.
*/
static inline cmp_t
keycmp( const reiser4_key * k1 /* first key to compare */ ,
        const reiser4_key * k2 /* second key to compare */ )
{
   cmp_t result;

   /*
      This function is the heart of reiser4 tree-routines. Key comparison
      is among most heavily used operations in the file system.
   */

   /* there is no actual branch here: condition is compile time constant
      and constant folding and propagation ensures that only one branch
      is actually compiled in. */
   if (REISER4_PLANA_KEY_ALLOCATION) {
      /* if physical order of fields in a key is identical with logical
         order, we can implement key comparison as three 64bit comparisons. */
      /* logical order of fields in plan-a:
         locality->type->objectid->offset. */
      /* compare locality and type at once */
      result = KEY_DIFF_EL(k1, k2, 0);
      if (result == EQUAL_TO) {
         /* compare objectid (and band if it's there) */
         result = KEY_DIFF_EL(k1, k2, 1);
         /* compare offset */
         if (result == EQUAL_TO) {
            result = KEY_DIFF_EL(k1, k2, 2);
            if (REISER4_LARGE_KEY && result == EQUAL_TO) {
               result = KEY_DIFF_EL(k1, k2, 3);
            }
         }
      }
   }
   return result;
}

The macro KEY_DIFF_EL is defined as:

/* helper macro for keycmp() */
#define KEY_DIFF_EL(k1, k2, off)                \
({                                              \
   __u64 e1;                                    \
   __u64 e2;                                    \
                                                \
   e1 = get_key_el(k1, off);                    \
   e2 = get_key_el(k2, off);                    \
                                                \
   (e1 < e2) ? LESS_THAN : ((e1 == e2) ? EQUAL_TO : GREATER_THAN); \
})

static inline __u64
get_key_el(const reiser4_key * key, reiser4_key_field_index off)
{
   return d64tocpu(&key->el[off]);
}

So, under a little-endian architecture, the above code would expand to something like:

result = (k1.el[0].datum < k2.el[0].datum) ? LESS_THAN :
         (k1.el[0].datum == k2.el[0].datum) ? EQUAL_TO : GREATER_THAN;
if (result == EQUAL_TO) {
   result = (k1.el[1].datum < k2.el[1].datum) ? LESS_THAN :
            (k1.el[1].datum == k2.el[1].datum) ? EQUAL_TO : GREATER_THAN;
   if (result == EQUAL_TO)
      /* test next key element... */
}

return result;

Note that functions built upon keycmp, keylt for instance, simply test the cmp_t result from keycmp:

static inline int
keylt( const reiser4_key * k1, const reiser4_key * k2 )
{
   return (keycmp(k1, k2) == LESS_THAN);
}


The new code eliminates conditionals in keycmp as well as in the functions that wrap it.
Here are the new keycmp and keylt functions:

/* return type from keycmp */
typedef enum { LESS_THAN_NEW=0, GREATER_THAN_NEW=1, EQUAL_TO_NEW=2 } cmp_t_new;

static inline cmp_t_new
keycmp_new( const reiser4_key * k1 /* first key to compare */ ,
            const reiser4_key * k2 /* second key to compare */ )
{
   __u64 v1, v2;
  
   /*
      Calling this version with
      k1[0] < k2[0], will make 1 64-bit compare and one branch.
      k1[0] = k2[0], will make 1 64-bit compare and one branch (and so on)
      k1[0] > k2[0], will make 1 64-bit compare and one branch.
     
      Calling the original function with
      k1[0] < k2[0], will make 1 64-bit compare, 1 int compare and two branches.
      k1[0] = k2[0], will make 1 64-bit compare, 1 int compare and three branches (and so on)
      k1[0] > k2[0], will make 1 64-bit compare, 1 int compare and three branches.

      per key element compared.

   */
   if ((v1 = get_key_el(k1, 0)) != (v2 = get_key_el(k2, 0)))
      return (v1 > v2);
   if ((v1 = get_key_el(k1, 1)) != (v2 = get_key_el(k2, 1)))
      return (v1 > v2);
   if ((v1 = get_key_el(k1, 2)) != (v2 = get_key_el(k2, 2)))
      return (v1 > v2);
   if ((v1 = get_key_el(k1, 3)) != (v2 = get_key_el(k2, 3)))
      return (v1 > v2);
  
   return EQUAL_TO_NEW;
}



/*
   keylt

   Calling this version with
   k1[0] < k2[0], will make 1 64-bit compare and one branch.
   k1[0] = k2[0], will make 1 64-bit compare and one branch.
   k1[0] > k2[0], will make 1 64-bit compare and one branch.
     
   Calling the original function
   k1[0] < k2[0], will make 1 64-bit compare, 2 int compares and two branches.
   k1[0] = k2[0], will make 1 64-bit compare, 2 int compares and three branches.
   k1[0] > k2[0], will make 1 64-bit compare, 2 int compares and three branches.
 
*/
static inline int
keylt_new( const reiser4_key * k1 /* first key to compare */ , 
           const reiser4_key * k2 /* second key to compare */ )
{
   __u64 v1, v2;

   if ((v1 = get_key_el(k1, 0)) != (v2 = get_key_el(k2, 0)))
      return (v1 < v2);
   if ((v1 = get_key_el(k1, 1)) != (v2 = get_key_el(k2, 1)))
      return (v1 < v2);
   if ((v1 = get_key_el(k1, 2)) != (v2 = get_key_el(k2, 2)))
      return (v1 < v2);

   return (get_key_el(k1, 3) < get_key_el(k2, 3));
}

The cmp_t enum is redefined because, when keycmp_new encounters a non-equal key element, it returns the result of a > comparison. This makes LESS_THAN and GREATER_THAN 0 and 1 (the two possible results of a boolean comparison) respectively, with EQUAL_TO defined as 2. The other comparison functions are no longer defined "on top of" keycmp. They each use the comparison for which they are named.

Here is a comparison of the assembler instructions for keycmp vs the new version.


znode_contains_key_strict
This function is currently built with calls to keyge and keycmp. The new, purpose-built version uses the new keyge and an inline version of the new keylt code in order to accommodate the isunique parameter. Here are the function definitions:

/*
   true if @key is strictly within @node. We are looking for a
   possibly non-unique key, and its item may be at the edge of
   @node. Maybe it is in the neighbor.
*/
static int
znode_contains_key_strict( znode * node /* node to check key against */ ,
                           const reiser4_key * key /* key to check */,
                           int isunique)
{
   int answer;

   if (keyge(key, &node->rd_key))
      return 0;

   answer = keycmp(&node->ld_key, key);
  
   if (isunique)
      return answer != GREATER_THAN;
   else
      return answer == LESS_THAN;
}


/* new version */
static int
znode_contains_key_strict_new( znode* node            /* node to check key against */,
                               const reiser4_key * sk /* key to check */, 
                               const int isunique )
{
   __u64 v1, v2;

   if (keyge_new(sk, &node->rd_key))
      return FALSE;

   do {
      if ((v1 = get_key_el(&node->ld_key, 0)) != (v2 = get_key_el(sk, 0)))
         break;
      if ((v1 = get_key_el(&node->ld_key, 1)) != (v2 = get_key_el(sk, 1)))
         break;
      if ((v1 = get_key_el(&node->ld_key, 2)) != (v2 = get_key_el(sk, 2)))
         break;

      v1 = get_key_el(&node->ld_key, 3);
      v2 = get_key_el(sk, 3);

   } while (0);

   /*
      the 'trick' with the isunique addition removes a branch.
      isunique's initializer must be changed to 0 or 1, i.e. the
      boolean result of testing (h->flags & CBK_UNIQUE), rather
      than the raw flag value.
   */
   return (v1 < (v2 + isunique));

}  

Further znode_contains_key_strict Improvements.
There is a second version that leverages an as-yet-unimplemented znode structure member: delim_key_diff_el. This unsigned char holds the index of the key element at which the ld_key first becomes smaller than the rd_key. With this "hint," znode_contains_key_strict needs only one 64-bit key element comparison between the search key and the ld_key when the search key differs from the rd_key at the delim_key_diff_el element. Otherwise, the result is less than or greater than depending upon whether the index of the differing element is less than or greater than delim_key_diff_el. Feedback is welcome as to whether the apparent cache-scanning performance increase justifies the storage and maintenance costs of the new znode member.

static int
znode_contains_key_strict_hint( znode* node /* node to check key against */,
                                const reiser4_key * sk /* key to check */,
                                const int isunique )
{
   __u64 v1, v2;
   unsigned char el=0;

   do {
      if ((v1=get_key_el(&node->rd_key,el)) != (v2=get_key_el(sk,el)))
         break;

      /* 
         could be explicit el=1 etc.,
         but gcc was smart enough to detect this.
      */

      el++;   
      if ((v1=get_key_el(&node->rd_key,el)) != (v2=get_key_el(sk,el)))
         break; 
      
      el++;
      if ((v1=get_key_el(&node->rd_key,el)) != (v2=get_key_el(sk,el)))
         break;
     
      el++;
      v1 = get_key_el(&node->rd_key,el);
      v2 = get_key_el(sk,el);

   } while(0); 

   /*
     this conditional and the next were originally in one
     || test, but the performance was not what I expected. I
     looked at the assembler and gcc decided to test the 'easier'
     char comparison first, but this is not what the logic requires...
   */
   if (v2 >= v1)
      return FALSE;

   /*
      the addition with isunique bypasses the extra conditional in the
      original code and appears to give correct results. isunique should 
      be changed to 0 or 1 instead of the result of (h->flags & CBK_UNIQUE).

      there's an extra bonus in this purpose-built function for
      big endian systems. v2 contains the swab()ed value from the 
      last comparison with the rd_key, thus saving swab() call(s) in
      addition to the comparison(s).
   */
   if(el == node->delim_key_diff_el)
      return get_key_el(&node->ld_key, el) < (v2+isunique);

   return (el > node->delim_key_diff_el);

}

Impact to the System

Caveats


Test Notes

Further Opportunities

Search For          Function                        Data              Total    Orig/New
------------------  ------------------------------  ----------------  -------  --------
cache[15].ld_key+1  znode_contains_key_strict       Average CPU time  0.27960
                                                    Min CPU Time      0.27840
                                                    Max CPU Time      0.28040
                    znode_contains_key_strict_hint  Average CPU time  0.05816  4.8074
                                                    Min CPU Time      0.05800
                                                    Max CPU Time      0.05820
                    znode_contains_key_strict_new   Average CPU time  0.09444  2.9606
                                                    Min CPU Time      0.09400
                                                    Max CPU Time      0.09620
MAXIMAL_KEY         znode_contains_key_strict       Average CPU time  0.14140
                                                    Min CPU Time      0.14020
                                                    Max CPU Time      0.14420
                    znode_contains_key_strict_hint  Average CPU time  0.06620  2.1360
                                                    Min CPU Time      0.06420
                                                    Max CPU Time      0.07020
                    znode_contains_key_strict_new   Average CPU time  0.03720  3.8011
                                                    Min CPU Time      0.03600
                                                    Max CPU Time      0.03800
MINIMAL_KEY         znode_contains_key_strict       Average CPU time  0.09720
                                                    Min CPU Time      0.09600
                                                    Max CPU Time      0.09800
                    znode_contains_key_strict_hint  Average CPU time  0.02560  3.7969
                                                    Min CPU Time      0.02400
                                                    Max CPU Time      0.02600
                    znode_contains_key_strict_new   Average CPU time  0.03420  2.8421
                                                    Min CPU Time      0.03420
                                                    Max CPU Time      0.03420
n/a                 keycmp                          Average CPU time  0.13660
                                                    Min CPU Time      0.13620
                                                    Max CPU Time      0.13820
                    keycmp_new                      Average CPU time  0.06660  2.0511
                                                    Min CPU Time      0.06620
                                                    Max CPU Time      0.06820
                    keyeq                           Average CPU time  0.21880
                                                    Min CPU Time      0.21640
                                                    Max CPU Time      0.22840
                    keyeq_new                       Average CPU time  0.04720  4.6356
                                                    Min CPU Time      0.04600
                                                    Max CPU Time      0.04800
                    keyge                           Average CPU time  0.13500
                                                    Min CPU Time      0.13420
                                                    Max CPU Time      0.13620
                    keyge_new                       Average CPU time  0.05840  2.3116
                                                    Min CPU Time      0.05800
                                                    Max CPU Time      0.06000
                    keygt                           Average CPU time  0.13460
                                                    Min CPU Time      0.13220
                                                    Max CPU Time      0.13620
                    keygt_new                       Average CPU time  0.05400  2.4926
                                                    Min CPU Time      0.05400
                                                    Max CPU Time      0.05400
                    keyle                           Average CPU time  0.14944
                                                    Min CPU Time      0.14820
                                                    Max CPU Time      0.15440
                    keyle_new                       Average CPU time  0.04256  3.5113
                                                    Min CPU Time      0.04020
                                                    Max CPU Time      0.04420
                    keylt                           Average CPU time  0.14820
                                                    Min CPU Time      0.14620
                                                    Max CPU Time      0.15020
                    keylt_new                       Average CPU time  0.03840  3.8594
                                                    Min CPU Time      0.03800
                                                    Max CPU Time      0.04000


(C) 2004 David Dabbs david at dabbs dot net, in accordance with reiser4/README.