The coding challenge

Before I go into too much detail about where the coding challenge came from and what the results were, I urge you most strongly to attempt it for yourself. Believe me, you won't get half as much out of the results if you haven't crashed and burned for yourself. To this end, this page does not take you to the results, analysis and discussion. To do that you have to read at least some of these pages to find the magic key, or submit your own code for testing.

The coding challenge arose from the belief that complex code simply cannot be tested thoroughly. The world of testing works very hard to ensure that both cases and code are both covered, but still there are bugs that slip through undetected, only to surface at the most inconvenient times.

Given that live code will not, generally, have been sufficiently tested, we therefore must rely on due diligence from the programmers. Most programmers will claim that when they try hard they can write good code, and the coding challenge has been designed to show them that they are probably wrong.

The coding challenge is a distillation of the problem. In it, a programmer is asked to take extreme care and write a short routine that is not to be tested at all. This is an attempt to capture code that has been written carefully, but which the full rigours of a test program has missed. The evidence of bugs in production programs shows clearly that such untested code or conditions exist, so this is a way of forcing that situation.

Is this a fair challenge? Judge for yourself. The specification is detailed, you have as much time as you like. The only restriction is that you must not run your code on a machine. Given that the routine to be coded is well-known, it seems fair that you are also asked to write the routine for yourself from scratch, and not look it up in any external source.

Specification

Overview

Write a routine to implement the binary search algorithm. Your routine will take a sorted array, a key, and return either the index of the first occurrence of the key in the array, or the value NotFound if the element is not in the array.

The algorithm

This routine is given a sorted array, possibly with repetitions, and an element to find in the array. It proceeds by comparing the key with the element in the middle of the array. Based on the result of this comparison it can reject half of the array and continue. Eventually there is only one possibility left.

Simple analysis shows that if there are n elements in the array then this routine takes O(log n) operations.

The Specification

Write a non-recursive binary search routine in C. The prototype for the routine is

int find( Object Array[],
          int n,
          Object Key,
          int (*cmp)(Object *, Object *),
          int NotFound
        );

In this prototype the arguments are as follows:

Object Array[] :
An array of elements of type Object. You know nothing more about these.

int n :
The number of elements in the array. Note: the array is indexed from 1, so that if 1 <= i <= n then element A[i] exists.

Object Key :
The Object to be found.

int (*cmp)(Object *, Object *) : See below.

int NotFound :
This is the value to be returned by the routine if the element Key is not in the array.

The function cmp passed in to routine find is your means of comparing the sizes of two things of type Object. This function is similar to the function strcmp, and has the following properties.

If p, q and r all point to things of type Object, we have the following facts:

  • cmp(p,q) takes one of the values -1, 0 or 1,
  • cmp(p,p) = 0,
  • cmp(p,q) = -cmp(q,p),
  • cmp(p,q)<=0 and cmp(q,r)<=0 together imply that cmp(p,r)<=0.

The array is sorted, so you know that for all integers i and j such that 0<i<=j<=n we have

cmp( &Array[i] , &Array[j] ) <= 0

or, equivalently,

cmp( Array+i , Array+j ) <= 0

If there is some index i such that cmp(&Array[i],&Key) returns 0 then routine find must return the smallest such index. If no such index exists then routine find must return the given value NotFound. Routine find must make no more than (int)(log(n)/log(2)+2) calls to function cmp.

Comments

Below is a linear search program demonstrating the calling convention and use of comparisons, etc. Please note that this example does not meet the specification. It is included here merely to demonstrate the usage of the parameters.

int find( Object Array[], int n, Object Key,
          int (*cmp)(Object *, Object *),
          int NotFound
        ) 
     {
       int i;
       for (i=n;i>=1;i--)
         if (cmp(&Array[i],&Key)==0) return i;
       return NotFound;
     }