Find the k-closest elements to a given value in a sorted array

We are given a sorted array, value V, and a number K. The value represents the number that may appear in the array, but it’s not mandatory. We are expected to find a set of size K, that contains the elements from the array, that are the closest to the value .

The solution uses the famous binary search algorithm. Firstly we find the value V itself, or the closest to V, in the array. The latter, we obtain just as a result of a regular binary search, that returns the position of the element if it exists, or the position where it would appear. Let’s call this position P. We know that the element at position P, will be included in the result set. Now we need to consider the elements on the left and the right hand side of P. As the array is sorted, the closest element to the one on the position P, will lay on the position P-1, or P+1. We choose the one that is closer, and iteratively process till we fill the set to the expected size. An example code:

unsigned int binary_search(vector<int> array, int el){
 unsigned int begin = 0;
 unsigned int end = array.size() - 1;
 unsigned int medium;
 while(begin < end){
  medium = begin + (end - begin)/2;
  if(array < el)begin = medium + 1;
  else end = medium;
 }
 return medium;
}

vector<int> get_k_closest(vector<int> array, int el, unsigned int k){
 unsigned int split = binary_search(array, el);
 int l = split, r = split + 1;
 unsigned int counter = k;
 vector<int> result;
 while(counter){
  if(l >= 0 && r < array.size()){
   // choose the element that is closer
   if(abs(array[l] - el) < abs(array[r] - el))
    result.push_back(array[l--]);
   else result.push_back(array[r++]);
  } else {
   if(l >= 0)
    result.push_back(array[l--]);
   else if(r < array.size())
    result.push_back(array[r++]);
  }
  counter--;
 }
 return result;
}

Find the missing number of an arithmetic progression

We are given an array representing the elements of an increasing arithmetic progression. The arithmetic progression might be described by a formula a_n = a_1 + (n - 1) r , that means that n-th element of it, is a sum of the first element and (n – 1) steps r. We are expected to find the missing number in the array taking an assumption that the first and last element are never missing.

To make the problem more understandable let’s take a following example of an array: [1 3 5 7 9 11 15 17]. W can see that the a_1 = 1, and the r might be calculated by making use of the assumption that the first at the last element are never missing r = \frac{(a_n - a_1)} {(n - 1) } = (17 - 1)/8 = 2. We can just check by taking a deeper look at the table that the missing element is 14. Actually we can just scan the table from left to right and compare consecutive elements and stop when the difference between them is different than expected. It leads to O(n) solution. But it smells like we can go better.

As the progression is increasing and we know the precise formula for the element at n-th position we may apply the famous binary search. We just use the fact that if at the n-th position the element is bigger than expected, then the missing number has to be on the left side of the n-th, otherwise the missing number has to be on the right side. However before approaching the previous test there is a necessity to check the current element and the previous one, as the missing element may lay just between them.

The pseudocode could look like follows:

if array[n/2] - array[n/2 - 1] > r then
// we got a missing element that is array[n/2] - r
if array[n/2] > array[begin] + (n/2 - 1)r
// then we we search left part of the array
if array[n/2] = array[begin] + (n/2 - 1)r
// then consider right part of the array

We could code it as:

int findMissingElement(int array[], int begin, int end, int r)
{
    int m = (end - begin)/2;
    if(array[m] - array[m - 1] != r)
        return array[m] - r;
    if(array[m] > array[begin] + (m - 1) * r)
    {
        return findMissingElement(array, begin, m - 1, r);
    }
    else if(array[m] == array[begin] + (m - 1) * r)
    {
        return findMissingElement(array, m + 1, end, r);
    }
}

The complexity is obviously O(log N).

Best Regards

Joe Kidd

Check if a number is a power of 4

Input: A number N
Output: True if N is a power of 4, False otherwise
Problem: We are given a number, and we need to check if it’s a power of 4
Example: 64, result = True
Complexity: Time O(log log n), Memory 0(1)

This problem is very trivial and basically we could solve it, in a iterative way, just checking consecutives powers of four and compare them against N. But if it were that simple, it wouldn’t appear here. Moreover we have a complexity limit that is O(log log n). Checking every power of 4 would be efficient, especially if we would use bit shifting instead of normal power, but not efficient enough, as we would get O(log n) solution.

Before we will discuss the real solution let’s take a look at the binary representation of powers of 4.

4^0 = 0000001
4^1 = 0000100
4^2 = 0010000
4^3 = 1000000

As we can see the bit is set only at the odd positions and only once. How can check this condition in a effective way? We can use a powerful binary search again.

1. We divide the number bit representation into two pieces.
2. Only one piece might be not zero, if this condition is not met, the number is not a power of 4.
3. Choose a non zero part and go to step 1. as long as you find number 1, then go to step 4.
4. Check if the bit is on the odd position, if not return false, as the number is not a power of 4 otherwise go to step 5.
5. Flip the bit from the step 4, if the number euqals zero now, it is a power of 4 , otherwise it’s not.

This algorithm would give us the expected O(log log n) complexity.

There is also another solution, that is way easier to implement and works in O(1) time. As we can see from the example above, the bits are set on the odd positions. But let’s go in a opposite way and think about even positions. We know, that no even position might be set. Secondly we know that power of 4 is also a power of 2. If this condition are met, then we have a power of 4 number.


bool isPowerOf4(int N)
{
int mask = 0xAAAAAAAA; // mask with set values at the even position
return (N & mask == 0) && (N & (N - 1) == 0);
}

This solution is prettier, but I like more the one with binary search ans it shows an interesting application of this algorithm.

Best Regards

Joe Kidd

One-sided (meta) binary search

The concept described in this article changes a little bit the way we look at the binary search. I have met also a different name of this approach on topcoder pages, where it’s called meta-binary search. The second time I met this algorithm in the Skiena’s book “Algorithms Manual Design”, where one-sided binary search is used to describe it.

Honestly, I prefer the Skiena’s name, as it seems to be more accurate and describe what really happens. The classical binary search algorithm operates on a sub-range of values, like implemented on the related wiki page. As you can see, there is an upper and lower boundary provided, defining the sub-range of the array to be searched. Instead of considering the boundaries and finding the mid point, we can actually consider only the mid point – and this is what the meta binary search does. As an example is better than precept, let’s take a look at some.

Let’s say we have a sorted array and we need to find a given value. Just not to create messy examples, let’s consider the simplest one we can imagine: [1,2,3,4,5,6,7]. If we look for a number six, it will be find at the position 5 (0-indexed array).

In the first step we need to find a value of log N, that will be useful for comparing elements and choosing the proper part of the array to analyse. Of course log is 2-based. We can do it using simple iteration, like this:

int getLogOf(int N)
{
    int log2 = 0
    while((1 << log2) <= N)log2++;
    return --log2;
}

What this value really does, it is explained in the listening below, but in the “human being” words, we may define it as a distance to the next mid element.

Then we can run the search itself, using the following piece of code:

int find(int array[], int N, int K)
{
    int log2 = getLogOf(N);
    int currentPosition = 0;
    while(log2 >= 0)
    {
       // check if we got a solution
       // if yes, the return it
       if(array[curPosition] == K)
           return curPosition;
       // check the next postion
       int newPosition = curPosition + (1 << log2);
       // if it's smaller or equal then we have to
       // update the current position
       if(newPosition < N && array[newPosition] <= K)
           curPosition = newPosition;
       // decrement the skip size
       log2--;
    }
return -1;
}

So we have an array [1,2,3,4,5,6,7], and we look for 6. Because the size of the array is 7, we know that the log base 2 of 7 is 2 (2^2 = 4, but 2^3 = 8, so we take 2 as the less evil). Let’s see the iterations, the bold element indicates currently considered point:

Iter1: [1,2,3,4,5,6,7]
  log=2, currentPosition = 0, newPosition = 0 + 2 ^ 2 = 4, array[newPosition] = 5 < 6
  currentPosition = newPosition
  log--
Iter2: [1,2,3,4,5,6,7]
  log=1, currentPosition = 4, newPosition = 4 + 2 ^ 1 = 6, array[newPosition] = 7 > 6
  log--
Iter2: [1,2,3,4,5,6,7]
  log=1, currentPosition = 4, newPosition= 4 + 2 ^ 0 = 5, array[newPosition] = 6 == 6
  return 5

I hope this article makes you a little bit more familiar with the concept of one sided binary search, that is a very handy tool – check the top coder lowest common ancestor article to know why!

Best Regards

Joe Kidd

Find the only one non-repeating value in a sorted array

There is a sorted array, where every element appears in the array twice, except one, that appears once. The task is to find this element. As an example we may consider [1, 1, 2, 3, 3, 5, 5] and the element that we look for is 2. Can we achieve it in O(log n) and in place?

Let’s consider what we know already about the problem.
(1) The information, that the array is sorted, is pretty useful, as it means all the repeated elements are consecutive (i.e 1,1 or 3,3). Moreover we know that the element, that we are looking for, has to be located somewhere between those continuous 2-elements sequences (i.e 1,1, 2, 3, 3 is correct, but 1, 2, 1, 3, 3 is not).

(2) All the elements appears twice, except one: it means that the size of the array is an odd number.

(3) If there is no extra element, it should contain only elements repeated twice and it should contain each of them. I mean in this case we would have [1, 1, 3, 3], [4, 4, 5, 5] so every occurrence of 1 and 3 appears in the left part. If an extra element is introduced to this part, we would have an inbalance of the amount of occurrences of the last element (it would overlow to the right part). It would be something like [1, 1, 2, 3], [3, 4, 4, 5, 5].

As we have just identified the properties, we can see that the non repeating number is located in the odd-size part of the array. The binary search always divides the search space into two pieces. Dividing an odd size space, gives two subspaces – one of the even size, and the second one of the odd size.

Unfortunately, dividing the array into two subarrays doesn’t give as any information which half is the odd size, and which is the even size. But we can divide the array arbitrary, so that the first half is always even size. Then comparing the last element L of the left subarray to the first element R of the right subarray, is sufficient to establish, in which half the extra element is located. If L != R then we need to check the second half of the array, otherwise the left one.

On the base of the previous paragraph we can develop an algorithm described in the pseudocode below.

int findOneElement(int array[], int size)
{
 if(size == 1)
     return array[0];

 int medium = size/2;
 // make the size even number
 medium = medium % 2 == 0 ? medium : medium + 1;

 if(array == array)
 {
     // look in the left subarray
     return findOneElement(array, medium - 1);
 }
 else
 {
    // look in the right subarray
    return findOneElement(array + medium + 1, size - (medium + 1));
 }

}

The complexity is obviously O(log n) as we are using binary search. Moreover we don’t use any extra memory, so we get O(1). All the requirements are met.

Best Regards

Joe Kidd