K-way merge

There is a certain number if sorted arrays and each of them contains numbers. We want to take an element from every array in such a way, that the difference between the elements is the lowest.

For simplicity let’s say N=3 and the arrays are like:

A = {4, 10, 15, 20}
B = {1, 13, 29}
C = {5, 14, 28}

Now we need to choose one element from each of array, let’s call them a, b, c, such that |a – b| + |b – c| + |c – a| is minimal. For this case the solution is a = 15, b = 13, c = 14. But how to find it?

For solving the problem we are going to use minimal heap data structure with constant size. In our case the size of the heap is equal to number of arrays we process, so the size = 3. In the first step we take the minimal element from the array, and we store the difference of its sum. Now we are going to remove the element from the top of the heap and replace it with the next element from the array the removed element comes from. Inserting the a new element into the heap will rearrange the elements so that the minimal one will be always on the top. We store again the new difference if it’s smaller than the previous one and we repeat until one of the arrays becomes empty. Let’s take a look at an example:

H = {1, 4, 5}, diff = 8
we remove 1 and put 13 as 1 belonged to the same array
H = {4, 5, 13}, diff = 18
we remove 4 and put 10 as 4 belonged to the same array
H = {5, 10, 13}, diff = 16
we remove 5 and put 14 as 5 belonged to the same array
H = {10, 13, 14}, diff = 8
we remove 10 and we put 15 as 10 belonged to the same array
H = {13, 14, 15}, diff = 4
...

I was inspired to describe this solution, that is called k-way merged, after meeting several questions related to it on careercup – but without a very easy explanation. I hope it helps a bit. Actually it helps at least me 🙂

Best Regards

Joe Kidd

Dutch flag problem – variation

We are given an array and we need to sort it in the way, so that all negative numbers go to the beginning, all positive numbers to the end and the relative position of numbers is not changed. In example: [-1, 4, 0, -2, 2] => [-1, -2, 0, 4, 2].

Basically I have promised to myself to stop updating this blog for a while, as I am kinda busy recently, but this question surprised me and couldn’t imagine there is a such long discussion related to it.

The problem has been described at careercup.com, and I can’t believe no one has noticed such a simple solution. So basically we are given an array of numbers, where some of them are negative and some of them positive. We need to sort the array in the way, negative numbers go first, positive after, but the relative position is not changed. This means that if we have an array [-1,3,-3,2] the answer should be [-1,-3,3,2]. So what we need is a stable sort. However it would be still not enough. But what if we don’t perceive numbers as numerical values, but we assign them one of three types: positive, null, negative? Then we are done actually.

In the other words, let’s assign all the negative numbers the value -1 and all positive numbers value 1. Let 0 be 0 😉 So we may write a following implementation in c++:

bool cmp(int a, int b)
{
// consider negative numbers as the same class
if(a < 0 && b < 0) return false;
// consider positive numbers as the same class
else if(a > 0 && b > 0) return false;
// otherwise normal comparison will do. This includes
// also the case when a = 0 or b == 0 then we use the normal comparison
else return a < b
}

void special_sort(vector &amp; special_vector)
{
stable_sort(
  special_vector.begin(),
  special_vector_end(),
  cmp);
}

So what we did, we just solved this much discussed problem using nothing more than sorting and STL library to get the complexity of O(n logn) that matches the best solutions from careercup.com 🙂

Best Regards

Joe Kidd

Longest Arithmetic Progression

Input: A sorted array of numbers.
Output: The length of the longest arithmetic progression
Problem: We are given a sorted array and we need to find the length of the longest arithmetic progression..
Example: [1, 9, 10, 11, 16, 21, 28, 30, 31], answer=4.
Complexity: Time O(n^2), Memory 0(n^2)

Before reading this article, it’s strongly recommended to take a look at this one, as it describes the basic idea we gonna use in the solution.

The solution will base on a dynamic programming approach, as the problem has a property of optimal substructure – we can build arithmetic progression of length N if we already know one with lenght (N-1).

Let’s consider three indices L, M, R, L < M < R, and let’s use the property describing arithmetic sequence mentioned in the previous article. If we know three indices L,M,R that are in a arithmetic progression, and we know that M, R, Q indices are also in the aritmetic progression, then L,M,R,Q are aslo in arithmetic progression. We will use this property to create dynamic programming solution.

Another thing worth considering is that every pair of elements is a arithmetic progression already, what gives us initialization property of an array.

We will have A[L][M] array, where we will store the longest progression length, starting with L, M. We will process the progression from right to left, trying to find L and R for a fixed value of M. The code may look like this:

int longestProgression(vector<int> iProgression)
{
    int A[iProgression.size()][iProgression.size()];
    int maxLen = 2;
    // initialize last columnt values
    for(int i = 0; i < iProgression.size(); i++)
        for(int j = 0; j < iProgression.size(); j++)
            A[i][j] = 2;

    for(int M = iProgression.size() - 1; M > 0; M--)
    {
       int L = M - 1;
       int R = M + 1;
       while(L >= 0 && R < iProgression.size())
       {
           int LR = iProgression[L] + iProgression[R];
           int M2 = IProgression[M] * 2;
           if(LR > M2)
           {
               L--;
           }
           else if(LR < M2)
           {
               R++;
           }
           else
           {
               A[L][M] = A[M][R] + 1;
               maxLen = max(A[L][M], maxLen);
               L--; R++;
           }
       }
    }
    return maxLen;
}

The complexity of the solution is O(n^2) and the memory is O(n^2) too.

Find 3-elements long arithmetic progression in a sorted array

So we are given an array that is sorted in ascending order, and we are expected to find 3-elements that are arithmetic progression. As the O(n^3) solution is trivial we are expected to find a better one. Obviously the elements are spread in the array, but the order is maintained.

As an example we may consider the array [1, 9, 10, 11, 16, 21, 30] that has the arithmetic progression [1,11,13]. The trick that is going to lead to the O(n^2) solution bases on the property of an arithmetic progression, that says: 3 elements L, M, R are arithmetic progression when L < M < R. and 2 * M = L + R.

Cool! As the array is sorted L lays on the left hand side of M, and R on the right hand side. So for a fixed M, we need to find L and R meeting the condition above. We may firstly fix L = M – 1, and R = M + 1 and the iterate updating the values of L and R according to the following rule:

if array[L] + array[P] > 2 * array[M] then L--
if array[L] + array[P] < 2 * array[M] then P++
if array[L] + array[P] == 2 * array[M] then we have the answer

What leads us to a implementation of O(n^2).

bool hasArithmeticProg(vector<int> array)
{
    if(array.size() < 3)
        return false;
    
    for(int M = 1; M < array.size() - 1; M++)
    {
        int L = M - 1;
        int R = M + 1;
        while(L >= 0 && R < array.size())
        {
            if(array[L] + array[R] == 2 * array[M])
                return true;
            else if(array[L] + array[R] > 2 * array[M])
                L--;
            else R++;
        }
    }
    return false;
}

The problem is not really about finding the arithmetic progression, is rather about efficient way of finding if there are two elements in a sorted array that sum up to a given number. The arithmetic progression is just an extra thing to make it more funny.

Best Regards

Joe Kidd

Find the only one non-repeating value in a sorted array

There is a sorted array, where every element appears in the array twice, except one, that appears once. The task is to find this element. As an example we may consider [1, 1, 2, 3, 3, 5, 5] and the element that we look for is 2. Can we achieve it in O(log n) and in place?

Let’s consider what we know already about the problem.
(1) The information, that the array is sorted, is pretty useful, as it means all the repeated elements are consecutive (i.e 1,1 or 3,3). Moreover we know that the element, that we are looking for, has to be located somewhere between those continuous 2-elements sequences (i.e 1,1, 2, 3, 3 is correct, but 1, 2, 1, 3, 3 is not).

(2) All the elements appears twice, except one: it means that the size of the array is an odd number.

(3) If there is no extra element, it should contain only elements repeated twice and it should contain each of them. I mean in this case we would have [1, 1, 3, 3], [4, 4, 5, 5] so every occurrence of 1 and 3 appears in the left part. If an extra element is introduced to this part, we would have an inbalance of the amount of occurrences of the last element (it would overlow to the right part). It would be something like [1, 1, 2, 3], [3, 4, 4, 5, 5].

As we have just identified the properties, we can see that the non repeating number is located in the odd-size part of the array. The binary search always divides the search space into two pieces. Dividing an odd size space, gives two subspaces – one of the even size, and the second one of the odd size.

Unfortunately, dividing the array into two subarrays doesn’t give as any information which half is the odd size, and which is the even size. But we can divide the array arbitrary, so that the first half is always even size. Then comparing the last element L of the left subarray to the first element R of the right subarray, is sufficient to establish, in which half the extra element is located. If L != R then we need to check the second half of the array, otherwise the left one.

On the base of the previous paragraph we can develop an algorithm described in the pseudocode below.

int findOneElement(int array[], int size)
{
 if(size == 1)
     return array[0];

 int medium = size/2;
 // make the size even number
 medium = medium % 2 == 0 ? medium : medium + 1;

 if(array == array)
 {
     // look in the left subarray
     return findOneElement(array, medium - 1);
 }
 else
 {
    // look in the right subarray
    return findOneElement(array + medium + 1, size - (medium + 1));
 }

}

The complexity is obviously O(log n) as we are using binary search. Moreover we don’t use any extra memory, so we get O(1). All the requirements are met.

Best Regards

Joe Kidd