Find the K-closest point to the origin

There is an unsorted array containing the coordinates of points in a 2D space and a positive number k. We are ask to find the the k-closest point to the origin, it is to the point with coordinates (0, 0). In example for an array [(5,5), (-4,3), (3,2), (1,0), (6, -5)], and k = 2, the answer would be (3, 2).

The first approach, we could use, is sorting. I mean firstly storing the distance to the origin in an array by extending the point structure, and then sorting it according to the newly added field. After that, we could find the answer in the k-th cell of the array. This solution would O(n + n log n) and unfortunately it’s not optimal. There is O(n) solution.

The problem is a special case of the quickselect algorithm, that uses the same partition procedure, as quicksort. But before we can apply the quickselect approach, we need to do some extra pre-processing related to distance calculation.

Let’s suppose the structure of our Point, except coordinates x, y, contains also an extra value called distance. So we get a code like this:

struct Point
{
  double x, y;
  double distance;
};

Now, in the first step of the algorithm, we need to calculate the distance from the origin to each point contained in the array.

void setDistanceFromOrigin(Point points[], int size)
{
    for(int i = 0; i < size; i++)
        points[i].distance = sqrt(points[i].x * points[i].x +
                    points[i].y * points[i].y);
}

Then we can apply the quickselect algorithm for finding the k-smallest element, using the calculated distance for comparison. So far it sounds like quicksort. But in this case we are not going to sort the whole array, but only the part, where the element of interests lays. And actually sorting is a little bit too strong word here. We will pivot the array around the mid point so that lower elements will be located on the left, and larger on the right hand side of the pivot. Then we will consider only the subarray where the k-th element is located. However the partition looks like in the quicksort algorithm:

int partition(vector<Point> & points, int l, int r, int m)
{  
   double pivot = points[m].distance;
   swap(points[m], points[r]); m = r; r--;

   while(l < r)
   {
      if(points[l].distance >= pivot){
         swap(points[l], points[r--]);
      }
      else l++;
   }
   swap(points[l], points[m]); m = l;
   return l;
}

The difference between quickselect and quicksort lays in the calling procedure, that in case of quicksort is called for both parts of the array, but in case of quickselect only for the part were the elements of interest is located:

Point & quickselect(vector<Point> points, int l, int r, int k)
{
    int m = (l + r) / 2;
    m = partition(points, medium);
    if(m == k)
         return points[m];
    if(pivot < K) 
        return findKClosestPoint(points, m + 1, r, k);
    else 
        return findKClosestPoint(points, l, m - 1, k);
}

This algorithm has O(n) complexity and doesn’t require extra space, so it’s quite efficient.

Best Regards

Joe Kidd

Advertisements