# Array pivoting

In the quicksort algorithm the most important part is the partition step. Its implementation is actually not difficult; however, some implementations store data on both sides of the array instead of just one, which leads to more complex and error-prone algorithms. In this article I would like to show that for an n-way partition we need only n − 1 storage indexes.

In pivot partitioning we simply move the elements smaller than the pivot to the beginning of the array. We can imagine them as a stack that grows within the array, from the bottom to the top. To track the top of the stack we use a variable called storagePlace, which holds both the size of the stack and the next position where an element should be inserted.

```
void partition(std::vector<int> & v){
int pivot = v[0]; // simplified pivot choice
std::swap(v[0], v[v.size() - 1]); // park the pivot at the back
std::size_t storagePlace = 0;
// examine every element except the pivot parked at the back,
// including the one just swapped into position 0
for(std::size_t i = 0; i + 1 < v.size(); i++)
if(v[i] < pivot)std::swap(v[i], v[storagePlace++]);
std::swap(v[storagePlace], v[v.size() - 1]); // pivot to its final place
}
```

There is a famous problem, the Dutch national flag, first described by Dijkstra. Imagine that we have three kinds of elements, let's say negative numbers, zeros and positive numbers. We want to partition the array so that negative numbers come first, then zeros, and finally positives. A while ago we mentioned that we can imagine pivoting as maintaining a stack within the array. We now extend this idea to maintain two stacks: one grows from the bottom to the top, the other from the top to the bottom. The former stores negative numbers, the latter positives.

```
void dutch_flag_partition(std::vector<int> & v){
int bottom = 0;
int top = static_cast<int>(v.size()) - 1;
int i = 0;
while(i <= top){
if(v[i] < 0)std::swap(v[bottom++], v[i++]); // element from bottom is a known zero
else if(v[i] > 0)std::swap(v[top--], v[i]); // element from top is unexamined, so don't advance i
else i++;
}
}
```

In this case we can be sure that the elements in [0, bottom) are negative numbers, in [bottom, i) we have only zeros, and in (top, v.size()) we have positive elements. The unknown part remains in [i, top].

Both solutions run in O(n) time. For the second problem there is no known stable in-place solution with O(n) complexity. However, if you need to keep it stable, take a look at this article.

# Median of two sorted arrays

We are given two sorted arrays of integers. Find an algorithm that determines the median of their union faster than O(N log N).

Let the arrays be called A1 and A2, with sizes N and M respectively. If we merged the numbers into one array, the median would be the element at position (N + M + 1)/2 (1-indexed) when N + M is odd; otherwise it would be the average of the elements at positions (N + M)/2 and (N + M)/2 + 1. However, we won't merge them into one array. Instead we will walk through both of them simultaneously.

```
int median(const std::vector<int> & a1, const std::vector<int> & a2){
std::size_t i1 = 0, i2 = 0;
std::size_t merged_size = a1.size() + a2.size();
int last = 0, last_last = 0;

// consume merged_size/2 + 1 elements in sorted order,
// remembering the last two values seen
while(i1 + i2 < merged_size / 2 + 1){
last_last = last;
if(i1 < a1.size() && (i2 >= a2.size() || a1[i1] <= a2[i2])){
last = a1[i1];
i1++;
}
else{
last = a2[i2];
i2++;
}
}
if(merged_size % 2)
return last; // the middle element
else return (last + last_last)/2; // note: integer division truncates
}
```

Thanks to the two extra variables we didn't have to introduce a stack or any other structure to keep track of the order of elements, so the memory complexity of the solution is constant.

# Reservoir Sampling

There is a huge stream S of numbers of size N that can't fit in memory. Choose K elements from the stream at random, so that every element has an equal probability of being selected.

Without too much introduction, let's just say that the answer to the problem is reservoir sampling. An example code could look like:

```
std::vector<int> reservoir_sampling(Stream S, unsigned long long N, unsigned int K){
unsigned long long i = 0;
std::vector<int> reservoir;

// fill the reservoir with the first K elements
for(; i < K; ++i)
reservoir.push_back(S.next());

// replace a random slot with probability K/(i + 1)
for(; i < N; ++i){
int next_token = S.next();
unsigned long long r = rand() % (i + 1);
if(r < K)reservoir[r] = next_token;
}
return reservoir;
}
```