Archive for the month “November, 2014”

Quick sort experiments

Today I spent some time running experiments around quicksort, just for fun.

I implemented one myself, recursively; found another in-place one here, which I call InternetQuickSort because a Google search returns almost this same implementation (I am not certain about the intellectual property ownership here, though); and the third is the dual-pivot quicksort in Java's Arrays utility class. My own implementation is a naive one: I have no intention of competing with code that many smart people spent a huge amount of time optimizing. It serves only as an example of how bad a naive version is, and as a baseline to show how good the other two are.

In pseudocode, my implementation looks like this:


# Step 1: Pick a pivot value as the mean of the three elements at startingPos, endingPos and the midpoint

# Step 2: Loop from startingPos to endingPos and form two arrays, one with elements greater than pivotVal and one with elements smaller than pivotVal

# Step 3: Copy the two arrays back into the original array

# Step 4: Recursively call MySort on the two sub-arrays by changing startingPos and endingPos
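The steps above can be sketched in Java roughly like this. This is not my original benchmark code; it is a minimal reconstruction, and I add a third bucket for elements equal to the pivot (a detail the pseudocode glosses over) so the recursion always shrinks even with duplicates. The wasteful per-call buffer allocation is deliberate: it reflects how naive the approach is.

```java
import java.util.Arrays;
import java.util.Random;

public class MySort {
    // Sketch of the pseudocode above (not the original benchmark code).
    // Pivot = integer mean of the elements at start, mid and end.
    public static void sort(int[] a, int start, int end) {
        if (end <= start) return;
        int mid = start + (end - start) / 2;
        int pivot = (int) (((long) a[start] + a[mid] + a[end]) / 3);

        int[] smaller = new int[end - start + 1];
        int[] equal   = new int[end - start + 1];
        int[] greater = new int[end - start + 1];
        int ns = 0, ne = 0, ng = 0;

        // Step 2: split the range into buckets relative to the pivot.
        for (int i = start; i <= end; i++) {
            if (a[i] < pivot)      smaller[ns++] = a[i];
            else if (a[i] > pivot) greater[ng++] = a[i];
            else                   equal[ne++]   = a[i];
        }

        // Step 3: copy the buckets back into the original array.
        int pos = start;
        for (int i = 0; i < ns; i++) a[pos++] = smaller[i];
        for (int i = 0; i < ne; i++) a[pos++] = equal[i];
        for (int i = 0; i < ng; i++) a[pos++] = greater[i];

        // Step 4: recurse on the smaller/greater sub-ranges.
        sort(a, start, start + ns - 1);
        sort(a, start + ns + ne, end);
    }

    public static void main(String[] args) {
        int[] data = new int[1000];
        Random r = new Random(42);
        for (int i = 0; i < data.length; i++) data[i] = r.nextInt(2001) - 1000;
        int[] expected = data.clone();
        Arrays.sort(expected);
        sort(data, 0, data.length - 1);
        System.out.println(Arrays.equals(data, expected)); // prints true
    }
}
```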

Below is the experiment setup.

System: Intel Core i5, 3 GHz × 4 cores, 64-bit, Java 7

Input: the experiments are conducted on arrays starting at 250,000 elements, increasing in steps of 250,000 elements, for 25 steps.
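The post doesn't include the harness itself, so here is a sketch of the kind of driver that could produce such a table, shown for the Arrays.sort case. The input distribution (uniform random ints) and the millisecond conversion are my assumptions, not stated above.

```java
import java.util.Arrays;
import java.util.Random;

public class SortBenchmark {
    // Sketch of a benchmark driver; input distribution and time unit
    // are assumptions, not taken from the original experiment.
    static double timeDualPivotMs(int size, long seed) {
        int[] data = new int[size];
        Random r = new Random(seed);
        for (int i = 0; i < size; i++) data[i] = r.nextInt();
        long start = System.nanoTime();
        Arrays.sort(data); // Java 7's dual-pivot quicksort for primitives
        return (System.nanoTime() - start) / 1_000_000.0;
    }

    public static void main(String[] args) {
        for (int step = 1; step <= 25; step++) {
            int size = 250_000 * step;
            System.out.println(size + "\t" + timeDualPivotMs(size, 42));
        }
    }
}
```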


Array length DualPivotQS MyQS InternetQS
250000 56.286573 369.059243 30.143644
500000 147.527238 417.64615 237.833958
750000 213.578431 463.814259 108.854696
1000000 293.005232 551.036705 214.40796
1250000 341.151265 1342.6494 260.471331
1500000 424.597431 755.479901 357.659643
1750000 506.3088 849.871125 399.342371
2000000 627.375886 2165.276304 419.702276
2250000 783.32427 1318.824876 542.290206
2500000 838.320689 1355.4506 627.090102
2750000 903.471969 3194.246002 758.326834
3000000 1098.818221 1533.688473 734.657916
3250000 1033.541113 1962.902445 931.064769
3500000 1178.640748 1948.658469 1192.178009
3750000 4717.532571 2049.95972 1107.368695
4000000 1340.011815 5101.839197 1174.539527
4250000 1612.278331 2317.015331 1211.459723
4500000 1361.536276 2795.669291 1289.295309
4750000 1499.520603 2789.735822 1578.865415
5000000 1560.867094 3030.472058 1729.399364
5250000 1621.528772 3191.477809 1639.488514
5500000 2022.796537 3879.536725 1841.024578
5750000 1907.504151 3566.779162 1939.123883
6000000 1878.408871 3691.383492 2343.128945

I have made a simple chart of the results here: QuickSortBenchmark

Interestingly, in most cases the InternetQuickSort version outperforms DualPivotQuickSort, although the two are quite close. My implementation shows no clear trend and fluctuates over a wide range. However, both mine and the dual-pivot quicksort show a performance spike when the number of elements is around 4 million, while the InternetQuickSort version doesn't encounter that at all.

I will post some more updates about this.


Swapping two values: experiments with different approaches

Folks, we all know the typical interview question about swapping the values of two variables. And maybe all of us remember by heart that there are multiple ways to do so.

The most straightforward way, which everyone learns in school, is to use a temp variable.

Such as:

int a = 10, b = 5;

int tmp = a;

a = b; b = tmp;

The second way is to use arithmetic operations such as addition/subtraction or multiplication/division.

The code looks like:

int a = 10; int b = 5;

a = a + b; b = a - b; a = a - b;

The good thing is that we don't need a temp variable here, which means less memory/register usage.

The same logic goes for multiplication and division. However, another interesting way, which is often asked about, is using XOR:

a = a ^ b; b = a ^ b; a = a ^ b;

Three assignments with the same expression; nothing is easier to remember.
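For reference, here are all the approaches side by side in one runnable sketch, including the multiplication/division variant mentioned above and the usual caveats for each:

```java
public class SwapDemo {
    public static void main(String[] args) {
        // 1. Temp variable
        int a = 10, b = 5;
        int tmp = a; a = b; b = tmp;
        System.out.println(a + " " + b); // 5 10

        // 2. Addition/subtraction: no temp variable needed. In Java this
        // even survives overflow, because two's-complement wraparound
        // cancels out across the three assignments.
        a = 10; b = 5;
        a = a + b; b = a - b; a = a - b;
        System.out.println(a + " " + b); // 5 10

        // 3. Multiplication/division: same idea, but it breaks when
        // either value is 0 and can lose information on overflow.
        a = 10; b = 5;
        a = a * b; b = a / b; a = a / b;
        System.out.println(a + " " + b); // 5 10

        // 4. XOR
        a = 10; b = 5;
        a = a ^ b; b = a ^ b; a = a ^ b;
        System.out.println(a + " " + b); // 5 10
        // Caveat: XOR-swapping the same memory location (e.g. swap(arr, i, i))
        // zeroes it out instead of leaving it unchanged.
    }
}
```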

But it seems few people have asked which one is actually better, beyond the memory used for the extra variable. I did a simple benchmark today and would like to share it with you. Working in blocks of 100k swap operations, I increase the count one block at a time and measure the time each of the three approaches takes.

Set up:

System: Intel Core i5, 3 GHz × 4 cores, 64-bit, Java 7

Inputs are all primitive integers; time is measured in nanoseconds. Before measuring, I ran some warm-up iterations so the JVM was ready.
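The harness itself isn't shown in the post; here is a minimal sketch of how the XOR case could be timed. The structure and names are mine, not the original code.

```java
public class SwapBenchmark {
    static volatile int sink; // keeps the JIT from dead-code-eliminating the loop

    // Times `blocks` blocks of 100,000 XOR swaps, in nanoseconds.
    static long benchXorNanos(long blocks) {
        int a = 10, b = 5;
        long start = System.nanoTime();
        for (long i = 0; i < blocks * 100_000L; i++) {
            a = a ^ b; b = a ^ b; a = a ^ b;
        }
        sink = a + b;
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 5; i++) benchXorNanos(1); // JVM warm-up
        System.out.println(benchXorNanos(10) + " ns for 10 blocks");
    }
}
```

One caveat with micro-benchmarks like this: when a loop appears to take constant time regardless of its iteration count, the JIT may have optimized the loop away entirely, which is why the `sink` variable is there.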

What I found is that there is no significant difference between using a tmp variable and using arithmetic operations. The time taken fluctuates around 300 ns regardless of the number of blocks of operations (be it 100k swaps or 50 times more).

Using XOR, a clearly linear trend in the time taken can be observed: approximately 230k ns is needed for one block (100,000 swaps). That is roughly 1,000 times slower than the other, traditional methods.

Conclusion: the XOR approach is a trick that only pays off in interview/textbook contexts!

Attaching a graph of the results: swap_experiment

Any comments or questions are welcome.
