Sorting letter blocks  I was looking into the concept of partial sorting, something that would help in a scenario where you want the k smaller or bigger items from an array of n items and k is significantly smaller than n. Since I am tinkering with LInQer, my implementation for LINQ methods in Javascript, I wanted to tackle the OrderBy(...).Take(k) situation. Anyway, doing that I found out some interesting things.

  First, the default Javascript array .sort function has different implementations depending on browser and by that I mean different sort algorithms. Chrome uses insertion sort and Firefox uses merge sort. None of them is Quicksort, the one that would function best when the number of items is large, but that has worst case complexity O(n^2) when the array is already sorted (or filled with the same value).

I've implemented a custom function to do Quicksort and after about 30000 items it becomes faster than the default one.  For a million items the default sort was almost three times slower than the Quicksort implementation. To be fair, I only tested this on Chrome. I have suspicions that the merge sort implementation might be better.

  I was reporting in a previous version of this post that QuickSort, implemented in Javascript, was faster than the default .sort function, but it seems this was only an artifact of the unit tests I was using. Afterwards, I've found an optimized version of quicksort and it performed three times faster on ten million integers. I therefore conclude that the default sort implementation leaves a lot to be desired.

  Second, the .sort function is by default alphanumeric. Try it: [1,2,10].sort() will return [1,10,2]. In order to do it numerically you need to hack away at it with [1,2,10].sort((i1,i2)=>i1-i2). In order to sort the array based on the type of item, you need to do something like: [1,2,10].sort((i1,i2)=>i1>i2?1:(i1<i2?-1:0)).

  Javascript has two useful functions defined on the prototype of Object: toString and valueOf. They can also be overwritten, resulting in interesting hacks. For some reason, the default .sort function is calling toString on the objects in the array before sorting and not valueOf. Using the custom comparers functions above we force using valueOf. I can't test this on primitive types though (number, string, boolean) because for them valueOf cannot be overridden.

  And coming back to the partial sorting, you can't do that with the default implementation, but you can with Quicksort. Simply do not sort any partition that is above and below the indexes you need to get the items from. The increases in time are amazing!

  There is a difference between the browser implemented sort algorithms and QuickSort: they are stable. QuickSort does not guarantee the order of items with equal sort keys. Here is an example: [1,3,2,4,5,0].sort(i=>i%2) results in [2,4,0,1,3,5] (first even numbers, then odd, but in the same order as the original array). The QuickSort order depends on the choice of the pivot in the partition function, but assuming a clean down the middle index, the result is [4,2,0,5,1,3]. Note that in both cases the requirement has been met and the even numbers come first.

Comments

Be the first to post a comment

Post a comment