Sort Linked Lists 10X Faster Than MergeSort
virusfree tells us about a new algorithm that the author claims can sort a linked list up to 10 times faster than MergeSort. "BitFast," a member of the Hash algorithms family, is available in C and C++ under the GPL.
Here is a good Mergesort implementation (Score:1, Interesting)
For a pretty efficient mergesort implementation in C/C++, try the following (it was at one time offered to libg++, but not taken up). As an example, it sorts stdin line by line to stdout. The whole recursion has been flattened into loops and binary arithmetic.
#include <climits>
#include <cstddef>
#include <iostream>
#include <string>
using namespace std;

template <class T, class V> class listsort : V {
public:
    listsort(const V &arg) : V(arg) { }
    T *&sort(T *&headin, size_t n) const
    {
        size_t underpow, n1, n2, bitrev;
        int lev, kmax;
        T **head = &headin;
        T **headstack[CHAR_BIT * sizeof(size_t)];   // CHAR_BIT needs <climits>
        T **tail;
        T **tail1;
        T *p1;
        T *p2;
        int sp = 0;

        switch (n) {
        case 0:
            return *head;
        case 1:
            return getlink(*head);
        }
        for (underpow = 1, kmax = 0; underpow <= (n - 1) / 4;) {
            underpow *= 2;
            ++kmax;
        }
        for (lev = kmax, bitrev = underpow, n -= 2 * underpow;;) {
            p1 = *head;
            tail = &getlink(p1);
            p2 = *tail;
            if (listleq(p1, p2)) {
                tail = &getlink(p2);
            } else {
                *tail = getlink(p2);
                getlink(p2) = p1;
                *head = p1 = p2;
            }
            switch ((bitrev + n) >> lev) {
            case 2:
                p2 = *tail;
                if (listleq(p1, p2)) {
                    tail1 = &getlink(p1);
                    p1 = *tail1;
                    if (listleq(p1, p2)) {
                        tail = &getlink(p2);
                        break;
                    }
                    *tail1 = p2;
                } else
                    *head = p2;
                *tail = getlink(p2);
                getlink(p2) = p1;
                break;
            case 3:
                headstack[sp++] = head;
                head = tail;
                ++lev;
                continue;
            }
            for (;;) {
                if (lev == 0)
                    return *tail;
                if (!((bitrev >> --lev) & 1))
Re:multithreaded merge (Score:3, Interesting)
The only problem I foresee is the case where the added overhead exceeds the gain. However, I imagine that would only happen when sorting very small lists. Perhaps one could check the size of the list first and then decide whether or not forking is worth it.
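The parent's size check can be sketched like this. CUTOFF is a made-up threshold, and this uses std::async over a vector rather than forking over the article's linked list, but the idea is the same: only go parallel when a half is big enough to amortize the thread overhead.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical threshold; tune per machine.
static const size_t CUTOFF = 1u << 15;

// Threshold-gated parallel merge sort: spawn a task only for large halves.
void psort(std::vector<long>::iterator lo, std::vector<long>::iterator hi) {
    size_t n = static_cast<size_t>(hi - lo);
    if (n < 2) return;
    auto mid = lo + n / 2;
    if (n >= CUTOFF) {
        // Big enough: sort the left half on another thread.
        auto fut = std::async(std::launch::async, psort, lo, mid);
        psort(mid, hi);
        fut.get();
    } else {
        // Too small: the overhead would exceed the gain, stay sequential.
        psort(lo, mid);
        psort(mid, hi);
    }
    std::inplace_merge(lo, mid, hi);
}
```

The cutoff keeps thread creation proportional to the large, cheap-to-split top levels of the recursion only.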
GPL-ed algorithm? (Score:3, Interesting)
Please correct me if I'm mistaken, but this seems like a fairly bad precedent to set.
Re:int main (Score:2, Interesting)
Messy and inconsistent indentation (compare the c/ and c++/ files for the differences).
The only difference between c/* and c++/* is that c++/ uses cout instead of printf()... well done...
This stuff should go to worsethanfailure.com (formerly TheDailyWTF.com).
Re:wtf? seriously. (Score:3, Interesting)
Agreed. For example, in a microcode project at work, the cost of searching for a given string was totally dominated by memory access time, so it really didn't matter which algorithm we used. So we used brute force. It was pretty easy to debug, as well.
Memory leaks... (Score:4, Interesting)
long *Tail = new long[65535];
long *Head = new long[65535];
for(n = 0; n < 65535; n++){ Head[n] = 0; Tail[n] = 0; }
That leaks both arrays (about 512 KB with 4-byte longs) each time the function "Node* BitFast(Node* HeadNode, long run)" is called.
MISSING: delete[] Tail; delete[] Head;
Furthermore, zero-initializing at the definition (or using memset) is faster than the element-by-element loop:
long Tail[65535] = {0};
long Head[65535] = {0};
Unless you are running this on DOS16, DOS32 with Pharlap, or a small target like an 8051 or 68000, you do not need new[] or malloc() for this.
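A minimal sketch of the corrected allocation (sum_demo is a made-up name for illustration; note the size is 65536, so index 0xFFFF is in bounds, and the delete[] calls plug the leak):

```cpp
// Sketch of the fix: value-initialize on allocation and free on exit.
long sum_demo() {
    long *Tail = new long[65536]();   // () zero-initializes, no loop needed
    long *Head = new long[65536]();
    long s = Tail[65535] + Head[65535];  // safe with the corrected size
    delete[] Tail;                       // the missing cleanup
    delete[] Head;
    return s;
}
```

Static or stack arrays avoid the heap round-trip entirely, though two 65536-entry long arrays may be too large for a default thread stack.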
In InsertToList: Node *tmpP = (Node*)malloc(sizeof(Node));
The matching free(tmpP) should live in a FreeList()... but no such function exists.
In main.c:
Node *L1 = malloc(sizeof(Node));
Node *L2 = malloc(sizeof(Node));
MISSING: free(L1); free(L2);
In main.cpp:
Node *L1 = new Node;
Node *L2 = new Node;
Instead use (no need to use heap, use stack memory):
Node L1 = {0}, L2 = {0};
Mixing std::cout and printf() is not recommended. Use one or the other throughout, otherwise you will get garbled output on some runs unless you flush the output buffer. I suggest you use printf() all the way.
cout << "Ok!!!!" << "\n\n" << "Merge Sort : " << tim1 << " ms\n" << "BitFast : " << tim2 << " ms\n\n";
becomes:
printf("Ok!!!!\n\nMerge Sort : %ld ms\nBitFast : %ld ms\n\n", tim1, tim2);
Furthermore, your linked-list implementation calls Length(Node*) on every split, which is highly inefficient: each call is O(n). Store the list length somewhere when inserting. Because of this, EqualSplit takes O(1.5n) instead of O(0.5n).
Something like:
typedef struct LinkList {
Node head;
int length;
} LinkList;
As a result, MergeSort recursively calls itself, and every call invokes EqualSplit.
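A sketch of that suggestion, with hypothetical Node/InsertToList shapes (the repo's actual definitions differ in detail):

```cpp
#include <cassert>
#include <cstdlib>

typedef struct Node { long data; struct Node *next; } Node;

typedef struct LinkList {
    Node *head;
    long  length;   // maintained on insert, so no O(n) Length() walk
} LinkList;

// Insert at the head and bump the stored length; reading the
// length afterwards is O(1) instead of a full traversal.
static void InsertToList(LinkList *l, long value) {
    Node *n = (Node *)malloc(sizeof *n);
    n->data = value;
    n->next = l->head;
    l->head = n;
    ++l->length;
}
```

With the length cached, EqualSplit can advance length/2 nodes directly instead of counting the whole list first.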
You should make your header file C friendly, and avoid mixing C and C++ constructs in it.
No clue why this header is needed: #include <iso646.h>. Drop it and use ! and && instead of not and and:
before: if(not p) return NULL;
becomes: if(!p) return NULL;
Finally, BitFast is not recursive; therefore, you should compare against a merge sort implementation that is not recursive either, so you save on function calls, if possible.
"However, iterative, non-recursive, implementations of merge sort, avoiding method call overhead, are not difficult to code." -- Wikipedia
http://en.wikipedia.org/wiki/Merge_s
Your function should be in a .c file or be declared inline.
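The non-recursive merge sort the Wikipedia quote refers to can be sketched like this for linked lists; Node and the function names here are hypothetical, not taken from the repo. It merges runs of length 1, 2, 4, ... bottom-up, with no recursive calls at all.

```cpp
#include <cassert>
#include <cstddef>

struct Node { long data; Node *next; };

// Standard merge of two sorted lists.
static Node *merge(Node *a, Node *b) {
    Node dummy{0, nullptr};
    Node *tail = &dummy;
    while (a && b) {
        Node **min = (a->data <= b->data) ? &a : &b;
        tail->next = *min;
        tail = *min;
        *min = (*min)->next;
    }
    tail->next = a ? a : b;
    return dummy.next;
}

// Detach the first n nodes; return what follows them.
static Node *split(Node *head, size_t n) {
    while (head && --n) head = head->next;
    if (!head) return nullptr;
    Node *rest = head->next;
    head->next = nullptr;
    return rest;
}

// Bottom-up merge sort: double the run width each pass.
Node *mergesort_iterative(Node *head) {
    if (!head || !head->next) return head;
    size_t len = 0;
    for (Node *p = head; p; p = p->next) ++len;
    for (size_t width = 1; width < len; width *= 2) {
        Node dummy{0, nullptr};
        Node *tail = &dummy, *cur = head;
        while (cur) {
            Node *left = cur;
            Node *right = split(left, width);   // first run
            cur = split(right, width);          // rest of the list
            tail->next = merge(left, right);
            while (tail->next) tail = tail->next;
        }
        head = dummy.next;
    }
    return head;
}
```

The loop structure does the same work as recursive merge sort but replaces the call stack with a pair of nested loops.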
Multiple bugs in the code, wrong measurements! (Score:5, Interesting)
First, there are multiple bugs in the code itself:
1) Everywhere, it is assumed that 2^16 = 65535 rather than 65536. Arrays are allocated at that size too, but are used with 65536 entries. (Un)fortunately for him, the extra space is probably part of the allocation anyway, so the program does not crash on this.
2) A similar bug in the random number generator carefully avoids the dangerous values: he generates numbers between min and max by taking a modulo of max-min, which prevents max from ever being produced. That is not a big deal in itself, but it hides the first bug.
3) He assumes everywhere that long and short int are unsigned. This makes the code segfault on Linux with negative array indexes. The fix is obvious, though.
And most of all: the times are measured under optimal conditions. This algorithm is just a radix sort with a large radix (2^16). A radix sort is efficient when the number of elements is large compared to the radix, because at every level all radix entries have to be scanned for existing values. In his example he sorts 4 million elements, and his code beats the merge sort there, just as any radix sort would. But on smaller sets it still takes the same time per level, because it has to scan all 65536 entries even if there are only 100 numbers. In fact, for fewer than 10000 values the mergesort becomes faster in his own code.
When measuring algorithmic complexity we use the O(x) notation, and the time per iteration is never considered. When that time matters (as it does in his example), we should account for the time per iteration (T), and the total time (t) follows:
t = T * complexity
His algorithm is O(N) on 16-bit keys, but its time per iteration is very high. Other sorting algorithms are O(N*log(N)). So the total times are:
t1 = T1 * O(N) for his algorithm, and
t2 = T2 * O(N*log(N)) for the mergesort.
Whenever T1 > T2*log(N) he loses, which is almost always the case: log(N) is small for these input sizes, and T2 is small too because the operations are very simple. In his case, T1 is very high because it represents the time required to scan 65536 buckets.
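The fixed per-pass cost is easy to see in a minimal counting sort over 16-bit keys, which is what a single radix-2^16 pass amounts to (a sketch for illustration, not the BitFast code):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One-pass radix (counting) sort with radix 2^16.
// Note: the table needs 65536 slots, since indices run 0..65535
// (0xFFFF is 65535) -- the off-by-one the thread describes.
std::vector<uint16_t> count_sort16(const std::vector<uint16_t> &in) {
    std::vector<size_t> count(65536, 0);
    for (uint16_t v : in) ++count[v];          // O(N) tally
    std::vector<uint16_t> out;
    out.reserve(in.size());
    // This scan always touches all 65536 buckets, even for 100 inputs:
    // the fixed cost T1 that makes the radix approach lose on small N.
    for (size_t key = 0; key < 65536; ++key)
        out.insert(out.end(), count[key], static_cast<uint16_t>(key));
    return out;
}
```

The tally loop is proportional to N, but the bucket scan is a constant 65536 operations per pass regardless of N.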
It's amazing that he fell into such a beginner's trap. I suppose he's still a student, which would explain the total lack of portability and the newbie bugs in the code.
Well tried anyway
Willy
Re:radix sort vs. comparison sort (Score:3, Interesting)
I don't know if you're still interested, and I'm just hypothesizing here, but 0xFFFF is 65535, and some other posters noted that his array was of size 65535, which of course means that index 65535 is out of bounds by just one. This could cause your segfault.
HTH,
Kig.
Re:wtf? seriously. (Score:1, Interesting)
Introsort is silly, and the STL is dumb to use it. I am a PhD student in CS theory, and I can safely say this is an unjustified use of too much theory in practice. Using introsort is no more reasonable than the STL's decision to heavily favor tree-based sets/maps over the much faster hash-based sets/maps.
Quicksort with random pivots is faster than heapsort. It runs in O(nlog n) time on any input with extremely high probability unless the data generator somehow outguessed your random number generator (not bloody likely in any real situation). Furthermore, even if random quicksort has performed badly so far, it IS STILL FASTER to continue with quicksort than it is to switch to heapsort because past performance is no predictor of future performance with random pivots.
So what does STL do? They use a worse version of quicksort because they are scared of the wildly crazy theoretical implications of randomness, and then have to implement yet another sort on top of it because their original algorithm sucks. Nice.
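The randomized quicksort the parent describes can be sketched as follows (a minimal illustration with a Lomuto partition and a fixed seed for reproducibility, not the STL's introsort):

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <vector>

// Fixed seed so runs are reproducible; a real sort would seed randomly.
static std::mt19937 rng(12345);

// Randomized-pivot quicksort on v[lo..hi]: expected O(n log n) on any
// input, because an adversary cannot predict the pivot choices.
void rqsort(std::vector<long> &v, long lo, long hi) {
    if (lo >= hi) return;
    std::uniform_int_distribution<long> pick(lo, hi);
    std::swap(v[pick(rng)], v[hi]);         // move a random pivot to the end
    long pivot = v[hi], i = lo;
    for (long j = lo; j < hi; ++j)          // Lomuto partition
        if (v[j] < pivot) std::swap(v[i++], v[j]);
    std::swap(v[i], v[hi]);
    rqsort(v, lo, i - 1);
    rqsort(v, i + 1, hi);
}
```

The worst case only occurs when the random choices themselves are unlucky, which no fixed input can force.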
Re:How is this not a radix sort? (Score:3, Interesting)
Funny, all I saw on the "site" was a donation link or something. It looked like a parked domain.
But radix sorting is cool. What makes it very relevant in 2007 is that it scales well across multiple cores/CPUs, which is important since most computers today have hardware that executes multiple threads simultaneously.