Programming

Sort Linked Lists 10X Faster Than MergeSort 326

virusfree tells us about a newly developed algorithm whose author claims it can sort a linked list up to 10 times faster than MergeSort. "BitFast," a member of the Hash algorithms family, is available in C and C++ under the GPL.
This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward on Sunday February 25, 2007 @04:14PM (#18145344)
    The poster has just implemented a radix sort. One can make them quite a bit faster, but of course they are basically restricted to sorting numbers, or at least bit strings with a reasonable diversity in the distribution of prefixes.

    For a pretty efficient mergesort implementation in C/C++, try the following (it was at one time offered to libg++, but not taken up). As an example, it sorts stdin line by line to stdout. The whole recursion has been flattened into loops and binary arithmetic.
    /* Copyright (c) 2007 David Kastrup dak@gnu.org */
    #include <climits>  // for CHAR_BIT (missing in the original post)
    #include <cstddef>
    #include <iostream>
    #include <string>
    using namespace std;

    template <class T, class V> class listsort : V {
    public:
      listsort(const V &arg) : V(arg) { }
      T *&sort(T *&headin, size_t n) const
      {
        size_t underpow, n1, n2, bitrev;
        int lev, kmax;
        T **head = &headin;
        T **headstack[CHAR_BIT * sizeof(size_t)];
        T **tail;
        T **tail1;
        T *p1;
        T *p2;
        int sp = 0;
        switch (n) {
          case 0:
            return *head;
          case 1:
            return getlink(*head);
        }
        for (underpow = 1, kmax = 0; underpow <= (n - 1) / 4;) {
          underpow *= 2;
          ++kmax;
        }
        for (lev = kmax, bitrev = underpow, n -= 2 * underpow;;) {
          p1 = *head;
          tail = &getlink(p1);
          p2 = *tail;
          if (listleq(p1, p2)) {
            tail = &getlink(p2);
          } else {
            *tail = getlink(p2);
            getlink(p2) = p1;
            *head = p1 = p2;
          }
          switch ((bitrev + n) >> lev) {
            case 2:
              p2 = *tail;
              if (listleq(p1, p2)) {
                tail1 = &getlink(p1);
                p1 = *tail1;
                if (listleq(p1, p2)) {
                  tail = &getlink(p2);
                  break;
                }
                *tail1 = p2;
              } else
                *head = p2;
              *tail = getlink(p2);
              getlink(p2) = p1;
              break;
            case 3:
              headstack[sp++] = head;
              head = tail;
              ++lev;
              continue;
          }
          for (;;) {
            if (lev == 0)
              return *tail;
            if (!((bitrev >> --lev) & 1))
            // [the rest of this comment was cut off in the archive]
  • by SuperMog2002 ( 702837 ) on Sunday February 25, 2007 @04:16PM (#18145352)
    I've never seen it done before, but I don't imagine it would be particularly difficult to pull off. I'd say just add another parameter that tells the function how deep in the call stack it is. If it's less than x layers deep, fork, with the child process going on to the other processor. The parent process sorts the first half of the list, and the child process sorts the second half. When the parent process finishes, it waits for the child process to finish (if it hasn't already), then collects the child's sorted list and merges as usual. The reason you'd want to track how deep you are is so that you don't wind up forking when there aren't any more processors available. For instance, if you were running on a two-processor machine, you'd only want to fork once, since forking again would only add overhead without any gain. You could probably generate x on the fly based on the number of processors in the user's computer, too.

    The only problem I foresee is the case where the added overhead exceeds the gain. However, I imagine this would only be the case when you are sorting very small lists. Perhaps one could check the size of the list first and then decide whether or not forking is worth it.
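    Something like the following depth-capped scheme, for instance. This is a rough sketch using POSIX threads instead of fork() (threads share memory, which makes collecting the child's half trivial); the array-based sort, struct job, and par_sort are illustrative assumptions, not anything from TFA. Compile with -pthread:

    #include <pthread.h>
    #include <stdlib.h>
    #include <string.h>

    struct job { int *a; size_t n; int depth; };

    /* Merge the sorted halves a[0..n1) and a[n1..n1+n2). */
    static void merge(int *a, size_t n1, size_t n2) {
      int *tmp = malloc((n1 + n2) * sizeof *tmp);
      size_t i = 0, j = n1, k = 0;
      while (i < n1 && j < n1 + n2)
        tmp[k++] = a[i] <= a[j] ? a[i++] : a[j++];
      while (i < n1) tmp[k++] = a[i++];
      while (j < n1 + n2) tmp[k++] = a[j++];
      memcpy(a, tmp, (n1 + n2) * sizeof *a);
      free(tmp);
    }

    static void *par_sort(void *arg) {
      struct job *jb = arg;
      if (jb->n < 2)
        return NULL;
      size_t half = jb->n / 2;
      struct job left = { jb->a, half, jb->depth - 1 };
      struct job right = { jb->a + half, jb->n - half, jb->depth - 1 };
      if (jb->depth > 0) {  /* still under the processor budget: spawn a worker */
        pthread_t t;
        pthread_create(&t, NULL, par_sort, &right);  /* child sorts the second half */
        par_sort(&left);                             /* parent sorts the first half */
        pthread_join(t, NULL);                       /* wait for the child... */
      } else {              /* budget exhausted: plain sequential recursion */
        par_sort(&left);
        par_sort(&right);
      }
      merge(jb->a, half, jb->n - half);              /* ...then merge as usual */
      return NULL;
    }

    On a two-processor machine you would start it with depth = 1, so it only splits into threads once, and the small-input check (n < 2 here, larger in practice) is exactly the "is it worth it" test above.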
  • I ditto this. Mergesort is a true comparison sort, which means it can sort anything as long as a comparison between two values is defined, e.g. strings (in lexicographical order), floating-point numbers, 1000-digit numbers... you get the drift. BitFast is a radix sort and only works in cases where the list elements obey certain limitations: for instance, if each list element can be expressed in 32 bits (an int).
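    For instance, the same call sorts strings just by swapping the comparator (qsort standing in for any comparison sort here; the names are illustrative):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    static int cmp_str(const void *a, const void *b) {
      return strcmp(*(const char *const *)a, *(const char *const *)b);
    }

    int main(void) {
      const char *words[] = { "pear", "apple", "fig" };
      qsort(words, 3, sizeof words[0], cmp_str);           /* lexicographic order */
      printf("%s %s %s\n", words[0], words[1], words[2]);  /* apple fig pear */
      return 0;
    }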
  • GPL-ed algorithm? (Score:3, Interesting)

    by Wilson_6500 ( 896824 ) on Sunday February 25, 2007 @04:22PM (#18145400)
    I've done a bit of scientific programming, but I'm not a computer-science-oriented person by any stretch. Thus, I'm a little confused by this. Under what license are things like Quicksort and Stupidsort (for another, very bad example) "released"? Why would a person feel it necessary to release an algorithm under any kind of license at all, in fact? No textbook would want to include GPL-ed code--wouldn't that make the entire book GPL by nature of the license?

    Please correct me if I'm mistaken, but this seems like a fairly bad precedent to set.
  • by antifoidulus ( 807088 ) on Sunday February 25, 2007 @04:23PM (#18145412) Homepage Journal
    Not only integers, more than likely only POSITIVE integers. If you are using 2's complement that would mean that the first bit in all negative numbers would be a 1. Unless I missed something, this algorithm does not take that into account and would sort them as being larger than all positive numbers.
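    The usual fix when radix-sorting signed integers is to flip the sign bit before bucketing. A tiny sketch (key() is a hypothetical helper, not part of BitFast; 32-bit two's complement assumed):

    #include <stdio.h>
    #include <stdint.h>

    /* Flipping the sign bit maps INT32_MIN..INT32_MAX onto 0..UINT32_MAX,
       so an unsigned radix pass preserves signed order. */
    static uint32_t key(int32_t x) {
      return (uint32_t)x ^ UINT32_C(0x80000000);
    }

    int main(void) {
      /* raw bits: -1 looks "bigger" than 1 to an unsigned radix pass */
      printf("%lu vs %lu\n", (unsigned long)(uint32_t)-1, (unsigned long)(uint32_t)1);
      /* after the sign-bit flip the order is correct again */
      printf("%lu vs %lu\n", (unsigned long)key(-1), (unsigned long)key(1));
      return 0;
    }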
  • by grimJester ( 890090 ) on Sunday February 25, 2007 @05:06PM (#18145748)
    I think it's almost mandatory to have grammar or spelling errors in any grammar-nazi post. I noticed the "pleas post" right after posting. Someone should name this newly found subset of Murphy's Law and become famous.
  • Re:int main (Score:2, Interesting)

    by Srdjant ( 650988 ) on Sunday February 25, 2007 @05:17PM (#18145822)
    His main BitFast code is in header files...
    Messy and inconsistent indenting (check the c/ and c++/ directories for differences).

    The only difference between c/* and c++/* is that c++/ uses cout instead of printf()... well done...

    This stuff should go to worsethanfailure.com (was TheDailyWTF.com)
  • Re:wtf? seriously. (Score:3, Interesting)

    by Ivan the Terrible ( 115742 ) <`vladimir' `at' `acm.org'> on Sunday February 25, 2007 @05:23PM (#18145882) Homepage
    > the best performing algorithms in real terms are theoretically O(n^2)

    Agreed. For example, in a microcode example at work, the cost of searching for a given string was totally dominated by memory access time, so it really didn't matter which algorithm we used. So we used brute force. Pretty easy to debug, as well.
  • Memory leaks... (Score:4, Interesting)

    by fprog26 ( 665694 ) on Sunday February 25, 2007 @05:41PM (#18146044)
    Inside bitfast.h:
    long *Tail = new long[65535];
    long *Head = new long[65535];

    for(n = 0; n < 65535; n++){ Head[n] = 0; Tail[n] = 0; }

    Memory leak of roughly 512KB (two arrays of 65535 four-byte longs) each time the function "Node* BitFast(Node* HeadNode,long run)" is called.

    MISSING: delete[] Tail; delete[] Head;

    Furthermore, the following is faster (or use memset):
    long Tail[65535] = {0};
    long Head[65535] = {0};

    Unless you are running this on DOS16, DOS32 with Pharlap, or a smaller processor like an 8051 or 68000, you do not need new[] or malloc() for this.

    In InsertToList: Node *tmpP = (Node*)malloc(sizeof(Node));
    The matching free(tmpP) should live inside FreeList()... but no such function exists.
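    A sketch of what that missing function could look like (the Node layout with a Next pointer is assumed from InsertToList, not copied from his code):

    #include <stdlib.h>

    typedef struct Node { long Value; struct Node *Next; } Node;

    void FreeList(Node *head) {
      while (head) {
        Node *next = head->Next;  /* save the link before freeing the node */
        free(head);
        head = next;
      }
    }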

    In main.c:
    Node *L1 = malloc(sizeof(Node));
    Node *L2 = malloc(sizeof(Node));
    MISSING: free(L1); free(L2);

    In main.cpp:
    Node *L1 = new Node;
    Node *L2 = new Node;

    Instead use (no need to use heap, use stack memory):
    Node L1 = {0}, L2 = {0};

    Mixing the usage of std::cout and printf() is not recommended.
    Use all of the former or all of the latter, or you will get garbled output on some runs unless you flush the output buffer. I suggest you use printf() all the way.
    cout << "Ok!!!!" << "\n\n" << "Merge Sort : " << tim1 << " ms\n" << "BitFast : " << tim2 << " ms\n\n";
    becomes:
    printf("Ok!!!!\n\nMerge Sort : %ld ms\nBitFast : %ld ms\n\n", tim1, tim2);

    Furthermore, your linked-list implementation calls Length(Node*) for every EqualSplit, which is highly inefficient: each call requires O(n). Store the linked list's length somewhere when inserting; EqualSplit takes O(1.5n) instead of O(0.5n) because of that.

    Something like:

    typedef struct LinkList {
      Node *head;
      int   length;
    } LinkList;

    As a result, MergeSort recursively calls itself, and each call invokes EqualSplit in turn.

    You should make your header file C-friendly, and avoid mixing // with /* */, or malloc() with new.

    No clue why this header is needed:
    #include <iso646.h> -- instead of not/and, use ! and &&:
    before:  if(not p) return NULL;
    becomes: if(!p) return NULL;

    Finally, BitFast is not recursive; therefore you should try to find an implementation of merge sort that is not recursive either, to save on function calls, if possible.

    "However, iterative, non-recursive, implementations of merge sort, avoiding method call overhead, are not difficult to code." -- Wikipedia

    http://en.wikipedia.org/wiki/Merge_sort
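    For what it's worth, here is a minimal array-based sketch of that bottom-up idea (run widths 1, 2, 4, ... merged in passes; the names are illustrative, not from TFA):

    #include <stdlib.h>
    #include <string.h>

    /* Iterative merge sort: no recursion, just ceil(log2(n)) passes over the data. */
    void bottom_up_msort(int *a, size_t n) {
      int *buf = malloc(n * sizeof *buf);    /* scratch space for merging */
      for (size_t w = 1; w < n; w *= 2) {    /* run width doubles each pass */
        for (size_t lo = 0; lo < n; lo += 2 * w) {
          size_t mid = lo + w < n ? lo + w : n;
          size_t hi = lo + 2 * w < n ? lo + 2 * w : n;
          size_t i = lo, j = mid, k = lo;
          while (i < mid && j < hi)          /* merge two adjacent sorted runs */
            buf[k++] = a[i] <= a[j] ? a[i++] : a[j++];
          while (i < mid) buf[k++] = a[i++];
          while (j < hi) buf[k++] = a[j++];
        }
        memcpy(a, buf, n * sizeof *a);       /* copy the merged runs back */
      }
      free(buf);
    }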

    Your functions should go in a .c file or be declared inline.
  • by wtarreau ( 324106 ) on Sunday February 25, 2007 @06:34PM (#18146482) Homepage
    Don't use this code!

    First, there are multiple bugs in the code itself:

    1) Everywhere, it is assumed that 2^16 = 65535 rather than 65536. Arrays are allocated with this size too, but are used with 65536 entries. (Un)fortunately for him, the extra space is probably allocated anyway, so the program does not crash on this.

    2) A similar bug in the random number generator carefully avoids dangerous values: he generates numbers between min and max by taking a modulo of max-min, which prevents max from ever being produced. This is not a big deal in itself; it just hides the first bug.

    3) He assumes everywhere that "long" and "short int" are unsigned. This makes the code segfault on Linux with negative array indexes. The fix is obvious, though.

    And most of all: times are measured in optimal conditions. This algorithm is just a radix sort with a large radix (2^16). A radix sort is efficient when the number of elements is high compared to the radix, because at every level all radix entries have to be scanned for existing values. In his example, he sorts 4 million elements, and his code is faster than the merge sort on 4 million entries, just as any radix sort would be. But on smaller sets it still takes the same time per level, because it has to scan all 65536 entries even if there are only 100 numbers. In fact, for fewer than 10000 values, the mergesort becomes faster in his own code.
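    To picture that fixed cost, here is a hypothetical counting pass (illustrative names, not his code). Only the second loop depends on n; clearing, and later collecting, the buckets always costs 65536 steps, and that fixed scan is the per-iteration time discussed below:

    #include <stddef.h>

    enum { RADIX = 1 << 16 };  /* 65536 buckets -- note: not 65535 */

    void radix_pass(const unsigned *keys, size_t n, size_t *count) {
      for (size_t b = 0; b < RADIX; b++)  /* always 65536 iterations... */
        count[b] = 0;
      for (size_t i = 0; i < n; i++)      /* ...even if n is only 100 */
        count[keys[i] & 0xFFFFu]++;
      /* a full sort would now scan all RADIX buckets again to emit the
         keys in order */
    }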

    When measuring an algorithm's complexity, we use the O(x) notation; the time per iteration is never considered. When that time matters (as in his example), we should consider the time per iteration (T), and the total time (t) follows:

          t = T · complexity

    In complexity terms, his algorithm is O(N) on 16 bits, but the time per iteration is very high. Other sorting algorithms are O(N·log(N)). So the total times are:

          t1 = T1 · O(N) for his algorithm, and:
          t2 = T2 · O(N·log(N)) for the mergesort.

    But when T1 > T2·log(N), he loses, which is almost always the case, because log(N) is small for 16 bits, and T2 is small too because the operations are very simple. In his case, T1 is very high because it represents the time required to scan 65536 entries.

    It's amazing that he fell into such a beginner's trap. I suppose he's still a student, which would explain the total lack of portability and the newbie bugs in the code.

    Well tried anyway :-)

    Willy
  • by kigrwik ( 462930 ) on Monday February 26, 2007 @04:30AM (#18150282)
    Disclaimer: I didn't read the code, and don't intend to.

    I don't know if you're still interested, and I'm just hypothesizing here, but 0xFFFF is 65535 and some other posters noted that his array was of size 65535, which of course means that index 65535 is out of bounds by just one. This could cause your segfault.

    HTH,
    Kig.

  • Re:wtf? seriously. (Score:1, Interesting)

    by Anonymous Coward on Monday February 26, 2007 @04:52AM (#18150372)
    "In fact some of the best performing algorithms in real terms are theoretically O(n^2) in the worst case, hence the creation of hybrids like introsort, a quicksort variant designed to avoid the worst case by switching to heap sort after a certain recursion depth is reached."

    Introsort is silly, and the STL is dumb to use it. I am a PhD student in CS theory, and I can safely say this is an unjustified use of too much theory in practice. Using introsort is no more reasonable than the STL's decision to heavily favor tree-based sets/maps over the much faster hash-based sets/maps.

    Quicksort with random pivots is faster than heapsort. It runs in O(n log n) time on any input with extremely high probability, unless the data generator somehow outguessed your random number generator (not bloody likely in any real situation). Furthermore, even if random quicksort has performed badly so far, it IS STILL FASTER to continue with quicksort than to switch to heapsort, because past performance is no predictor of future performance with random pivots.

    So what does STL do? They use a worse version of quicksort because they are scared of the wildly crazy theoretical implications of randomness, and then have to implement yet another sort on top of it because their original algorithm sucks. Nice.
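    For reference, a bare-bones sketch of the randomized-pivot quicksort being advocated (Lomuto partition for brevity; seeding rand() is left to the caller, and the names are illustrative):

    #include <stdlib.h>

    static void swap_int(int *a, int *b) { int t = *a; *a = *b; *b = t; }

    static void rquicksort(int *a, int lo, int hi) {
      if (lo >= hi)
        return;
      int p = lo + rand() % (hi - lo + 1);  /* random pivot defeats adversarial inputs */
      swap_int(&a[p], &a[hi]);              /* move pivot to the end */
      int i = lo;
      for (int j = lo; j < hi; j++)         /* Lomuto partition around a[hi] */
        if (a[j] < a[hi])
          swap_int(&a[i++], &a[j]);
      swap_int(&a[i], &a[hi]);              /* pivot into its final position */
      rquicksort(a, lo, i - 1);
      rquicksort(a, i + 1, hi);
    }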
  • by hackstraw ( 262471 ) on Monday February 26, 2007 @09:13AM (#18151802)

    Funny, all I saw on the "site" was a donation link or something; it looked like a parked domain.

    But radix sorting is cool. What is very relevant to it in 2007 is that it parallelizes well across multiple cores/CPUs, which is important since most computers today have hardware that executes multiple threads simultaneously.
