Technology

Reconfigurable Supercomputers 181

VanL writes " A previously unknown company has come up with a supercomputer design using programmable logic components. If this is right, this might be another case of a garage inventor changing the computing paradigm. In the meantime, they are claiming incredible things from their demo computer: It is the worlds fastest computer (3-4x faster than IBM's Pacific Blue, and 10x faster than a Cray); it is fault-tolerant enough that you could shoot a bullet through it and it would keep on working; it will run any operating system out-of-the-box; and it is the size of a normal desktop computer and runs off household current. They call it HAL. ;) Check out the press release, a news story, and a more detailed description of the company and the technology here."
  • by Anonymous Coward
    From the Utah News...

    'This once unheard-of company expects to be "heard of" this week, now that it has unveiled Hal for all to see. But its address is a secret, for security reasons.

    When you have a computer that's worth more than its weight in solid gold, you lock it up tight, every night.'

    From Whois query on starbridgesystems.com...

    Registrant:
    Circa 65 (STARBRIDGESYSTEMS-DOM)
    208 1/2 25th Street
    Ogden, UT 84401
    US

    Domain Name: STARBRIDGESYSTEMS.COM

    Administrative Contact, Technical Contact, Zone Contact:
    Light, Doug (DL8191) dlight@LGCY.COM
    (801)994-7337 (FAX) (801)994-7338
    Billing Contact:
    Gleason, Matt (MG11172) MGLEASON@CIRCA65.COM
    (801)392-2600 (FAX) (801)392-6350

    Record last updated on 17-Jan-99.
    Database last updated on 9-Feb-99 15:17:13 EST.

    Domain servers in listed order:

    NS1.WHATEVERWEB.COM 209.160.196.62
    NS2.WHATEVERWEB.COM 209.160.196.59
  • Try looking where the company is located
  • it's either an early April 1st joke or we have a new computing science. I hope it's the latter. :/
  • Posted by Buffy the Overflow Slayer:

    But does it run Win 98?
  • Posted by Parcells:

    I'll take two please. Is this for real or is it an April Fools joke two months early?
  • Posted by His name cannot be spoken:

    Hey, I have an interesting idea!

    Let's grab one, load on that crypto software that the girl in Ireland wrote, toss on that mystical compression that compresses files into 256 bytes (from waaaay back..), and add to that the secret of how to get the caramel into the Caramilk bar.

    Sure bet the aliens will come back and give us the keys to the pyramids, so we can find out what the fsck the 11 herbs and spices in Kentucky Fried Chicken's recipe are!

    Sheeesh!

  • Posted by twi:

    > BTW, if you find a metalanguage with all possible tasks you could do on a processor, let me know.

    What about Turing machines? At least if you mean "processor" in the traditional sense.
  • Posted by twi:

    Although I know next to nothing about this stuff, and although what they say sounds a lot like a hoax, the principle strikes me as a typical case of "why did I never think of that". I instantly liked it :)
    Sure, a "C++ to FPGA" compiler would be a bit too complex to imagine, but if you find the means to create good circuits from an algorithm description, why not?
    And as for the speed increase, think about a simple AND operation. To do it on a conventional machine you load the instruction, decode it and execute it. Although that may run in one "cycle", it surely involves a lot of gate switches. Hardcoding this AND on the FPGA takes the time of.. well, an AND. The time needed for one gate (or array, for a register AND) to calculate this operation is all it takes. It's not hard to believe that this is many times faster, even if the FPGA in itself is not as fast as a custom-made "real" chip.
    Hardware is always faster than software. 3D graphics chips do exist for a reason, don't they? ;)
    And stuff like I/O bandwidth or context switching might not be that important for a "supercomputer", which will probably not be targeted as your average multitasking Unix box, even if it could fit on the desktop.
    Being reprogrammable "only" 1000 times a second is also no problem at all, because you can leave one part of it running as a general-purpose CPU all the time. Slow reprogramming is not a loss for its slowness, but a win for its reprogrammability.

  • Posted by pudi:

    I don't suppose they run a conventional OS. But does it run any OS at all? Can it be used as a general purpose supercomputer? It sounds like a super-calculator to me.
  • Interesting how they compare 16-bit addition (for their system) to IEEE floating point on a Cray. Also of note: 50ns main memory? SDRAMs are faster than that.

    As to their comparison of computing on their system vs. another supercomputer: most analysts don't re-build the computer for each program run.

    There are some interesting ideas there, but I'll bet that real world benchmarking won't look nearly as good as their estimates.

  • My warning bells started clanging at this point:

    6. Superspecificity is also achieved by SBS through advanced artificial intelligence algorithms and techniques such as recursion, cellular autonoma, heuristics and genetic algorithms, which are incorporated into a single system which naturally selects the most efficient library element for achieving maximum specificity.

    Buzzword alert! Buzzword alert! He forgot "neural nets" though. I'm sure they're in there.

    7. Higher orders of specificity are also possible because SBS's Hypercomputer system is self-recursive. It uses its own algorithms to evolve itself. The system is capable not only of producing systems that are more simple than itself, it is also capable of producing systems that are more complex than itself.

    Cool, they've hit that holy grail of science fiction, the self-evolving computer. All you have to do now is build it a pair of robotic arms and it will build itself into a gigantic Übercomputer and take over the universe.

    Oh, and my favourite part:

    9. Because the Viva software system includes a formally accurate method for achieving an optimal solution to a problem, the layering of those optimal solutions is also optimal....

    It looks like they've solved that nasty "computer science" problem we've all been working on! No more slaving over algorithm design! Just type in the problem (in English I assume) and this thing will solve it optimally.

  • Sun already has boxes that approach those speeds and will shortly be beyond them. And you can shoot a bullet through an E10000 and it will still run, as long as you don't hit anything vitally important (like going through all the power supplies, for instance). Not so very different from this 'discovery'.

    Still, interesting.
  • Show me a LINPACK 1000x1000 or SPECfp95 benchmark and I'll *think* about considering this thing a supercomputer. Till then it's just hype.

    I wonder which Cray they're comparing it to. There's more than one, after all. They're probably comparing to something slow like a Y-MP or a J90. They might look stupid if they compared to a T90 or SV1...

    --Troy
  • The Cray T3E #1024, per their "About us" page.

    Well, that tells me a little more... but not much. There are 3 different models of that particular machine, depending on whether it uses 300, 450, or 600MHz Alphas. Based on the 1 TFLOP number they quote, I'm guessing they mean the model with the 600MHz chips (the T3E-1200E/1088).

    I liked the following little piece of idiocy from SBS's "About us" page [starbridgesystems.com]:

    The Cray machine, which costs approximately $76 million, can perform at a peak speed of one trillion instructions per second in a narrow class of applications. It cannot sustain performance at peak speed.

    Well, duh!!! Of course it can't sustain peak! There's no machine in the world that can sustain more than about 50% of peak performance on useful, real world code; usually that number's closer to 25-30% of peak. Unless SBS knows something serious about compiler technology that the rest of us haven't figured out yet, sustaining a system's peak performance is impossible.

    Their 12 teraop number is very suspicious, too. They define an "op" as a 4-bit integer add. The ops they quote on the T3E are 64-bit floating-point adds and multiplies. Apples and apples? I don't think so. There aren't too many interesting problems you can do with 4-bit integers, either; maybe extremely lossy signal processing, but that's about it.
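    A rough normalization, assuming charitably that a 64-bit add could be ripple-chained from sixteen 4-bit adds:

        # Back-of-the-envelope: turning SBS's 4-bit "ops" into
        # 64-bit-add equivalents by chaining sixteen 4-bit slices.
        sbs_4bit_ops = 12.8e12            # SBS's claimed 12.8 teraops
        slices_per_64bit_add = 64 // 4    # sixteen 4-bit slices per add
        print(sbs_4bit_ops / slices_per_64bit_add)   # 8e11 = 0.8 teraops
        # ...already below the T3E's ~1.2 teraflops, and a floating-point
        # add is far more work than an integer add to begin with.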

    I also noticed that SBS's press release page [starbridgesystems.com] has been taken down some time in the last day or so... I'd love to believe these guys have some kind of breakthrough, but from everything I've seen they're either extremely naive or lying out their collective butts.

    --Troy
  • It may be fast, but can it crack RC5? Just imagine that machine on Team Slashdot!
    ----------
  • Even ultra-secretive Transmeta had to file some patents and do some recruiting, and that got some attention. How did this big a breakthrough happen in such secrecy?
    I truly hope it's for real. I want to believe. But why is my baloney detector ringing?
    Let's hope my detector is faulty, shall we?
    --
  • Look carefully (and objectively, how rarely that happens) at the performance specifications. The IBM Pacific Blue did 1.2 TeraOps sustained peak running the actual ASCI codes (albeit in their own labs; SGI beat those numbers and did it on site). The SBS HAL-4rW1 did 12.8 TeraOPs doing a sequence of 4-bit additions, or 3.8 TeraOPs on a 16-bit adder.

    This means that the memory and I/O subsystems aren't even exercised. Nobody uses a 4 bit addition as a performance spec, not even Intel.

    The actual product description is unbelievable as well. The largest Xilinx FPGAs might be capable of being configured to fully emulate a 16-bit microprocessor. I haven't worked with them in a long time, but when I worked with the 4000 series I figured I could shoehorn a rudimentary 8-bit processor into the largest devices. (Which would mean that a rudimentary 8-bit microprocessor was produced for over one thousand dollars, incidentally. It's a bit cheaper to buy a PIC from Microchip.)

    They said that they reached these performance levels with 280 of the largest Xilinx FPGAs. My take on what they've done is cram as many 4-bit adders as possible onto a single FPGA and replicate that 280 times. They then had them all execute in parallel and pretended that this made up a supercomputer.

    Keep in mind that performance on an FPGA isn't stunning. We're talking on the order of 10 nanoseconds to do the 4 bit addition.
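    A quick sanity check on those numbers (the 10 ns figure is the one above; the per-chip adder count is my own rough estimate):

        # If a 4-bit add takes ~10 ns, one adder manages 1e8 adds/sec.
        # Hitting 12.8 teraops then needs 128,000 adders running flat
        # out with no I/O at all, i.e. ~457 adders on each of 280 FPGAs.
        ops_claimed = 12.8e12
        ops_per_adder = 1 / 10e-9         # 10 ns per 4-bit addition
        adders_needed = ops_claimed / ops_per_adder
        print(adders_needed)              # 128000.0
        print(adders_needed / 280)        # ~457 adders per FPGA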

    So... if they've even designed and built this thing (which I doubt) the specifications are a complete fabrication.

    I haven't checked yet, but browse through Xilinx's web site [xilinx.com]. If they don't mention this wonder of reconfigurable computing then it doesn't exist.
  • For those of you who don't know what an FPGA is: they are chips that can be dynamically reconfigured to serve a different purpose many times a second. A nice example that I once saw was a video decoding chip. It was a simple chip that reprogrammed itself several times for each frame: first to get the frame, second to decode the frame, third to display the frame, fourth to decode the sound, fifth to play the sound. And most of the work was done entirely in hardware. Because most of the decoding was hardware, it was extremely fast for the clock rate.

    This could be something along those lines. However, I must admit that I'm a little bit skeptical.
  • Imagine for a moment that this is real, and this box can run Linux. Then imagine enough of them to take up the space of IBM's box, all Beowulf'd together.... BWHAHAHAHAHAH!
  • I wanna get 4 of these babies and set up a Beowulf cluster... then we need to port some games... :)
  • Is it just me, or did we jump forward a few months to April? I see: it is actually a time machine. It is so fast that it has actually jumped back in time from April 1st to now...

  • The whole thing is written in Microsoft Word and then "Save as HTML"ed. The author is clueless enough to believe Microsoft's claim that Word supports HTML. It doesn't. You will notice that his numbered lists have all the points as "1.", and his tables don't work either.
  • I can understand how, if you were able to determine what you wanted out of your FPGAs fast enough, you could build a VERY fast machine. My problems are:

    1) How the heck is he reloading the FPGAs fast enough to be all that hot? I've not worked with Xilinx parts lately, but the Altera (one of Xilinx's competitors in the FPGA market) stuff I have used takes many milliseconds to reload. Maybe this is a feature of the Xilinx architecture, but that leads to my second problem:

    2) How the heck did he get enough information out of Xilinx to write his own compiler? Again, I've never tried, but when I asked Altera for information about the internal structure of their parts, I was told that was proprietary. Since the structure of the chips is the thing these companies are trading on, they are usually pretty close-mouthed about this sort of thing.

    3) Why isn't there an announcement on the Xilinx web site? If I were Xilinx and someone used my gear to beat Pacific Blue, I'd be shouting it from the rooftops and trying to drum up business with it.

    I wonder if they are busy signing up investors as we type?

    Michael Kohne
    mhkohne@discordia.org
  • This appears to be one of their patents - can anyone find any others?

    http://www.patents.ibm.com/details?pn=US05600845__&language=en [ibm.com]

  • err, Gilson :-)
  • They spelled scalar wrong:

    In addition to all of the tasks traditionally performed by supercomputers, SBS's Hypercomputer systems can perform the full range of functions requiring ultra-fast scaler processing, such as...

    There, you anti-spelling flamers, is a great example of how spelling _matters_. The BS meter
    goes off even louder when you can't spell your wild claims about your invention.
  • If it does run Windows 98, it will crash 60,000 times faster :)

    Kythe
    (Remove "x"'s from
  • When am I going to see a Quake 2 timedemo on this thing?
  • ...and call it "Take it with a grain of" Salt Lake City?
    They do have an economy model for only 2 million. Maybe one of the AC's who're as rich as they are brilliant will put one on their AMEX card and report back to us on how well it works. Be the first one on your block. Operators are standing by.
    (I've still got the articles on building that RCA computer around here somewhere, BTW, but I was strictly into analog back then).
  • It's not physically possible to reconfigure a Xilinx FPGA 1000 times a second. The only ones that come close are the low-end 3000 series, which are both hard to come by and not all that useful due to their small size. A 4000-series FPGA takes on the order of 50ms to reprogram, which means it could be reprogrammed at most 20 times a second. With 250 FPGAs you could easily get 1000 *total* reprogrammings a second, but then there's no reason for a 1ms timeslice on reprogramming.
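    The arithmetic behind that, spelled out (using the 50ms figure above):

        # 50 ms per reprogram caps one 4000-series part at 20
        # reconfigurations per second; only the whole fleet gets past 1000.
        reprogram_time = 50e-3                 # seconds per reconfiguration
        per_chip_rate = 1 / reprogram_time     # 20 reprogrammings/sec/chip
        fleet_rate = 250 * per_chip_rate       # 5000/sec across 250 FPGAs
        print(per_chip_rate, fleet_rate)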

    But even then, this all assumes that you have the bitstreams already available to download to the FPGA. Dynamically creating them is not so simple: just calculating the routing can take hours on a P2-300. So you have to have a pre-compiled set of bitstreams, which could be reconfigurable to a lesser extent (swapping pieces out and in). But if you have that, why not make a bunch of ASICs that do what those precompiled bitstreams do?

    Not that reconfigurable hardware isn't a neat and exciting paradigm, but these claims are so much cow feces (for now, at least).
  • Who cares? Does it run Linux? :-)
  • Buried in the article it says it can run NT or Unix. OK, so maybe they've got some sort of Intel or Alpha CPU buried in there for controlling all the fancy reconfigurable processes.

    But they expect us to believe they've designed radically new hardware as well as a brand new fancy programming environment which runs on multiple platforms in the short life of their startup, and it was all done by 1 guy?

    Ok. Sure.

    Think what he could do if he started hacking Linux.
  • by martian ( 7513 )
    Wahey! I've been waiting for significant developments in reconfigurable computing with FPGAs - these are great chips...
  • They are announcing this so-called hypercomputer, and they don't even have any patents on it.

    This is transparent nonsense. The only hint of legitimacy derives from the fact that they scammed a couple of Mormon newspapers and TV stations into buying it.

    Nice little fantasy though.
  • Pigs prepped and ready to fly sir....
  • Doesn't everyone and his dog recall the news stories over the last few months about the HP researchers building huge reconfigurable arrays of FPGAs and getting stonking performance out of them, even when there were high numbers of defunct chips in the mix?

    There's also at least one company producing circuit simulation platforms hundreds/thousands of times faster than pure software simulation platforms, for the IC design industry.

    What marks these guys out is that they can write hype with the very best Microsofties.

    Heck what do I know, they might be the same guys after some marketing courses.
  • Sure, they wrote an ASIC for doing a 4-bit adder and programmed it into a small legion of FPGA chips... that's the way a machine of this architecture operates. If you want to do satellite communications, you load in an ASIC designed to do satellite communications. If you're also doing text-to-speech conversion, then you swap in a text-to-speech ASIC... It's a paradigm shift in computing. You reprogram your FPGA chip to do different tasks on an as-needed basis...
  • http://www.starbridgesystems.com/Pages/release_1.html

    Looks like the perfect 3D accelerator... (-:

  • I'm not sure I want to know about your baloney detector.
  • This sounds like a very bad April Fools' joke... can't be true, can it? I'd like to hear from somebody who has really seen the thing WORK with his own eyes.
    Sorry, VERY hard to believe....
  • This article http://www.newscientist.com/ns/990109/newsstory1.html
    describes a similar machine where each FPGA simulates a bunch of neurons and cycles through 300 bunches a second, giving an effective neural net of 40 million neurons! They are trying to get it to control a robot kitten in an intelligent way.
  • April Fools Day is in APRIL.
    The owner of this domain is probably peeing his pants laughing, the domain registration money and site design time well spent.
  • Have a look at this patent [ibm.com], issued to its inventor, Kent L. Gilson.

    Abstract:
    An integrated circuit computing device is comprised of a dynamically configurable Field Programmable Gate Array (FPGA). This gate array is configured to implement a RISC processor and a Reconfigurable Instruction Execution Unit. Since the FPGA can be dynamically reconfigured, the Reconfigurable Instruction Execution Unit can be dynamically changed to implement complex operations in hardware rather than in time-consuming software routines. This feature allows the computing device to operate at speeds that are orders of magnitude greater than traditional RISC or CISC counterparts. In addition, the programmability of the computing device makes it very flexible and hence, ideally suited to handle a large number of very complex and different applications.

  • But can you protect the OSes from each other?
  • The company's stock is not for sale. Damn, get me in on the IPO.
  • Hmm, sounds just right for an infomercial. I wonder if it fits under the bed...
  • The real trick in systems like this is getting the advertised performance out of them. Yes, if you managed to pack those FPGAs to their theoretical maximum with multipliers, run them at their theoretical maximum clock rate, and do ABSOLUTELY NO COMMUNICATION, you might beat IBM's blue supercomputer. Maybe.

    But you don't EVER get real-world performance like that, for several reasons. One, you have a very complex piece of software that "compiles" the program for that sea of FPGAs, and your utilization is only as good as that software (and complex doesn't begin to describe it). Second, once you introduce communication into your computational model, everything goes to hell. You have to include room to route data (and hence "wires") between the chips, and that just eats up everything.

    Just for the record, I contracted for a company that built a virtually identical box, used for chip emulation. It had ~300 Xilinx chips, and had a VHDL->Xilinx compiler and router. You COULD run Win95 AND DOOM, at a clock rate of about 1000 Hz (i.e. VERY SLOW). But companies bought them. The price was about $750k to $1M, and that included a BIG profit margin. This thing is way overpriced. If you need more convincing of that, look up how much the EFF built Deep Crack for, and that was a one-of-a-kind box.
  • In fact, without going into details, the company is a competitor of Quickturn, and they did indeed get 1000Hz emulating a Pentium (that would be a VANILLA pentium, of course, not a MMX or P-II). It played Doom veeeeerrryyyyy sloooooowwwllly :-)
  • It's not that I don't think they have the germ of a good idea, but this has got to be a public relations ploy. I've seen too many start-ups "steered" by an "idea man" without clue one drive what could have been great technology into the ground because of just such outlandish claims.
  • IMHO this is just a bullshitty-buzzword-filled article.... :)

    You try to give everything to everyone, and you end up giving nothing - e.g. WinNT/9x...

    And besides, who spends 15 years working in the very low-level processor design field and ends up with a *tremendous* breakthrough in OOP products (as they claim in their release)?

    Flame On!
  • Ok, this supercomputer-FPGA thing looks like a stupid PR stunt, but don't write stupid things about nanotechnology.

    Lead and gold are different atoms, so nanotechnology (building objects by precisely assembling atoms) can't help here.

    But diamond is another story :-)
  • There are a few fundamental problems with this architecture which the author of the article overlooks:
    • FPGAs are bulky.

      The largest FPGA that I've heard of had a million gates on it. Pick-your-random-processor has 10-20 million transistors, giving it a gate-equivalent count in the single-digit millions. Implementing anything with FPGAs will take up several times more space than using a custom chip.

      This means that you will have a _big_ supercomputer.
    • FPGAs are slow.

      While FPGAs are reconfigurable and hence very flexible, the implementations that they come up with for a given logic configuration aren't optimal. This, combined with the performance overhead incurred by the components that make it configurable, means that an FPGA with a given logic pattern burned into it will be slower than an equivalent, optimized logic pattern implemented in CMOS.

      This is another important point - CMOS. While the machine on your desk may use CMOS or might add BiCMOS in there for a speed boost, supercomputers and servers have significant amounts of ECL circuitry in them to speed up critical logic paths. ECL technology is based on bipolar transistors, which switch much more quickly than the MOSFETs used in CMOS but generate far more heat. Used sparingly with aggressive cooling, they can double the performance of a chip or more. This leaves CMOS chips in the dust, and by extension anything built with an FPGA.
    • Flexibility Is Useful, But Not Phenomenally So.

      If you're shelling out the money for a supercomputer, then you have a good idea of the classes of problem that you're going to be running on it. This lets you choose the type of processor and interconnection architecture that you use so that it matches the problems that you plan to be running. If necessary, you design a custom ASIC for even better performance (as was done with Deep Crack). A reconfigurable architecture that was magically as fast as hybrid ECL/CMOS still wouldn't get you much of a performance boost, because you're already fairly close to an optimum hardware implementation. With modern processors, this is especially true, because the on-chip scheduling and pipelining is good enough to keep most of the chip busy if the problem even approximately matches the chip's logic capabilities.
    • Communications Can't Be Reconfigured That Easily.

      There was much mention in the article about using processors that were tightly coupled. They'd need to be, to share logical functions with each other. However, this is extremely difficult to accomplish even with conventional processors. The communications traffic goes up with the clock speed and as the square of the number of processors (until it saturates the processors, at which point it goes up linearly); a rough sketch of this quadratic growth follows this comment. Processors have enough trouble communicating with other chips as it is; this is why new memory architectures are coming out. Asking n=lots processors to communicate tightly with each other and with memory in a reconfigurable manner is asking for a motherboard that can't be built. In practice, you'll wind up implementing either an n-cube architecture that allows fast communication but limits connectivity, an anywhere-to-anywhere mesh that has wonderful connectivity but seriously limits the amount of traffic that can be supported, or a hierarchical system using one or both of the above. The system as described just won't work.
    • The Compiler Will Be A *Bitch*.

      It's hard enough to optimize well with hardware that doesn't change. Figuring out the best way to implement an algorithm using both hardware and software feels like an intrinsically hard problem. Your compiler will have to try to solve this. IMO this will result in either a compiler that requires the user to explicitly state what they want done in hardware, or else a compiler that tries to optimize but does it badly, or else a compiler that is never finished.


    In summary, I think that there are a number of issues that the writer of the article was not aware of. I hope that the designers of the system took them into account, because otherwise this will be a neat-sounding project that disappears once the investors realize that there isn't going to be a product.
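    A sketch of the communications point above (my own illustration, not from the article): with n tightly coupled processors, the link count of an anywhere-to-anywhere mesh grows quadratically, which is exactly what makes the wiring unbuildable.

        # Pairwise links in a full mesh: n*(n-1)/2, i.e. O(n^2).
        # A hypercube needs only (n*log2(n))/2 links (power-of-two n)
        # but limits which processors can talk to each other directly.
        import math

        for n in (8, 64, 256, 1024):
            full_mesh = n * (n - 1) // 2
            hypercube = n * int(math.log2(n)) // 2
            print(f"n={n:5d}  full mesh: {full_mesh:7d} links  "
                  f"hypercube: {hypercube:5d} links")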

  • I think that if we could compile Linux into the native code, a computer would finally be worthy of running it.
  • Here's Xilinx's Virtex FPGA chip:

    Nine devices, from 50,000 to 1,000,000 system gates (1,728 to 27,648 Logic Cells)
    Over 500 user I/O pins
    Many package options, including leading-edge 1.0mm FinePitch ball grid arrays and 0.8mm chip scale packages
    Leading-edge 2.5-Volt, 0.22 micron, five-layer-metal CMOS process
    Fully 5-Volt tolerant I/Os
    Timing-driven place and route tools allow compile times of 200,000 gates per hour (400 MHz Pentium II CPU)
    Vector-based interconnect for fast, predictable, core-friendly routing across all densities
    Fully 64-bit/66 MHz PCI and CompactPCI compliant

    Okay, so say they take 280 of these at the 1 million gate density = 280,000,000 gates. Currently, the Pentium II has 7.5 million transistors, or roughly 1.875M equivalent logic gates at ~4 transistors per gate (http://www.zdnet.co.uk/news/1998/36/ns-5490.html [zdnet.co.uk]).

    Just raw silicon. Let's say they had a bunch of pre-compiled circuits; then there wouldn't be any lag in switching, as they say. (I must admit that 1,000 switches per second would be a little overblown.)

    But, just for equivalence, let's say we had 149 Pentium IIs connected in PARALLEL (which is currently impossible; I think that LLNL uses PPros, currently at 2-chip SMP). Such a system WOULD kill a Cray. But the Pentium can't do that.
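    Where that 149 comes from, reconstructing the poster's arithmetic:

        # 280 million FPGA gates divided by a Pentium II's ~1.875 million
        # gate equivalents (7.5M transistors at ~4 transistors per gate).
        fpga_gates = 280 * 1_000_000
        pii_gates = 7_500_000 / 4
        print(fpga_gates / pii_gates)     # ~149.3 Pentium IIs' worth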

    Everyone who has read the Beowulf papers knows that the overall speed of the system is entirely dependent on the lag of the interconnect between systems. So, for fun, let's say we could put 18 chips on a board, and put 20 boards on a 128-bit local bus. That would lead to some damn fast computing. (Remember Deep Crack in the last RSA contest? It only ran at a system speed of 80 MHz.)

    I believe they are at least on to it.
  • There is no way these guys are for real. First off, there are already companies currently working on this technology; what makes these guys any smarter and more knowledgeable than the next person? In addition, there is mention of only two people involved in this project, only one with any sort of electrical engineering expertise. I highly doubt that they have a functional product. After visiting the web site it becomes very apparent that this organization is not an extremely professional one (take a look at the press release). I am quite skeptical of any and all claims these people have made. But hey, I hope I am wrong; this thing would kick a**.

    Zebra X
  • Speed is great but it looks like you'll run out of memory and/or disk space before a real simulation gets finished running.
  • I don't know... Like most revolutionary advancements in technology, it seems to have been developed by two different people simultaneously. One, a huge company, the other, a freaking genius who has been building them since he was 16. If it turns out to be baloney, then so be it. But stranger things have happened before.

    This definitely sheds light on what Transmeta has been up to, however, and why Linus is working for them. ;)
  • Actually, this seems like a quite elegant solution to the FPGA reconfiguration problem to me. The paragraph about how their Viva software determines, many times a second, whether each function would be best done in hardware or software, and reconfigures the FPGA appropriately, sounds rather elegant to me. I did some work with FPGAs once, and was limited to "booting" the processor from a PROM and executing silly little instructions; with this sucker, you could specify a metalanguage with ALL the possible tasks you could throw at the processor, and have the processor make its own instructions to execute whatever subset of the language was needed! Any additional power could be tossed in there by creating additional parallel instruction units for the most frequently executed instructions... the possibilities are mind-boggling.

    Of course, there is a limit to how many things you can throw at the system at one time. The ridiculously high benchmark was for a 4-bit adder; of COURSE a bunch of special-purpose parallel chips that did nothing but 4-bit ADDs would be able to outperform a Cray, with the speed and power reductions mentioned. But I'm betting flipping the FPGA into a dynamic x86 emulation mode, with instruction parallelism, would slow the whole thing down to something more reasonable. Even still, it should outperform anything the x86 chipmakers currently manufacture. The graphics possibilities are what really grab me; the documents mention that as well. Imagine a 3D accelerator that doesn't have a fixed instruction set! If an application used a particular instruction 80% of the time, Viva would adapt the FPGA grid to create massive numbers of parallel execution units for that instruction, and the speed would go through the roof.

    I'm really happy about this; I wonder how the mainstream processor manufacturers are going to react once the possibility of this thing becoming mainstream technology shows up in the press.

  • ... they'd be able to spell "scalar" correctly in their press release. :-)
  • by BiGGO ( 15018 )
    How the hell are they going to do such a thing???
    I cannot understand how that technology is going to work at all.
    Except for "magic" or "ghosts".

    (reprogrammable chips? like EEPROM? doesn't sound logical to me)

    Please explain...
  • I'll believe it when I hear it from a reputable source. Company press releases and local news stations are notorious sources for hoaxes.

    Bleep!

  • Is this DiMora above the greatest engineering minds, or will he give the media more obvious fun?

    Will Salt Lake be dubbed the "Shifty City" theologically, commercially... quite technically?
  • Of course it would be cool if this thing lived up to the press release, but Utah is quite the fraud capital, and their marketing committees do pull a few extra punches to get the job done.

    Maybe the former Olympic committee members have something to do with this...

    Man, if I could get fuzzy dice for my rear view mirror and put this thing in my trunk, I could have a car MP3 player AND composer, because it would write original kick-ass MP3s. The babes would be all over me.
  • I think a twelve gauge shotgun would convince me of the fault tolerance. Otherwise, I'm not forking out $26 million for one.
  • Is this how Utah writes articles? I've not read something so badly written in a very long time. If there weren't an exclamation mark at the end of every sentence, I would at least have expected a big shiny happy clown face to appear! =-)

    "It's really fast!" "You could shoot it and it'll keep working!!" "It can rearrange its wires a thousand times a second!!"

    Next time, Utah News, some *fact* and *detail* would be nice.
  • At "60,000 times the speed of a home PC," this thing should be able to raytrace in REALTIME (30 fps) what would take my humble K6-2 nearly half an hour per frame.

    *shudder*

    All we would need then is a neural VR interface, and then things could get really interesting...
  • Someone do the math -- how long would it take for this monster to decrypt everything ever sent across the net?
  • Although the weather feels like spring lately, I don't think it's April just yet. Even so, the whole mission statement smacks of bull$hit. I'd love for it to be true, but it reminds me of the feeling I get from reading press releases from people claiming to have invented a perpetual motion machine, discovered a room-temperature superconductor recipe, etc...

    Wouldn't mind being wrong though...
  • When you go to internic and search for "starbridge" you get:
    ------------------------
    Whois Query Results

    Starbridge (STARBRIDGE2-DOM) STARBRIDGE.NET
    Starbridge Communications (INTERNET37-DOM) INTERNET7.COM
    Starbridge Communications (ADULT24-DOM) ADULT21.COM
    Starbridge Communications (SEXSUPERSTORES-DOM) SEXSUPERSTORES.COM
    Starbridge Communications (SEXYNUDEGIRLS2-DOM) SEXYNUDEGIRLS.COM
    Starbridge Communications (FREDDIEFREAKER-DOM) FREDDIEFREAKER.COM
    Starbridge Communications (FREAKER3-DOM) FREAKER.COM
    Starbridge Communications (STARBRIDGE-DOM) STARBRIDGE.COM
    -----------------------
    Interesting scam though... what do THEY get out of it?
  • I should have searched internic for "starbridgesystems" instead of just starbridge. Here's what you get then...
    -------------------------
    Registrant:
    Circa 65 (STARBRIDGESYSTEMS-DOM)
    208 1/2 25th Street
    Ogden, UT 84401
    US

    Domain Name: STARBRIDGESYSTEMS.COM

    Administrative Contact, Technical Contact, Zone Contact:
    Light, Doug (DL8191) dlight@LGCY.COM
    (801)994-7337 (FAX) (801)994-7338
    Billing Contact:
    Gleason, Matt (MG11172) MGLEASON@CIRCA65.COM
    (801)392-2600 (FAX) (801)392-6350
    --------------------------------
    So, maybe they're legit?
  • And what are 'cellular autonoma' anyway? Are they like cellular automata, but loners?

    And do you get the impression that they only have (at most) one working system?

    The whole thing just seems like some CS-naive EE has managed to get carried away with himself.

    It reminds me of other naive projects, like 'The Last One' (UK, circa 1982, which was to be the last program ever written, because it would take a natural language description of the problem and write the solution for it), or the miracle data compression algorithm reported in Byte magazine a few years back that could repeatedly compress its own output without data loss. Both failed, because both were quite stunningly naive about the nature of the problems they were trying to solve. This super-specific hypercomputer sounds just the same.
  • I'm a configurable computing researcher, so I'm a fan of the FPGA and what it has to offer to the computational world. This announcement, however, is a typical demonstration of how to abuse computer benchmarking. Their claim of being the world's fastest computing architecture is based on comparing their 12.8 TFlop number to IBM's 1.2 TFlop number. But they're just numbers. To understand what they mean, you have to look at the units. The same way that you can't say that 12 inches is a lot longer than 1 yard, you can't really say that executing 12.8x10^12 4-bit add operations is more than IBM's 1.2x10^12. For one thing, the press release doesn't even say what the unit on IBM's number is. At minimum it's probably a 32-bit integer add, and more likely an IEEE floating-point operation. Comparing a 4-bit add to an IEEE 32-bit floating-point add is like comparing an inch to a yard.
  • I've reviewed the information about HAL, and am very disappointed. There is misleading or just plain wrong information at the web site. I'm not saying that it's not a good machine, just that I would be careful drawing any real comparisons from the HAL documents.

    1) If you read the "About Company" link, it states that they have not actually built the machine that they claim will reach 12.8 Tops; it states that it will be built by Feb 99. I believe that they scaled up the performance of a machine much, much smaller than the one they claim betters the IBM machine. Note that scaling performance from a smaller machine is considered a big faux pas, as it totally neglects parallel overheads.

    2) They quote 4-bit operations as though they were equivalent to the 32-bit floating point ops that were tested on BlueMountain. They are not similar. The performance drops off to 3.8 Tops when they use 16-bit operations, but the IBM is still doing twice as much real work, because it is using 32-bit math.

    3) They have not actually stated which benchmarks they used (if they actually had a machine to test). The IBM used (I think) LinPeak, which is a matrix multiply operation. That benchmark is very good at showing off parallel architectures, but the size of the matrix must be quoted in the results. I don't see that information.

    4) The HAL computer is very under-memoried. This might be easy to fix. The rule of thumb is 1 byte of RAM per Flop. That's why the BLUE machine has 2.5 Tbytes of RAM for 3.8 TFlops. The max listed for HAL is 100GB, or about 1-2 orders of magnitude too small.

    5) One page at StarBridge lists the I/O performance in comparison to a Cray T3E-1024, but the numbers are different in the press release. The press release says 50GB/s, while the company link indicates 50MB/s: 3 orders of magnitude apart. I wonder if one of those is actual I/O data, and the other (50GB) is supposed to be memory bandwidth, which is a very, very different concept from I/O. If the machine only has 50MB/s I/O but 100GB of memory, then it will take about 2000 seconds to load memory from disk (or, say, to do a checkpoint of the current data set).

    6) The marketing data for the Cray T3E-1024 is wrong, which in my mind negates most of the comparisons. The Cray T3E-1024 does not cost $76m, but the Cray/SGI Blue Origin2000-3000 does cost about that much. It states the Cray does not have fault tolerance, but it most certainly does. It states that the maximum I/O of the T3E is less than 2GB/s, which I know is wrong. (Did someone in marketing write this without double-checking the numbers??)

    7) The company link states that the HAL 4rW1 has a minimum sustained rate of 3.8 Tera-ops. Would any company really claim that there is no program that would perform worse than that? If that is a challenge, I'd be willing to put a huge bet down that I could write a program that will get LESS than 3.8 Tops!!!

    8) Programming. The company page does not state that you can program this in C or Fortran or any other common language. Instead it only talks about the GUI that you can use to describe your problem, after which the software will automagically start running (at 3.8 Tops minimum!!). If true (no C or Fortran), this is a HUGE drawback. Also, how can they possibly claim that their machine does not have the same parallel execution overheads that are common on other parallel machines?? Just because you can change your network on the fly (and just how long does that change take??) does not mean you are immune to Amdahl's law!!
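    For reference, here is Amdahl's law, the point being invoked (a sketch with made-up serial fractions):

        # Speedup on n units when a fraction s of the work is serial:
        # speedup = 1 / (s + (1 - s)/n); as n grows it caps at 1/s,
        # no matter how fast the reconfigurable hardware is.
        def amdahl(s, n):
            return 1 / (s + (1 - s) / n)

        for s in (0.01, 0.05, 0.10):
            print(f"serial {s:.0%}: speedup on 280 units = "
                  f"{amdahl(s, 280):5.1f}x, hard limit = {1 / s:.0f}x")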

    So to sum up:
    1) 4-bit math is NOT equivalent to 32-bit math.
    2) What are the benchmarks and data sets?? What were they programmed with? Was the test actually run on a full-scale machine, or are these just scaled results??
    3) How do they deal with Amdahl's law?
    4) The hardware is under-memoried, and very likely has less-than-perfect I/O performance.

    -r.
  • The ASCI RED machine is not at LLNL, but at Sandia. It does use PPros. But a parallel machine with P-IIs is also possible; it just takes an investment in designing your own chipset. Most chipsets support low-order SMP natively for the PPro, but for the RED machine Intel used custom networking chipsets to build the large machine. This could be done with Pentium-IIs if you wanted to.
    And a 149-processor Pentium-II machine would not kill a Cray. Well, not a large one at least. I would guess a 149-processor P-II machine would come in around the same as a 100-processor Cray T3E or Origin2000. But since they are selling T3Es as large as 1,380 processors, and Origins in the thousands of processors, a small 149-processor machine would not come close.

    -r.
  • It seems pretty simple to me. If it were fake and making money were the scam, then the company would have to be publicly traded; it's not. Since the company's only income is from licensing the technology, they would have to have the goods.

    So, I don't think it is fake.
  • This reconfigurable machine is a bunch of BS. Here's why:

    The fastest reconfigurable Xilinx FPGA is a 200MHz part.

    Assuming that you can do 1 FLOP in one cycle, that means that their machine would need 16 TFLOPS / 200MHz = 80,000 computational elements.

    This same FPGA has 500,000 gates per chip. Assuming that you could fit 100 floating-point pipelines onto this FPGA (5,000 gates per pipeline), this would mean that they would need 800 FPGAs. Does this fit in the box they described? I think not!

    Think about it from the power side as well. If we assume that those FPGAs are 5W chips, that's a total of 4kW of power consumption. Minimum.
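    The poster's arithmetic, made explicit (all figures are the assumptions stated above):

        # 16 TFLOPS at 200 MHz needs 80,000 one-FLOP-per-cycle pipelines;
        # at 100 pipelines per 500k-gate FPGA that is 800 chips, and at
        # 5 W per chip, a 4 kW machine. Minimum.
        pipelines = 16e12 / 200e6      # 80,000 computational elements
        fpgas = pipelines / 100        # 800 FPGAs
        power_watts = fpgas * 5        # 4,000 W
        print(pipelines, fpgas, power_watts)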

    And we haven't even broached the subject of support components: real logic elements for control, RAM for storing any kind of software that is needed, I/O subsystems, software support layers... (Oh yeah, that's right, maybe Viva is the real-language OS that the government has been secretly developing all these years to do mind control on us all ;)

  • Hey, maybe now we have enough hardware to run Windows 2000's bloat. :)

    ---
    gr0k - he got juju eyeballs - http://www.juju.org [juju.org]
  • What fps will I get with it in Quake 2?
    If it's over 1000, I'll buy one! ;-)
  • If they built this supercomputer-class device, why would they put an RJ-11 connector on it?

    Witness http://www.starbridgesystems.com/about.html under the features of HAL-4rW1.

    Why bother?
