Topics: Supercomputing, Hardware, Hacking, Build

Adapteva Parallella Supercomputing Boards Start Shipping

hypnosec writes "Adapteva has started shipping its $99 Parallella parallel-processing single-board supercomputer to initial Kickstarter backers. Parallella is powered by Adapteva's 16-core and 64-core Epiphany multicore processors, which are designed for parallel computing, unlike commercial off-the-shelf (COTS) devices such as the Raspberry Pi that don't support parallel computing natively. The first model to ship has the following specifications: a Zynq-7020 dual-core ARM A9 CPU complemented by an Epiphany Multicore Accelerator (16 or 64 cores), 1GB RAM, a MicroSD card, two USB 2.0 ports, four optional expansion connectors, Ethernet, and an HDMI port." They are also releasing documentation, examples, and an SDK (brief overview; it's Free Software too). And the device runs GNU/Linux for the non-parallel parts (Ubuntu is the suggested distribution).
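For a sense of the programming model the SDK exposes, here is a minimal host-side sketch in C. It follows the Epiphany eSDK's e-hal host library as publicly documented (e_init, e_open, e_load_group, e_read); the 4x4 workgroup, the 0x2000 buffer offset, and the "e_task.elf" file name are placeholders for illustration, so treat the exact values and signatures as assumptions rather than a definitive recipe.

    #include <stdio.h>
    #include <unistd.h>
    #include <e-hal.h>   /* Epiphany eSDK host library (runs on the ARM side) */

    int main(void)
    {
        e_epiphany_t dev;
        int result = 0;

        e_init(NULL);          /* use the default hardware description file */
        e_reset_system();

        /* Open a 4x4 workgroup (the 16-core chip) starting at core (0,0). */
        e_open(&dev, 0, 0, 4, 4);

        /* Load a device-side ELF (built with e-gcc) onto every core and start it. */
        e_load_group("e_task.elf", &dev, 0, 0, 4, 4, E_TRUE);

        usleep(100000);        /* crude: give the cores time to finish */

        /* Pull one int back out of core (0,0)'s local memory at offset 0x2000. */
        e_read(&dev, 0, 0, 0x2000, &result, sizeof(result));
        printf("core (0,0) returned %d\n", result);

        e_close(&dev);
        e_finalize();
        return 0;
    }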
This discussion has been archived. No new comments can be posted.


  • by Anonymous Coward

    The first comment to mention MAME or BitCoin wins.

  • by Anonymous Coward

    If all you are gonna do is advertise, at least do it right!

    There is no micro SD included by default and the connectors are micro USB and micro HDMI. Big fail!

    • Damn them for making it use the same connectors and including the same standard equipment in the base package as every other similar product...

  • by Anonymous Coward

    Very well:

    Imagine a Beowulf Cluster of these!

    • With 64 cores, I'd say it's already a cluster. A dozen of these ($1200) would have 768 cores and fit in a microatx case. :)
      • "With 64 cores, I'd say it's already a cluster. A dozen of these ($1200) would have 768 cores and fit in a microatx case. :)"

        But what about performance? For example, how does it perform at parallel integer math (arguably the most common use for these things), as compared to a top-line, price-comparable GPU card?

        That's what I want to know. I didn't search for a long time, but I didn't find info on that.

        • by dbIII ( 701233 )
          It depends. If you can get them on a board that can address 32GB or more of memory directly, then they'll be able to handle a lot of tasks that GPU cards just cannot touch without a lot of waiting around to be fed data, or without careful design of those tasks to make them fit into the memory of the GPU card.
      • by AmiMoJo ( 196126 ) *

        I can fit over 9000 bottle caps in a medium sized rainwater barrel. Not sure what I'd do with it though.

  • by Sparticus789 ( 2625955 ) on Tuesday July 23, 2013 @03:10PM (#44364687) Journal

    I could buy enough of these to cover the underside of the floor of my house and mine Bitcoins during the winter. Then I get radiant heat and useless fake money (which is probably just NSA's password cracker anyways).

    • by Anonymous Coward

      I literally don't even use heating. My heating is computer based.

      I like to think of it as Efficient Heating.

      • If you're going to buy hardware for Bitcoin mining, there are much, much more efficient alternatives, and they still produce plenty of heat.
      • I think USD is useless fake money, because I cannot use it locally, but there are other places where you can use it to buy plenty of stuff, and then there are exchanges.
      • The software is open source so you can see if it's cracking passwords for yourself, instead of randomly guessing.
      • I may be able to line the bottom of my floor with GPUs, connected via custom PCI extension cables to a large (really large) chassis. But if I take the numbers into account: I have about a 1,000 square foot house. Let's say an average-sized GPU is 4" x 12" (just for round numbers), and let's assume that I place 2 GPUs per square foot. That comes out to 2,000 GPUs, and a lot of money.

        Think I will stick to wearing slippers in the winter.

  • So it's interesting: a lightweight ARM processor, without anything better than micro USB and micro HDMI. Neat, yes, but really? Useful? Maybe as a wireless router or some other PoE-like device, but as a useful processing system? Um...

    Even linking many of these together - neat, but again, the world of MPI is based on completely different processor designs and interconnects; you're talking about a huge amount of time and effort to replicate something on a unique platform which may or may not ever see widespread acceptance.

    • Re:Tiny but useful? (Score:4, Informative)

      by Jeremy VanGelder ( 2896221 ) on Tuesday July 23, 2013 @03:23PM (#44364817)
      The ARM cores serve as a host for the Epiphany cores, roughly similar to the way an X86 CPU serves as a host to your video card. Epiphany is not ARM, it is a chip with a number of 1 GHz RISC cores that all communicate via a network-on-chip. So, it is optimized for doing a lot of floating-point arithmetic at very low power consumption.
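      To make that host/accelerator split concrete, here is a rough sketch of the device side: a tiny C program, built with the Epiphany cross-compiler (e-gcc), that every core runs independently and whose result the ARM host can read back over the memory map. The 0x2000 offset is arbitrary and e_get_coreid() is used as I understand it from the e-lib documentation, so treat the details as assumptions.

        #include <stdint.h>
        #include "e_lib.h"   /* Epiphany device-side support library */

        int main(void)
        {
            /* Each core identifies itself and drops a result into its own local
               memory at a fixed offset the host knows about (0x2000 here). */
            unsigned coreid = e_get_coreid();
            volatile uint32_t *result = (volatile uint32_t *)0x2000;

            *result = coreid * coreid;   /* stand-in for real per-core work */

            while (1)
                ;                        /* idle; the host reads the result */
        }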
  • by kwerle ( 39371 ) <kurt@CircleW.org> on Tuesday July 23, 2013 @03:16PM (#44364751) Homepage Journal

    Anyone out there in /.-land plan on getting these for a real project?

    Tell us about it! What language/OS/purpose?

    Just curious...

  • by IAmR007 ( 2539972 ) on Tuesday July 23, 2013 @03:33PM (#44364933)
    I'm skeptical as to how useful this chip will be. High core counts are making supercomputing more and more difficult. Supercomputing isn't about getting massively parallel, but rather high compute performance, memory performance, and interconnect performance. If you can get the same performance out of fewer cores, then there will usually be less stress on interconnects. Parallel computing is a way to get around the limitations on building insanely fast non-parallel computers, not something that's particularly ideal. For things like graphics that are easily parallel, it's not much of a problem, but collective operations on supercomputers with hundreds of thousands to millions of cores are one of the largest bottlenecks in HPC code.

    Supercomputers are usually just measured by their floating point performance, but that's not really what makes a supercomputer a supercomputer. You can get a cluster of computers with high end graphics cards, but that doesn't make it a supercomputer. Such clusters have a more limited scope than supercomputers due to limited interconnect bandwidth. There was even debate as to how useful GPUs would really be in supercomputers due to memory bandwidth being the most common bottleneck. Supercomputers tend to have things like Infiniband networking in multidimensional torus configurations. These fast interconnects give the ability to efficiently work on problems that depend on neighboring regions, and are even then a leading bottleneck. When you get to millions of processors, even things like FFT that have, in the past, been sufficiently parallel, start becoming problems.

    Things like Parallella could be decent learning tools, but having tons of really weak cores isn't really desirable for most applications.
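    For anyone who hasn't hit that wall: the collective-operations bottleneck shows up with something as innocuous as a global sum, which forces every rank to communicate before anyone can proceed. A minimal sketch using the standard MPI C API (the per-rank values are just filler):

      #include <stdio.h>
      #include <mpi.h>

      int main(int argc, char **argv)
      {
          MPI_Init(&argc, &argv);

          int rank;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);

          /* Every rank contributes a partial result; MPI_Allreduce combines them
             and hands the total back to all ranks.  Cheap on a few cores, but on
             hundreds of thousands of cores the synchronization and network
             traffic behind this single call become a dominant cost. */
          double partial = (double)rank;
          double total = 0.0;
          MPI_Allreduce(&partial, &total, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

          if (rank == 0)
              printf("global sum = %f\n", total);

          MPI_Finalize();
          return 0;
      }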
    • But indeed, it is the learning experience that is required, because cores are not getting particularly faster, and we are going to have to come to grips with how to parallelize much of our computing. The individual cores in this project may not be particularly powerful, but they aren't really weak either; the total compute power of this board is more than you are going to get out of your latest Intel processor, and it uses a whole lot less power. Yes, it isn't ideal given our current algorithms and ways of writing software.

      • Well, x86 CPUs are designed to do a hell of a lot more than compute. Their advanced caches and other complex features take a lot of die area but make them well suited for general computing and complex algorithms.

        You are right that our current algorithms will have to change. That's one of the major problems in exascale research. Even debugging is changing, with many more visual hints to sort through millions of logs. Algorithms may start becoming non-deterministic to reduce the need to communicate, for example.
    • Very *nice* comment -- spot on.

      The only other thing to mention is that supercomputing trades latency for bandwidth, i.e. high latency but vastly higher bandwidth.

      Intel does a great job of masking latency on x86, so we get "relatively" low latency for memory, but its bandwidth is crap compared to a "real" supercomputer or GPGPU.

    • by dargaud ( 518470 )
      So how does this compare to a, say, Xeon Phi [wikipedia.org] ?
    • by ShieldW0lf ( 601553 ) on Tuesday July 23, 2013 @04:25PM (#44365463) Journal

      This device in particular only has 16 or 64 cores, but the Epiphany processor apparently scales up to 4,096 processors on a single chip. And, the board itself is open source.

      So, if you developed software that needed more grunt than these boards provide, you could pay to get it made for you quite easily.

      That's a big advantage right there.

      • by AmiMoJo ( 196126 ) *

        So it's an evaluation board that may lead you to contract them for a larger, as-yet-undeveloped device? That's fine, but it isn't really selling it to us.

    • Parallel computing is a way to get around the limitations on building insanely fast non-parallel computers

      By limitations, I'm assuming you mean the laws of physics.

      Parallel computing is ... not something that's particularly ideal

      It's merely a new paradigm for continuing to process data faster, and it won't be the last.

      High core counts are making supercomputing more and more difficult. Supercomputing isn't about getting massively parallel ...
      collective operations on supercomputers with hundreds of thousands to millions of cores are one of the largest bottlenecks in HPC code.

      The Epiphany architecture is currently limited to 4096 interconnected cores because all the registers and memory (RAM) are memory mapped and the address space is limited. So if you are using 64-core chips, that's an 8x8 grid of chips.

      Supercomputing isn't about getting massively parallel, but rather high compute performance, memory performance, and interconnect performance. If you can get the same performance out of fewer cores, then there will usually be less stress on interconnects.

      Communication between cores is actually quite fast: 102 GB/s of network-on-chip bandwidth and 6.4 GB/s off-chip. So for 4096 cores, memory bandwidth...
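      For anyone wondering where the 4096 limit comes from: in the published Epiphany architecture reference, each core owns a 1 MB window of the flat 32-bit address space, and the upper 6+6 bits of an address name the core's row and column in the mesh, so at most 64 x 64 = 4096 cores are addressable. A sketch of that layout (the bit positions are as I read the manual, so treat them as an assumption):

        #include <stdint.h>

        /* Epiphany global address, as described in the architecture reference:
           bits [31:26] = mesh row, [25:20] = mesh column, [19:0] = offset within
           the core's 1 MB local window.  6 + 6 bits of core ID gives at most
           64 * 64 = 4096 addressable cores, hence the limit mentioned above. */
        static inline uint32_t epiphany_global_addr(uint32_t row, uint32_t col,
                                                    uint32_t offset)
        {
            return (row << 26) | (col << 20) | (offset & 0xFFFFF);
        }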

      • by dbIII ( 701233 )

        By limitations, I'm assuming you mean the laws of physics

        It's still within the realms of manufacturing constraints at this point. A co-worker was making diodes a couple of atomic layers thick before 2000 but making a circuit at that scale in 2D is going to take a lot more work.

    • by phriot ( 2736421 )
      The stated goal of the project is to offer affordable access to tools for learning how to do parallel programming. At $99 for the board and very low power consumption, I would think that this makes learning easier than building your own cluster, no?
    • by dbIII ( 701233 )

      Supercomputing isn't about getting massively parallel

      A lot of it is, even if it isn't all like that. For instance, a lot of seismic data processing is about applying the same filter to tens of millions of traces, which are effectively digitised audio tracks. Such a task can be divided up to a resolution of a single trace, or whatever arbitrary number makes sense given the available hardware.

      So even if it "isn't really desirable for most applications" there are still plenty of others where it is desirable.
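      As a sketch of how cleanly that kind of workload splits, here is the per-trace filtering pattern in C with OpenMP. The filter body is a toy stand-in; the point is only that each trace is independent, so the loop can be carved up at whatever granularity the hardware likes.

        #include <stddef.h>

        /* Apply a filter to one digitised trace; toy smoothing as a stand-in. */
        static void filter_trace(float *trace, size_t samples)
        {
            for (size_t i = 1; i < samples; i++)
                trace[i] = 0.5f * (trace[i] + trace[i - 1]);
        }

        /* Traces are independent, so the loop parallelizes trivially: split it
           per trace, per block of traces, or per node in a cluster. */
        void filter_all(float **traces, size_t ntraces, size_t samples)
        {
            #pragma omp parallel for schedule(static)
            for (size_t i = 0; i < ntraces; i++)
                filter_trace(traces[i], samples);
        }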

  • Where do people get their definition of supercomputer? A supercomputer is what you have when your compute needs are so large that they shape the hardware, the network, the building, and the power bill. This thing is just a smallish multicore chip, like many others (now and in the past!)

    • Where do people get their definition of supercomputer?

      From the 1960s. The CDC 6000s designed by Seymour Cray were the first "supercomputers". Each "core" had about 30 MIPS.

  • This thing is promised to do 90 Gflops and costs $100. An HD7870 can do 2500 Gflops for $300. Sure, you need to build a rig around it, but you'll still be way better off than soldering together a tower of 25 of these boards.
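    Taking those numbers at face value: 90 GFLOPS / $100 is roughly 0.9 GFLOPS per dollar, while 2500 GFLOPS / $300 is roughly 8.3 GFLOPS per dollar, so on raw throughput per dollar the GPU wins by about an order of magnitude. The Parallella's pitch is less about FLOPS per dollar and more about power draw, openness, and each core running ordinary C code rather than a GPU programming model.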

    • Now mount that HD7870 inside an RC plane, or a quad drone.
      The closest you can get is a Mali T604 doing 68 GFLOPS or a Mali T658 at 272 GFLOPS (theoretical numbers, but everyone, including AMD, uses those).

      • Bingo. If you've seen some of the crazy acrobatic stuff being done with quadcopters over on TED, that is using several remote PCs and remote control. The programming could probably all be packed into one of these boards and built right into each copter.

  • A supercomputer is a system that has multiple processors functioning in parallel, be it many individual machines networked together, a single machine with several processors, etc.

    The term supercomputer is a very old one, from back before you could even fathom purchasing a machine capable of housing multiple CPUs, well, unless you were a university or a very well-funded trust fund geek.

    By the original definition, most of our phones are supercomputers.

  • These boards are only half the solution to a parallel problem. I used to write satellite imaging software that was parallelized on a 12-CPU server. A lot of work went into the code necessary to parallelize the mapping and DTM algorithms. It wasn't trivial, either. I'm failing to see the usefulness of these boards for anything other than intensive scientific computation, because if the code being run isn't written for parallel processors, you're getting no advantage from running it on a multicore/multiprocessor system.

    • You are missing specialized applications written specifically for that kind of system. Which is indeed an easy thing to miss, because they don't exist, as that kind of system didn't exist until today.

      I would have bought a few 3 years ago, but I don't have a need for them now.

  • How is this any different from IBM's Cell processor? You know, the one in the PS3. Sure, it didn't have as many cores, but it's the exact same thing, and it didn't do well. A big part of the problem was the overhead caused by memory transfers from the host system to the individual cores. The other part was that each core only had 256KB of local RAM; these only have 32KB!
    • by cruff ( 171569 )

      How is this any different from IBM's Cell processor?

      Can you actually buy a single Cell processor or even a dev board for one?
