AMD's OpenCL Allows GPU Code To Run On X86 CPUs

eldavojohn writes "Two blog posts from AMD are causing a stir in the GPU community. AMD has created and released the industry's first OpenCL implementation that lets developers take code written against this GPU-oriented API (normally run only on graphics hardware) and run it on any x86 CPU. Now, as a developer, you can divide the workload between the two as you see fit instead of having to commit to either GPU or CPU. Ars has more details."
  • Re:Nice (Score:5, Informative)

    by clarkn0va ( 807617 ) <apt.getNO@SPAMgmail.com> on Thursday August 06, 2009 @01:56PM (#28975703) Homepage
    I suppose I could have been clearer. I'm talking about GPU decoding of HD video, conspicuously absent on AMD hardware in Linux, fully functional on NVIDIA. [slashdot.org]
  • by ByOhTek ( 1181381 ) on Thursday August 06, 2009 @01:59PM (#28975765) Journal

    So, you store the data the GPU is working on in the card's memory, and the data the CPU is working on in system memory.

    Yes, it is relatively slow to move data between the two, but not so slow that the one-time latency incurred will eliminate the benefits.
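
    In OpenCL terms, that placement choice maps roughly onto buffer creation flags. A hedged sketch (an illustration, not the poster's code; error handling omitted):

        #include <CL/cl.h>

        /* Device-resident buffer: the host data is copied into the
           card's memory once, up front. */
        cl_mem make_device_buffer(cl_context ctx, size_t bytes, void *host_data)
        {
            return clCreateBuffer(ctx, CL_MEM_READ_WRITE | CL_MEM_COPY_HOST_PTR,
                                  bytes, host_data, NULL);
        }

        /* Buffer backed by the host allocation itself (system memory);
           a natural fit for work that stays on a CPU device. */
        cl_mem make_host_buffer(cl_context ctx, size_t bytes, void *host_data)
        {
            return clCreateBuffer(ctx, CL_MEM_READ_WRITE | CL_MEM_USE_HOST_PTR,
                                  bytes, host_data, NULL);
        }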

  • Overhyped (Score:5, Informative)

    by TheRaven64 ( 641858 ) on Thursday August 06, 2009 @02:00PM (#28975777) Journal
    Compiling OpenCL code as x86 is potentially interesting. There are two ways that make sense. One is as a front-end to your existing compiler toolchain (e.g. GCC or LLVM) so that you can write parts of your code in OpenCL and have them compiled to SSE (or whatever) code and inlined in the calling code on platforms without a programmable GPU. With this approach, you'd include both the OpenCL bytecode (which is JIT-compiled to the GPU's native instruction set by the driver) and the native binary and load the CPU-based version if OpenCL is not available. The other is in the driver stack, where something like Gallium (which has an OpenCL state tracker under development) will fall back to compiling to native CPU code if the GPU can't support the OpenCL program directly.
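
    An alternative that a CPU-capable OpenCL implementation (the subject of the article) makes possible: instead of shipping a separate native binary, just fall back to a CPU device at runtime. A minimal sketch, with error handling trimmed:

        #include <CL/cl.h>

        /* Prefer a GPU device; take a CPU device if no GPU is present. */
        cl_device_id pick_device(cl_platform_id plat)
        {
            cl_device_id dev;
            if (clGetDeviceIDs(plat, CL_DEVICE_TYPE_GPU, 1, &dev, NULL) != CL_SUCCESS)
                clGetDeviceIDs(plat, CL_DEVICE_TYPE_CPU, 1, &dev, NULL);
            return dev;
        }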

    Having a separate compiler that doesn't integrate cleanly with the rest of your toolchain (i.e. one that uses a different intermediate representation, preventing cross-module optimisations between C code and OpenCL) and doesn't integrate with the driver stack is very boring.

    Oh, and the press release appears to be a lie:

    AMD is the first to deliver a beta release of an OpenCL software development platform for x86-based CPUs

    Somewhat surprising, given that OS X 10.6 betas had included an OpenCL SDK for x86 CPUs for several months before the date of the press release. Possibly they meant public beta.

  • by TejWC ( 758299 ) on Thursday August 06, 2009 @02:01PM (#28975793)

    OK, I'll feed the troll (this time).

    Anyway, Apple was one of the companies that originally came up with OpenCL, and it worked with Khronos to make it a full standard. AMD is one of the first to publicly release a full implementation, which is why this is big news.

  • by Anonymous Coward on Thursday August 06, 2009 @02:10PM (#28975921)

    nVidia has had a full implementation of OpenCL out for months now.

  • What's the story? (Score:3, Informative)

    by trigeek ( 662294 ) on Thursday August 06, 2009 @02:29PM (#28976189)
    The OpenCL spec already allows for running code on a CPU or a GPU; a CPU is just registered as a different type of device. So basically, they are enabling compilation of the OpenCL language to x86? I don't really see the story here.
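
    To illustrate the point (a sketch, not the poster's code): a CPU implementation simply shows up as one more device when you enumerate the platform.

        #include <CL/cl.h>
        #include <stdio.h>

        int main(void)
        {
            cl_platform_id plat;
            cl_device_id dev[16];
            cl_uint n;

            clGetPlatformIDs(1, &plat, NULL);
            clGetDeviceIDs(plat, CL_DEVICE_TYPE_ALL, 16, dev, &n);
            if (n > 16)
                n = 16;

            for (cl_uint i = 0; i < n; i++) {
                char name[256];
                cl_device_type type;
                clGetDeviceInfo(dev[i], CL_DEVICE_NAME, sizeof name, name, NULL);
                clGetDeviceInfo(dev[i], CL_DEVICE_TYPE, sizeof type, &type, NULL);
                printf("%s: %s\n", name,
                       (type & CL_DEVICE_TYPE_GPU) ? "GPU" :
                       (type & CL_DEVICE_TYPE_CPU) ? "CPU" : "other");
            }
            return 0;
        }
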
  • Re:Nice (Score:5, Informative)

    by MostAwesomeDude ( 980382 ) on Thursday August 06, 2009 @02:37PM (#28976311) Homepage

    AMD/ATI only offers GPU-accelerated decoding and presentation through the XvBA API, which is only available to their enterprise and embedded customers. People seem to always forget that fglrx is for enterprise (FireGL) people first.

    Wait for the officially supported open-source radeon drivers to get support for GPU-accelerated decoding, or (God forbid!) contribute some code. In particular, if somebody would write a VDPAU frontend for Gallium3D...

  • by ShadowRangerRIT ( 1301549 ) on Thursday August 06, 2009 @03:00PM (#28976775)

    There are only two ways to do that:

    1. Some of the cores are specialized in the same way that current GPUs are: you may lose some performance due to memory bottlenecks, but you'll still have the specialized circuitry for doing quick vectorized floating-point math.
    2. You throw out the current graphics model used in 99% of 3D applications, replacing it with ray tracing, and lose 90% of your performance in exchange for mostly unnoticeable improvements in the quality of the generated graphics.

    Of course, you're reading this the wrong way. You think they are trying to replace GPUs with CPUs. They're really just trying to deal with the fact that some systems lack GPUs, and many systems with GPUs will have underutilized CPUs. GPGPU applications are using the specialized GPU hardware for a reason; falling back to the CPU is for improved compatibility with low-end systems and full hardware utilization on high-end ones. It's not intended to get rid of the GPU (defined as any chip specializing in minimal branching and high-throughput, vectorized floating-point math).

    Take a look at Folding@Home sometime. They have a CPU and a GPU client, both trying to solve protein folding problems. The CPU, being good at integer math, looks at the problem as a discrete particle simulation. The GPU, being good at bursts of floating-point math, models the system in a continuous way (see their site for a complete explanation). While the GPU results have a small margin of error (due to FP rounding), they're still among the best clients from the perspective of advancing the field, because on similar-value hardware (say, a recent Core 2 Duo vs. an 8800GTX) they solve similar problems 5-10x faster. If they could run the GPU-specific code on a CPU it wouldn't do them any good; since the CPU is bad at that type of problem, they'd end up doing worse than running the correct client on the CPU. The CPU clients can double-check the GPU results if needed, but the GPU is by far the fastest at sorting plausible from implausible results.

  • by Anonymous Coward on Thursday August 06, 2009 @03:13PM (#28977017)

    From what I know of the spec, you would just create your kernel, feed it data, and execute it; the implementation will worry about sharing the work between the CPU and GPU to get optimal performance.

    No. Any individual OpenCL kernel runs solely on one device (be it CPU, GPU, or otherwise). If you want to run a kernel on multiple devices you must manually divide the work into multiple kernels and set up an OpenCL context on each device you wish to use.
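
    A sketch of what that manual division looks like, assuming a platform (such as AMD's) that exposes both a GPU and a CPU device; the kernel, the 3:1 split, and the per-device buffers are illustrative choices, and error checking is omitted:

        #include <CL/cl.h>

        static const char *src =
            "__kernel void scale(__global float *d, float f) {\n"
            "    d[get_global_id(0)] *= f;\n"
            "}\n";

        int main(void)
        {
            cl_platform_id plat;
            cl_device_id dev[2];
            clGetPlatformIDs(1, &plat, NULL);
            clGetDeviceIDs(plat, CL_DEVICE_TYPE_GPU, 1, &dev[0], NULL);
            clGetDeviceIDs(plat, CL_DEVICE_TYPE_CPU, 1, &dev[1], NULL);

            cl_context ctx = clCreateContext(NULL, 2, dev, NULL, NULL, NULL);
            cl_program prog = clCreateProgramWithSource(ctx, 1, &src, NULL, NULL);
            clBuildProgram(prog, 2, dev, NULL, NULL, NULL);

            /* Hand the GPU three quarters of the elements, the CPU the rest. */
            size_t n = 1 << 20;
            size_t count[2] = { n * 3 / 4, n - n * 3 / 4 };

            cl_command_queue q[2];
            cl_kernel k[2];   /* one kernel object per queue, so each
                                 keeps its own argument bindings */
            cl_mem buf[2];
            float f = 2.0f;
            for (int i = 0; i < 2; i++) {
                q[i] = clCreateCommandQueue(ctx, dev[i], 0, NULL);
                k[i] = clCreateKernel(prog, "scale", NULL);
                buf[i] = clCreateBuffer(ctx, CL_MEM_READ_WRITE,
                                        count[i] * sizeof(float), NULL, NULL);
                /* ...clEnqueueWriteBuffer() this device's slice here... */
                clSetKernelArg(k[i], 0, sizeof(cl_mem), &buf[i]);
                clSetKernelArg(k[i], 1, sizeof(float), &f);
                clEnqueueNDRangeKernel(q[i], k[i], 1, NULL, &count[i], NULL,
                                       0, NULL, NULL);
            }
            for (int i = 0; i < 2; i++)
                clFinish(q[i]);   /* both devices chew on their halves */
            return 0;
        }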

  • by kramulous ( 977841 ) on Thursday August 06, 2009 @07:42PM (#28980597)

    Not at all absurd. I realise that the GPU is a compute workhorse. That's not the issue. It is the data transfer rate to and from the card. Transferring 3 GiB takes quite a while. Pulling the results back takes a while, too. That's what kills it. The CPU can get the work done in that time.

    I'm using the CUDA BLAS routines, examples from the SDK, and those published as 'glorious almighty' codes. Everything the card does is timed, since what matters is total time to solution.
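
    The poster is timing CUDA, but the same "count the transfers too" measurement is easy to sketch in OpenCL with event profiling (an illustration, not the poster's code; the kernel's argument layout is an assumption):

        #include <CL/cl.h>
        #include <stdio.h>

        static double ms(cl_event ev)
        {
            cl_ulong t0, t1;
            clGetEventProfilingInfo(ev, CL_PROFILING_COMMAND_START,
                                    sizeof t0, &t0, NULL);
            clGetEventProfilingInfo(ev, CL_PROFILING_COMMAND_END,
                                    sizeof t1, &t1, NULL);
            return (t1 - t0) * 1e-6;   /* ns -> ms */
        }

        /* Time upload, kernel, and download separately. The queue must
           have been created with CL_QUEUE_PROFILING_ENABLE. */
        void report_time_to_solution(cl_command_queue q, cl_kernel k,
                                     cl_mem buf, void *host,
                                     size_t bytes, size_t n)
        {
            cl_event up, run, down;
            clEnqueueWriteBuffer(q, buf, CL_FALSE, 0, bytes, host,
                                 0, NULL, &up);
            clSetKernelArg(k, 0, sizeof(cl_mem), &buf);
            clEnqueueNDRangeKernel(q, k, 1, NULL, &n, NULL, 0, NULL, &run);
            /* Blocking read: everything has finished when it returns. */
            clEnqueueReadBuffer(q, buf, CL_TRUE, 0, bytes, host,
                                0, NULL, &down);
            printf("upload %.1f ms, kernel %.1f ms, download %.1f ms\n",
                   ms(up), ms(run), ms(down));
        }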

  • Re:Overhyped (Score:3, Informative)

    by tyrione ( 134248 ) on Thursday August 06, 2009 @11:57PM (#28982333) Homepage

    AMD is the first to deliver a beta release of an OpenCL software cross development platform for x86-based CPUs

    Source: http://developer.amd.com/GPU/ATISTREAMSDKBETAPROGRAM/Pages/default.aspx [amd.com]

    Being able to target both Windows and Linux is something outside Apple's platform scope.
