Quick and Dirty Penryn Benchmarks (90 comments)

Posted by kdawson on Saturday August 25 2007, @07:04AM
from the they-don't-remember-quick-they-just-remember-dirty dept.
intel
technology
An anonymous reader writes "So Intel has their quad-core Penryn processors all set and ready to launch in November. There are benchmarks for the dual-core Wolfdale all over the place, but this seems to be the first article to put the quad-core Yorkfield to the test. It looks like the Yorkfield is only about 7-8% faster than the Kentsfield with similar clock speeds and front-side bus."

This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • My recent experience with quad-CPU Xeon machines is that multithread performance for a single app is VERY poor, even with great care in coding, presumably because of cache-sloshing between these physically-separate CPUs dropped onto one die.

    (I compare with Niagara and even Core Duo which seem much better for threaded apps.)

    Has anyone else tested threadability of these CPUs, and power efficiency, sleep states, etc?

    Rgds

    Damon
    • They could probably make better use of the die space of the 4th, 3rd, or even 2nd CPU core by putting things like cache there instead. And in another direction, go with SoC (system on a chip) or certain subsets thereof. Combined with serialized bus technologies, this should work while also reducing pin counts.

      • Well, what's nice about my Niagara T1000 box is that everything is on one chip, and the outermost level of cache serves all CPUs, so even a nominal cache flush for volatile/synchronized never need leave the chip and hit real RAM.

        I'm just concerned that threading seems poor when you really do have to go to memory to get data between CPUs, and your idea of giving up some individual cache for some shared cache would be quite right if Intel had the engineering time to do it.

        For my latest nasty performance surpri
      • They could probably make better use of the die space of the 4th, 3rd, or even 2nd CPU core by putting things like cache there instead.

        The benefits of having extra cache drop off very quickly above certain cache sizes (depending on the addressable RAM the cache is indexing). A lot more is involved with improving level-0/1/2 cache performance than just upping the cache size.

        I'd expect greater benefits from moving dedicated (but programmable) VLIW units into the CPU to increase instruction-level parallelism, f
      • They could probably make better use of the die space of the 4th, 3rd, or even 2nd CPU core by putting things like cache there instead.

        Except you won't pay the price.

        They can charge you more for a 4 core CPU with shit amounts of cache than they can with a dual core with shed loads. People are stupid. They assume more megahurts means more fast and more cores means more fast... Whether the additional cores are actually doing anything at all.

        Business CPUs it's a different matter, they actually benchmark their apps and yup, buy CPUs with loads of cache when they're faster.

        And really, the best thing they could do is add an FPGA.

      • For multi-threading apps, instead of multiple cores (nothing except caches is shared between complete CPU cores), it makes a lot of sense to have an HT-like architecture (multiple context stores, shared elements) that reduces the time it takes to do a context-switch. It would also help a lot to have a context-aware cache system where a swapped-in context would not wake up having to read every instruction from main memory.

        Since not all threads will be runnable at any given time, having more cores instead of
    • Re: (Score:2, Informative)

      Intel's Core Microarchitecture is not currently available in a quad-CPU platform. It is understandable the multithreaded performance would be poor, then.

      The current quad-CPU architecture is based on Tulsa, which is a 65nm shrink of Paxville, which is essentially a Pentium 4 Smithfield, or two Prescotts shoved onto one chip. Basically, it's technology from two years ago. The new Tigerton chip will be Core-based; however, it's not out yet.
      • How did this get modded informative?

        Intel's Core Microarchitecture is not currently available in a quad-CPU platform.

        Incorrect. Intel's "Core Microarchitecture" is marketed under the name "Core 2." The "Core 2 Quad" processors use the Core Microarchitecture. See Intel's product brief [intel.com] on the subject.

        It is understandable the multithreaded performance would be poor, then.

        The single threaded performance of quad core is similar to the single threaded performance of dual core, clock for clock. This should have ti
    • Your experience isn't shared by me, or by most other benchmarkers. Take a look at multi-threaded SPEC benchmarks for the Xeon 5300 series. SPECint_rate2006, SPECjbb2005, etc., all show the Xeon 5300 as the clear per-socket performance leader for x86 systems. The quad-core Xeons are only bested by the IBM POWER6, and by Niagara in the Java benchmarks.

      See the SPECint_rate 2006 [spec.org] results page, and filter on two-chip systems.

      Perhaps your particular application is a degenerate case for the 5300s cache architectu

      • The SPEC benchmarks are _almost_ perfectly parallelizable. They are just multiple instances of a single-threaded benchmark, and as such don't really test all the things that arise in true multi-threaded programs (cache line bouncing, etc).
        • Take a look at SPECjbb2005 or TPC-C, which resemble "real" applications a lot more than SPECint_rate. The Quad-core Xeons are 70-100% faster than the fastest dual-core Opteron systems.

          As much as I wish it weren't so, AMD has been toasted in the two-socket server space, which is the largest part of the server market. Barcelona probably won't change that, as Penryn will arrive at the same time.

  • "Intel expects SSE4 optimizations to deliver performance improvements in video authoring, imaging, graphics, video search, off-chip accelerators, gaming and physics applications. Early benchmarks with an SSE4 optimized version of DivX 6.6 Alpha yielded a 116 percent performance improvement due to SSE4 optimizations." Not bad...
      • Who makes an SSE version of a function w/o a regular x86 version to fall back on when SSE isn't available?

  • Seriously (partly, at least): How many penguins will I see during the boot-up? 4?
  • by Dachannien (617929) on Saturday August 25 2007, @08:33AM (#20353301)
    Penryn? Wolfdale? Yorkfield? Kentsfield? What are they doing here, making processors, or naming streets in a new upscale subdivision?

  • AMD rose to this position primarily because they didn't make Intel's mistakes - trying to force a new CPU architecture on the market (Itanium) instead of incrementally developing the X86 line, and focusing on clock-speed (P4) at the expense of performance per watt. Now that Intel is focused on performance per watt, AMD needs to find a new differentiator for their chips.

    Perhaps they should start thinking about how to integrate a high quality Vista-capable GPU into their processors? (After all, they acquired ATI.) How about sound cards, USB ports, et cetera? If they can fit 90% of a typical motherboard into the processor and usher in a new era of affordable and efficient computers while Intel is busy playing with 64-core chips, why not?
    • They are doing exactly that.

      AMD is going the route of a true native quad core with Barcelona, coming out in September. They have the desktop version of that, Phenom, coming out closer to Christmas. Intel is taking the quick and dirty route to quad core - smash two dual core CPUs onto the same die. AMD is actually doing a proper quad core architecture.

      They have in their roadmap a GPGPU (general purpose graphics processing unit) for late 2008 or early 2009. I'm personally still trying to understand what t
      • Intel is taking the quick and dirty route to quad core - smash two dual core CPUs onto the same die. AMD is actually doing a proper quad core architecture.

        A 'smashed' Xeon runs much better than an AMD CPU that I can't buy. If I said AMD sucked because they took the 'quick and dirty' route with the K10's shared L3 victim cache, limited memory prefetching, and limited incomplete subset of SSE4 you'd probably just say those are buzzwords.
      • Intel is taking the quick and dirty route to quad core - smash two dual core CPUs onto the same die. AMD is actually doing a proper quad core architecture.

        Do you think that the fact that the Intel method is cheaper due to higher yield is irrelevant? With a single-die quadcore, the entire processor needs to be discarded if just one core is broken. With dual-die quadcores, you only need to discard one half of the processor. This increases yield and lowers costs, and I cannot see what is so bad about that. Performance isn't everything, and it isn't like it suffers greatly from the dual-die design. I'd guess that it suffers more from the shared FSB design.
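The yield argument in the comment above can be made concrete with a rough sketch. It uses the textbook Poisson yield model (fraction of defect-free dies = exp(-defect density × die area)) and assumes that a die with any defect is discarded whole. The defect density and die areas are purely hypothetical numbers picked for illustration, not Intel or AMD figures.

```python
import math

def die_yield(area_cm2, defects_per_cm2):
    """Poisson yield model: probability that a die of this area has no defects."""
    return math.exp(-defects_per_cm2 * area_cm2)

# Hypothetical numbers chosen only for illustration.
DEFECTS_PER_CM2 = 0.5
DUAL_CORE_AREA_CM2 = 1.0
QUAD_CORE_AREA_CM2 = 2 * DUAL_CORE_AREA_CM2

def good_quads(total_silicon_cm2):
    """Return (monolithic, dual_die) counts of sellable quad-core parts."""
    # Monolithic: one defect anywhere kills all four cores.
    mono_dies = total_silicon_cm2 / QUAD_CORE_AREA_CM2
    monolithic = mono_dies * die_yield(QUAD_CORE_AREA_CM2, DEFECTS_PER_CM2)
    # Dual-die: test the dual-core dies first, then pair up the good ones.
    dual_dies = total_silicon_cm2 / DUAL_CORE_AREA_CM2
    good_dual = dual_dies * die_yield(DUAL_CORE_AREA_CM2, DEFECTS_PER_CM2)
    dual_die = good_dual / 2
    return monolithic, dual_die

monolithic, dual_die = good_quads(400.0)
print(f"monolithic quads: {monolithic:.1f}, dual-die quads: {dual_die:.1f}")
```

Under this model the advantage of the dual-die approach grows with defect density and die area. It says nothing about the performance cost of the shared FSB, which the comment treats as a separate question.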

          • I don't think having two dual-cores in a package instead of four cores combined is necessarily a disadvantage. To compare these properly, you would have to assume same quality of implementation. So Intel could have gone for one unified 12MB L2 cache with four access paths instead of two 6MB L2 caches with two access paths each. With same quality of implementation, the four access paths will be slower because you have to cope with four processors accessing it at the same time instead of two. So each access w
    • AMD still seems to be doing good design but their fabbing lags Intel by a year. I think it's Intel's fab technology that carried them through despite their other technology misdirections. I hope that the results of the ATI merger become a long term positive, it seems to be holding them down in the short term. Betting on on-die GPU is quite a serious bet, quite a bit more serious than an on-die memory controller in my opinion, especially when they go into major debt to acquire another large company just t
  • Although Yorkfield uses a 45nm fab process and consumes less power, Intel plans to stick to its existing 95 Watt and 130 Watt thermal design power ratings.
    I don't get it, does it use less power or not? Or does this mean it uses less power per cycle, thus allowing them to ramp up the clock until it's back up to 130 watts?
    • Or does this mean it uses less power per cycle, thus allowing them to ramp up the clock until it's back up to 130 watts?

      Yes, they are increasing the clock to maintain the same TDP.
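A back-of-the-envelope sketch of that trade-off, assuming the first-order model in which power scales linearly with clock frequency at a fixed voltage. Real chips usually need more voltage at higher clocks and also have leakage power, so this overstates the headroom. The 3.0 GHz starting clock and the 20% power saving are hypothetical placeholders, not figures from the article; only the 130 W TDP comes from the quoted text.

```python
def clock_at_tdp(base_clock_ghz, power_at_base_w, tdp_w):
    """Clock that would fill the TDP if power scaled linearly with frequency."""
    return base_clock_ghz * tdp_w / power_at_base_w

TDP_W = 130.0                # TDP rating mentioned in the quoted article
OLD_CLOCK_GHZ = 3.0          # hypothetical clock of the older part at full TDP
SHRINK_POWER_SAVING = 0.20   # hypothetical: 20% less power at the same clock

# Power the shrunk chip would draw if it kept the old clock.
power_after_shrink = TDP_W * (1 - SHRINK_POWER_SAVING)

# Clock the shrunk chip could reach before hitting the same TDP again.
new_clock = clock_at_tdp(OLD_CLOCK_GHZ, power_after_shrink, TDP_W)
print(f"{new_clock:.2f} GHz")
```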
  • by Terje Mathisen (128806) on Saturday August 25 2007, @02:12PM (#20355363)
    When decoding "full HD" h264, i.e. 40 Mbit/s BluRay or 30 Mbit/s HD-DVD, with 1080p resolution, current CPUs start to thrash the L2 cache:

    Each 1080p frame consists of approximately 2 M pixels, which means that the luminance info will need 2 MB, right?

    Since the normal way to encode most of the frames is to have two source frames and one target, motion compensation (which can access any 4x4, 8x8 or 16x16 sub-block from either or both of the source frames) will need to have up to 2+2+2=6MB as the working set.

    Terje
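The arithmetic in the comment above can be checked with a short sketch. It follows the comment's own simplification: 8-bit luminance only, chroma ignored, two source frames plus one target. The cache sizes compared are example values: 4 MB is the per-die L2 of the current Kentsfield, while 6 MB and 12 MB are the figures mentioned elsewhere in this thread.

```python
WIDTH, HEIGHT = 1920, 1080    # 1080p frame
BYTES_PER_LUMA_SAMPLE = 1     # 8-bit luminance; chroma ignored, as in the comment

luma_bytes = WIDTH * HEIGHT * BYTES_PER_LUMA_SAMPLE
working_set_bytes = 3 * luma_bytes   # two source frames + one target frame

MIB = 1024 * 1024

def fits(cache_mib):
    """True if the luminance-only working set fits in a cache of this size."""
    return working_set_bytes <= cache_mib * MIB

print(f"one luma plane: {luma_bytes / MIB:.2f} MiB")
print(f"working set:    {working_set_bytes / MIB:.2f} MiB")
for cache_mib in (4, 6, 12):
    print(f"{cache_mib:>2} MiB cache: {'fits' if fits(cache_mib) else 'does not fit'}")
```

Even where the luminance planes nominally fit, chroma data, code, and everything else the decoder touches compete for the same cache, so a nominal fit should not be read as "no thrashing".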
    • by eebra82 (907996) on Saturday August 25 2007, @08:03AM (#20353151) Homepage
      I would think that AMD would be providing Barcelona benchmarks hand over fist, at this point, if they had something...

      There are two possible situations here:

      a) Barcelona is faster than Intel's current line-up and does not want to see Intel up the pace more by releasing such numbers.
      b) Barcelona is slower than Intel's current line-up, and AMD does not want its shares to hit a new low, or perhaps wants to buy some time to speed it up.
      • by CajunArson (465943) on Saturday August 25 2007, @09:43AM (#20353625) Journal
        Barcelona is faster than Intel's current line-up and does not want to see Intel up the pace more by releasing such numbers.

        That may have been true 6 months ago, but the K10 is supposed to be officially announced in about 16 days on September 10 (since AMD claims not to do paper launches it is supposed to be widely available then too... ymmv). AMD is not going to be able to stop benchmarks after it is released, and while Intel can adapt quickly, it can't turn on a dime in two weeks' time. AMD has not been doing well in the PR and benchmarking battles since Core 2 came out; if K10 really was that amazing you would be seeing all the usual suspects putting out full reviews right now in order to generate hype. I'm leaning towards your second theory, and most analysts are too.
    • Re: (Score:3, Informative)

      Depends on what you do "at home". Grandma who only sends email and orders flowers will see zero benefits.

      But the rest of "normal" home users who own things like camcorders, make DVDs, rip movies, etc all see a huge benefit. I just put together a Q6600 system and couldn't be happier, but I've been a dual CPU workstation user since the PII days.
      • I was debating between the Q6600 (2.4GHz, 4 cores) and the E6700 (2.67GHz, 2 cores), and I chose the second option because more cores offer only a limited advantage, while a higher clock speed always helps.
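The trade-off in this comment can be framed with Amdahl's law. The sketch below is an idealized model, not a benchmark: it assumes identical per-clock performance on both chips, perfect scaling of the parallel part, and no cache or front-side-bus contention. The clock speeds and core counts are the ones quoted in the comment; the function names are made up for illustration.

```python
def relative_speed(clock_ghz, cores, parallel_fraction):
    """Amdahl's law: throughput relative to a 1 GHz single core."""
    serial_fraction = 1.0 - parallel_fraction
    return clock_ghz / (serial_fraction + parallel_fraction / cores)

def break_even_fraction(clock_a, cores_a, clock_b, cores_b):
    """Parallel fraction at which chip A and chip B run equally fast.

    Obtained by setting relative_speed(A) == relative_speed(B) and solving for p.
    """
    k_a = 1.0 - 1.0 / cores_a
    k_b = 1.0 - 1.0 / cores_b
    return (clock_b - clock_a) / (clock_b * k_a - clock_a * k_b)

Q6600 = (2.4, 4)    # (GHz, cores) as quoted in the comment
E6700 = (2.67, 2)

p = break_even_fraction(*Q6600, *E6700)
print(f"Q6600 wins once more than {p:.0%} of the work runs in parallel")
```

Below the break-even fraction the higher-clocked dual core is ahead in this model; above it the quad core is. That matches the split in the replies between lightly threaded use and media encoding.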
        • I read a comparison/benchmark someplace (Ars? Who can remember..) that showed the E6700 only a touch better at a narrow range of applications and getting its hat handed to it on media encoding applications, so I went with the Q6600 since that accounts for my "heavy" computing.

          I see MPEG-2 renders running better than real time on single pass encodes in TMPGEnc.
        • I am running a Q6600 on an eVGA 680i motherboard, should have just gone with the Q6600, I have this sucker clocked at 3.3GHz, am pushing for 3.4 but I can't get it stable yet.
    • "And of course.. 4 core CPU has no use at homes unless you are content creator. I'm software engineer, I don't think that any of my colleagues I work with knows how to write app that will take advantage of 2 cores; let alone 4.

      Conclusion? 4 cores right now need much software support."


      Well, you're talking about cutting-edge CPUs, which typically co-exist with cutting-edge software. If you're getting a quad core setup, it's probably because you're going beyond word processing.

      Of course quad cores wil
      • 'it's hardly a bad thing to prepare for the future when you're purchasing a computer.'

        Yes it is. It costs you extra money, and hardware comes down in price quickly. If you buy a high-end CPU now that you won't use for another year, you're wasting your money. In a year's time your high-end CPU will be mid-range and a lot cheaper, so it'd probably be cheaper to buy a mid-range or budget CPU now, then another one in a year's time. Then you can get a bit of money back for the old one on eBay too.
        • You're turning a discussion about buying high-end hardware into "best bang for the buck". Where in my quoted statement can you see me saying anything about buying the top of the line hardware and that it is the best choice?
    • Or if you use Linux... Because that support has been standard for quite some time now. They even rotate which CPU gets priority so that heat and usage gets distributed evenly.

      Yay for processes that make sense!
    • by Sycraft-fu (314770) on Saturday August 25 2007, @10:09AM (#20353765)
      Intel tends to do a release of a new architecture, then some refinements on that. While it would be cool to do a whole new architecture each time around, there's just not really money for that. This is one of the refinements. The chips are not likely to be all that much faster than their previous chips at the same clock speed because they are largely the same architecture. Mostly they are just a die shrink (which means lower power and probably better scaling and cost) and some new instructions that aren't really used yet. They are still Core 2s.

      However that doesn't mean that the next generation will be the same. Indeed, if Intel keeps with their plans it will be a new architecture and thus hopefully bring new speed increases.

      As to using multiple cores, well if you don't know how, perhaps you'd best learn then? You not knowing how doesn't mean it can't be done, indeed it can be done and IS being done. Multi-core is just the way things are going, at least for now. Not only are desktops and servers headed that way, but even things like the Xbox 360 and PS3 are as well. It's simply time to start thinking about software in a different way. No longer is a big while loop the way to go.

      Already that's happening. The number of games (and games are interesting to watch since they often ride the leading edge in terms of requirements) that make use of two cores has risen dramatically. We are also seeing a couple of games, with more on the horizon, that will support 4 cores. Things like AI and physics get executed in parallel, which makes it possible for them to be much more complex.

      Finally, there HAVE been some cool developments on processors, just not ones that most hardware sites like to cover. Some time back Intel introduced a technology they call VT, which is basically instructions to allow you to virtualize the protection rings on a processor. Supposed to make for faster VMs. Currently the implementation is somewhat lacking, VMware claims it is slower than a well optimised software solution, though others dispute that claim (Xen likes VT). The new 45nm Core 2s add to the existing VT technology with what Intel calls VT-d. Basically the idea is to allow VM software to pass DMA access to their guests, but in a safe manner that can't hurt the host. This may not be exciting to everyone, but these advances are worthwhile, given that virtual computing is getting more and more use.

      Processors may not be getting huge gains in single thread performance any more, but that doesn't mean they aren't advancing.
    • NEVER underestimate the huge number of virus / trojan / spyware and pop-up generating crapware that are running in parallel on average joe's computer.

      Just think about the number of users who come into stores to buy "faster computers because the old one is getting too slow" when the old computer is crawling under an impressive amount of crapware.
      They are the perfect target for those new multi-core processors :
      - 1 core for running the OS, Internet Explorer and Microsoft Word.
      - All other core for running SPAM-
      • Sadly this is exactly the case.
      • Also never underestimate the huge number of anti- virus / trojan / spyware and pop-up crapware that are running in parallel on average joe's computer. My folks still use AOL. ("Security Edition".) Their computer is basically locked up whenever one of the several types of scans or automatic check for updates auto-launches. They need 3 cores for all those kinds of horribly-written craplets, and 1 to play Minesweeper.
    • "Conclusion? 4 cores right now need much software support."

      It goes beyond just that IMHO, right now the PC industry needs to get its act together as a PLATFORM. And also for applications that don't break. One of the big things that is pissing me off right now is closed-source programs whose compatibility breaks and because it's closed source no one can fix it/update it, etc to get it running when OS's and other technologies change. I think there really needs to be a legal framework for people (end user
      • I think there really needs to be a legal framework for people (end users) who own software

        You don't own software, you license it. Unless you contracted a company to write something for you and you explicitly retained the rights (and the source).

        Next, growth IMHO for certain industries like the game industry is being held back by not subsidizing the cost of some kind of mid-range performance standard graphics *for everyone*.

        You can get a DX10 graphics card for US$100 [newegg.com]. Or are you still using an AGP motherboard?

        I find it ironic that companies like Nintendo, Sony, and MS can subsidize their consoles, but when it comes to the PC, MS just sits there.

        MS doesn't make PCs.

        I think one of the big reasons PC gaming is flagging was in large part due to the incessant march of the graphics card industry.

        Are you suggesting that game companies can't handle the increased power of new graphics cards?

        Starcraft and Diablo 1 & 2 were both 2D games, it makes sense that these games got as widespread as they did because they'd run everywhere.

        So, basically you just want a line of Cheap Bastard(TM) games? Why not just haunt the used games stores? I'm sure you can find something there that

    • by mi (197448) <mi+slashdot@aldan.algebra.com> on Saturday August 25 2007, @11:50AM (#20354413) Homepage

      4 core CPU has no use at homes unless you are content creator. I'm software engineer, I don't think that any of my colleagues I work with knows how to write app that will take advantage of 2 cores; let alone 4.

      Well, fortunately, some of this software has already been written just for you and your colleagues. Check out make(1) manual page — look for the -j option...

      And no, it is not only for software engineering either. Every time I come back from vacation, I use make [algebra.com] to convert my digital pictures from the lossless "raw" format of the camera to the lower resolution JPEG for the web-pages. Having four CPUs makes that process four times faster. Great idea, uhm?..
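What `make -j` does for a batch like this can be sketched in a few lines: start one independent conversion job per file and keep as many running as there are CPUs. The sketch below is only an analogue, not the poster's makefile. The `convert` function is a hypothetical stand-in that launches a do-nothing child process where a real raw-to-JPEG converter command would go, and the file names are made up.

```python
import os
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

def convert(raw_name):
    """Hypothetical stand-in for one conversion job (one make target)."""
    jpeg_name = os.path.splitext(raw_name)[0] + ".jpg"
    # Placeholder child process; a real job would run the converter command here.
    subprocess.run([sys.executable, "-c", "pass"], check=True)
    return jpeg_name

def convert_all(raw_names, jobs=None):
    """Run up to `jobs` conversions at once, like `make -j jobs`."""
    jobs = jobs or os.cpu_count() or 1
    # Threads are enough here: each one just waits on its own child process,
    # so the operating system can schedule the children on separate cores.
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(convert, raw_names))

if __name__ == "__main__":
    print(convert_all(["img_0001.raw", "img_0002.raw", "img_0003.raw"], jobs=4))
```

The near-linear speedup described in the comment depends on the jobs being independent and CPU-bound; if the disk is the bottleneck, extra parallel jobs help much less.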

      Your colleagues may be doofusen, but the people who will finally bring us reliable speech-generation and parsing (as an example) will certainly be smart enough to take full advantage of the multiple processors.

      Meanwhile, you can schedule a meeting to discuss using OpenMP [openmp.org] in your company's software... Compilers (including Visual Studio's and gcc) have been supporting this standard for some years now.
