Dual Cores Taken for a Spin in Multitasking

Posted by CowboyNeal on Fri Apr 22, 2005 04:53 AM
from the burn-up-testing dept.
Vigile writes "While dual cores are just now starting to hit the scene from processor vendors, PC Perspective has taken the first offering from Intel, the Extreme Edition 840, through the paces in single- and multi-tasking environments. It seems that those two cores can make quite a difference if you have as many applications open and working as the author does in the test." It's worth noting that each scenario consists of only desktop applications, and it'd still be interesting to see some common server benchmarks, such as a database or web server.
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • by jawtheshark (198669) * <slashdot.jawtheshark@com> on Friday April 22 2005, @04:55AM (#12311410) Homepage Journal
    SMP system performs better when applications are multithreaded...

    (Dual core is the same as an SMP system, except the cores can communicate a bit faster with each other)

    • yeah it is a bit dull to anyone who has been following dual core or even just SMP.

      Cue loads of stupid people saying this technology is crap because most apps aren't multithreaded and the clockspeeds are lower than their current CPU.
      • The fact is, when you have only one problem to solve, a single fast CPU is always better than an equivalent count of slower CPUs. One 1GHz CPU is better than 2 512MHz CPUs of the same design, for solving a single problem.

        It's another fact that it's easiest for humans to analyze only one problem at a time. So the most straightforward way to handle any computing task is to crunch at it linearly from beginning to end.

        Unfortunately, gains in raw CPU speed have always come slower than the demand for number-cru
        • when you have only one problem to solve

          That highly depends on the problem. If your problem is highly parallelizable and the application that resolves your problem has been written (correctly) in a multithreaded way, then two CPUs will perform better. (As you say, it doesn't scale in a linear way)

          Of course, you might just say that a parallelizable problem is not one problem, but many small problems that need to be solved separately ;-)
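
One way to put rough numbers on the "doesn't scale in a linear way" remark is Amdahl's law. None of the posters cite it; it is a standard model brought in here purely as an illustration, and the parallel fractions below are made-up values, not measurements from the article.

```python
def amdahl_speedup(parallel_fraction: float, cpus: int) -> float:
    """Ideal speedup from Amdahl's law: 1 / ((1 - p) + p / n)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cpus)


if __name__ == "__main__":
    # Hypothetical workloads: purely serial, half parallel, mostly parallel,
    # and perfectly parallel, each run on two CPUs.
    for p in (0.0, 0.5, 0.9, 1.0):
        print(f"parallel fraction {p:.0%}: {amdahl_speedup(p, 2):.2f}x on 2 CPUs")
```

Under this model a second CPU only doubles throughput when the whole job is parallel; any serial portion pulls the speedup below 2x, which is the "one problem" case the parent post describes.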

        • And again unfortunately, the only way to get good performance out of those designs is by explicitly coding for parallel processing. And that's hard for humans.

          Yes, it's hard now, but it won't always be hard; we will develop tools and methodologies that help us, just as we have before.
        • While that's true, these dual core chips (especially Intel's lame single-memory bus design) really seem targeted towards the desktop market where the impact is greatest, yet cost differential is relatively small (in relation to the total price of a system including software.)

          I'm more interested in what IBM's Cell processor can do. While some problems are definitely single threaded by nature, the majority are not. I have a GIS application that could definitely benefit from as many processors as I can throw
    • Re:Newsflash... (Score:5, Insightful)

      by beezly (197427) <beezly@beezly.or ... k minus language> on Friday April 22 2005, @05:07AM (#12311443) Homepage
      That is not necessarily true.

      I know this article is talking about Intel dual core chips, but for well-designed CPUs with integrated memory controllers (Power5, Ultrasparc IV, Opteron), the difference between a single dual-core CPU and two single-core CPUs is significant.

      On chips with built-in memory controllers, as you increase the number of cores on a chip, the memory bandwidth per core decreases. However, as you increase the number of chips in a system, the memory bandwidth per core remains the same while the number of cores increases.

      That can amount to a big performance difference when running memory-intensive jobs.

      Intel seem to be really losing the plot here at the moment. In multichip configurations, Intel's memory bandwidth already sucks compared to Opteron. Multicore per chip is only going to make it FAR worse.
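
The bandwidth-per-core argument above can be sketched as simple arithmetic. The model is an assumption for illustration: every chip has its own dual-channel DDR400 controller, each channel has a theoretical peak of 3.2 GB/s (400 MT/s times 8 bytes), and all cores on a chip share that chip's channels evenly. Real sustained bandwidth will be lower, and the function name is hypothetical.

```python
DDR400_CHANNEL_GBPS = 3.2  # theoretical peak of one 64-bit DDR400 channel


def bandwidth_per_core(chips: int, cores_per_chip: int,
                       channels_per_chip: int = 2) -> float:
    """Peak memory bandwidth per core (GB/s) with one controller per chip."""
    if chips < 1 or cores_per_chip < 1 or channels_per_chip < 1:
        raise ValueError("all counts must be at least 1")
    # Adding chips adds channels; adding cores to a chip does not.
    total_bandwidth = chips * channels_per_chip * DDR400_CHANNEL_GBPS
    total_cores = chips * cores_per_chip
    return total_bandwidth / total_cores


if __name__ == "__main__":
    print("two single-core chips:", bandwidth_per_core(2, 1), "GB/s per core")
    print("one dual-core chip:   ", bandwidth_per_core(1, 2), "GB/s per core")
```

In this simplified model the two-chip box and the dual-core chip both have two cores, but the dual-core chip gives each core half the peak bandwidth.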
      • Is it just me or did the performance of the single core AMD relative to the Intel dual core in those benchmarks just scream out..."I want the AMD dual core!"?

        Seriously, unless your application can run in the cache on the Intel parts, the AMD is gonna win hands down when running at the same clock rate which translates pretty closely to the same power consumption. AMD will yet be a tad lighter on power consumption just because the stuff is packed more tightly even though it has more active components. Equa
        • Dual Bus memory still doesn't cut it.

          Each Opteron has a dual bus memory controller on board (granted, only DDR400)... but as I increase the number of CPUs, the number of memory busses increases.

          The 4 CPU boxes I'm working on have 8 independent DDR400 memory busses.

          The only obvious way to get your memory bandwidth to scale is to have your memory controllers per CPU (or even better, on the CPU itself).
        • That's NUMA - if your OS is NUMA aware it should try to place processes on the same processor as the memory that contains their data.

          But yes, you're right, processes accessing memory on a different processor will suffer a latency (and to some extent bandwidth) hit. A well designed OS will help to mitigate it to some extent, but it's one of the reasons that CPUs don't scale linearly.
    • Re:Newsflash... (Score:5, Informative)

      by The New Andy (873493) on Friday April 22 2005, @05:11AM (#12311452) Homepage Journal
      ... assuming the OS chooses which threads are executed on which core well enough. If two threads depend on each other heavily and they are running on different cores, you can get really crappy performance.

      So the obvious answer would be to move one of the processes to the other core. However, this isn't trivial. You either have one scheduler per core or one scheduler per operating system. (You can't have a single thread sent to both cores easily - if both cores run the same thing at the same time there will be chaos)

      If you have one per core, then the scheduler trying to get rid of the thread will have to synchronise with the other core, waiting for the other scheduler to come into context, it then has to tell it to add this new process. Obviously, there is a fair bit of overhead, and if my memory serves me correctly, each core in the current chip has its own cache - so now all the stuff which was cached has to be sent to memory (since it is in the wrong cache) and now there is nothing in the cache, making every memory access slow for the next little while. End result - you can transfer a thread between CPUs, but it is costly.

      It is possible to have a single scheduler which can then just dispatch threads to each core as it gets run by each core. The big one here is making the scheduler threadsafe - both CPUs could run the scheduler at the same time, so you have to make sure they don't crap on each other. This is a problem which we have solved already with common synchro-primitives. But, if you just lock the list of threads to run (*), then you will get a whole lot of CPU time wasted just waiting to run the scheduler. It might be acceptable for 2 cores, but it doesn't scale at all.

      (*) You may realise (just as I realised) that a scheduler is more than just a list of threads to run (it is typically implemented as a couple of lists for each priority). The same problem still occurs with more than one list of threads, it is just a bit harder for me to express (proof by bad English skills).

      Finally, I'm expecting someone to tell me that I'm wrong about something I just said. That person is probably correct. My only experience with this stuff is a 3rd year undergrad operating systems course where we played around with OS161 (a toy operating system basically). But, hopefully the end conclusion will be the same: twice the number of processors won't equal twice as much performance, and it is tough to get a fast algorithm that will scale.
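
The single-scheduler design described above (one shared list of runnable threads, protected by one lock) can be sketched in a few lines. This is a toy user-space model, not kernel code and not how any particular OS does it: the class and function names are hypothetical, and Python threads stand in for cores.

```python
import threading
from collections import deque


class GlobalRunQueue:
    """One shared ready list guarded by a single lock."""

    def __init__(self, tasks):
        self._ready = deque(tasks)
        self._lock = threading.Lock()

    def pick_next(self):
        # Every "core" has to take the same lock, so they serialize here.
        # With many cores this lock becomes the bottleneck.
        with self._lock:
            return self._ready.popleft() if self._ready else None


def run_cores(queue, n_cores):
    """Let n_cores workers drain the queue; return what each one ran."""
    ran = [[] for _ in range(n_cores)]

    def core_loop(core_id):
        while True:
            task = queue.pick_next()
            if task is None:
                break
            ran[core_id].append(task)

    threads = [threading.Thread(target=core_loop, args=(i,))
               for i in range(n_cores)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ran
```

The lock guarantees that no task is handed to two cores at once, which is the "chaos" case mentioned earlier; the cost is that every scheduling decision on every core waits on that one lock.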


      • You know, in Windows, you can hit Ctrl+Alt+Del to bring up the task manager, go to the processes tab, right click on a process, and go to "Set Affinity".

        Just sayin'...
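
The same idea can be done programmatically. A minimal sketch, assuming a Linux box rather than the Windows Task Manager route described above: Python's `os.sched_getaffinity` and `os.sched_setaffinity` are only available on some Unix platforms, and the helper name `pin_to_one_cpu` is made up for this example.

```python
import os


def pin_to_one_cpu(pid: int = 0) -> set:
    """Restrict pid (0 means this process) to one CPU it is already allowed on.

    Returns the previous affinity set so the caller can restore it.
    """
    if not hasattr(os, "sched_setaffinity"):
        raise NotImplementedError("CPU affinity calls are not available here")
    previous = os.sched_getaffinity(pid)
    # Pick the lowest-numbered allowed CPU rather than assuming CPU 0 exists.
    os.sched_setaffinity(pid, {min(previous)})
    return previous


if __name__ == "__main__":
    old = pin_to_one_cpu()
    print("now restricted to:", os.sched_getaffinity(0))
    os.sched_setaffinity(0, old)  # put things back the way they were
```

Note that this constrains one process; it does not keep other processes off that CPU.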
      • Re:Newsflash... (Score:4, Informative)

        by dascandy (869781) <dascandy@gmail.com> on Friday April 22 2005, @12:14PM (#12314706)
        Actually, the opposite. Two processes that communicate quite heavily SHOULD be run together on two processors, especially since they will share memory and thus cache lines, plus they can spend time spinning on a lock instead of swapping threads. Given short locks and equal speed, they can work a whole lot more efficiently on a dual-core than on a single-core.

        FYI, it's called Gang Scheduling and has been described for quite some time.
  • A matter of time. (Score:5, Insightful)

    by Renraku (518261) on Friday April 22 2005, @04:57AM (#12311417) Homepage
    How long before applications start figuring that they should have an entire core dedicated to them?

    Windows, for example. What if the next version of Windows requires a dual-core processor to be usable? You know..Windows gets one core to idle at 80% of its capacity..and spills over into the other core when loading a text file.

    If things stayed the way they were now, and the entire other core could be kept separate from the OS and used for gaming/other applications, it would be a great idea.

    But guess what.
    • There's no way you can dedicate a CPU to a particular application.. not in any form of pre-emptive OS.

      However, you can constrain an application to a particular CPU (in windows at least) - task manager, set affinity. That's a great way of preventing an application from using your other CPU. If you want a CPU to run a game only, you would have to go through the entire process list and set the other processes to CPU 1, (or write an app to do that), and then set your game process to CPU 2.

      I think you'll get m
    • Windows is currently idling at between 4 and 7% of CPU on my PC. Admittedly I'm running a 2.66GHz machine, but then again, according to the processes list that CPU time is mostly being taken up by a couple of apps sitting in the background (and 1% by me typing in IE).

      Why on earth you would want to allocate an entire CPU to that, I have no idea.

      Now you might want to allocate a whole CPU to Doom3 or HL2, but I suspect they'll pretty much get that anyway, as applications are assigned to the quietest CPU, as
    • by jtshaw (398319) * on Friday April 22 2005, @07:39AM (#12311980) Homepage
      That would be a total waste of CPU time.

      Very few applications, and OS's in particular, are idle most of the time. I don't know the exact profiling characteristics of Windows, but I do know that in Linux the kernel rarely, if ever, takes up 100% of a CPU, and never does for a prolonged period of time.

      If you locked one CPU and made that for OS tasks only you'd be wasting a lot of clock cycles that another application could happily use. Same would go for locking just about any application to a CPU.
  • Well... (Score:5, Interesting)

    I'm still bumming around with a sub-gigahertz chip, specifically an Athlon T-Bird. I've been out of the loop for too long, can anyone tell me the benefits of using a dual core system (and while we are at it, a 64-bit chip)? Any problems to look out for if I decide to jump on the wagon in my next upgrade?
    • For the average user, if I were to be totally honest, right now there is hardly any real use for either. Dual cores would probably help the system appear faster if the average user is switching around a lot of programs, but for the price you would pay it is not worth your time.

      64-bit, well, um, if the average user, um, well, runs a massive database setup, but it will be more useful soon in the x86 world (Athlon 64 processors though are excellent because of the onboard memory controller and architecture).

      For the
      • Re:Well... (Score:5, Informative)

        by tomstdenis (446163) <tomstdenisNO@SPAMgmail.com> on Friday April 22 2005, @05:22AM (#12311497) Homepage
        AMD64 carries more than just "bigger registers". It has more of them and the actual core is an overall improved K7 process with

        - Slightly longer scheduling buffers
        - 128-bit L1 cache bus
        - Larger instruction window (means it can feed the alus better when constants/etc are found)
        - more registers [and they're bigger]

        They also run cooler and take less power than their K7 brothers.

        Tom
          • Re:Well... (Score:5, Informative)

            by tomstdenis (446163) <tomstdenisNO@SPAMgmail.com> on Friday April 22 2005, @05:39AM (#12311542) Homepage
            .... Me work for AMD? Ha!

            No, I'm just a happy loyal user. I have both a Prescott P4 3.2GHz and an AMD64 Newcastle 2.2GHz...

            For what I do [building software] the AMD64 smokes the P4 ... and does it without getting to 50C or so...

            The AMD approach is just common sense. Be more efficient at what you do and gradually do it faster. Intel went the market route and said "slow clockrate is for pansies!".

            So you end up with a CPU that has a higher clock rate but it doesn't win because the efficiency is too low.

            AES on my AMD64 ranges around 260 [or so] cycles/block. On the P4 with Intel's compiler I get around 410 cycles/block. If you scale 3.2GHz to 2.2GHz that's still effectively 281 cycles [at 2.2GHz]. Doesn't seem like much but keep in mind to get this speed they had to draw more power and run at a higher clock rate.
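
The clock-scaling arithmetic in that paragraph can be checked directly. The cycle counts and clock rates are the poster's figures; the function name is just a label for this sketch.

```python
def equivalent_cycles(cycles: float, from_ghz: float, to_ghz: float) -> float:
    """Cycles a chip at to_ghz would have to hit to match the same wall-clock time."""
    nanoseconds = cycles / from_ghz  # cycles divided by GHz gives nanoseconds
    return nanoseconds * to_ghz


if __name__ == "__main__":
    p4_ns = 410 / 3.2      # P4: time per AES block in nanoseconds
    amd64_ns = 260 / 2.2   # AMD64: time per AES block in nanoseconds
    print(f"P4 at 3.2GHz:    {p4_ns:.1f} ns/block")
    print(f"AMD64 at 2.2GHz: {amd64_ns:.1f} ns/block")
    print(f"P4 expressed in 2.2GHz cycles: {equivalent_cycles(410, 3.2, 2.2):.1f}")
```

The last line works out to about 281.9, matching the "effectively 281 cycles" figure quoted above.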

            I did a benchmark a week ago where I built LibTomCrypt with/without hyperthreading and it took the Prescott with hyperthreading at 3.2GHz to even come close to matching the AMD64 speed. That's only on ~45,000 lines of code.

            Now multiply that by say five or ten to get a larger project.

            I'm not saying the Prescott isn't a neat design. Overall it's efficient enough to be useful. Just the AMD64 eats its breakfast and spanks its mother is all I'm saying. ;-)

            Tom
    • Mainly the difference would be found when running many apps at once. For example if you are ripping songs and playing a game simultaneously then it would be faster than a single core machine. I run many programs like proxy servers, mail servers etc. for the home LAN and also use it for games. So in this situation dual core will help me run the game lag free.

      Although for the speed boost to materialize in games they will have to be coded to use both cores, so one doesn't just idle away. When more programs g
    • Re:Well... (Score:5, Informative)

      by JollyFinn (267972) on Friday April 22 2005, @05:28AM (#12311514)
      The 64-bit part is for anyone with more than 2GB of RAM, plus x86-64 gives you more registers besides being 64-bit, so it speeds up recompiled code.

      Dual core means simply you have TWO processors running. Remember old reviews on SMP dual Celeron A and other such reviews. It gives little for games, lots for certain multithreaded applications, as you have two processors running and doing things. It also helps multitasking, like being able to run an interactive application (Doom 3) while the system is doing some multi-hour compilation in the background.
      Anyway, it mainly keeps the system more responsive when some thread or application hogs the CPU.
      To a lesser degree it also helps in similar situations, where the CPU is tied up with something at EXACTLY the moment you want it to deal with UI stuff.
    • Re:Well... (Score:2, Informative)


      Hi, my older PC was a T-Bird@850MHz with 256 RAM, 160GB HDD- PATA133 (CPU was working at 50 deg C)

      Now I have Athlon64 3000+ (233x8 = 2000MHz) (s939) with 1GB RAM, 200GB HDD- SATA150 (CPU does not go beyond 37 deg C)

      The difference is that with the older pc I compiled and installed LinuxFromScratch in 4 days (well I drank a lot of caffeine products),
      while when I switched to the A64 PC I did the job IN ONLY 4 Hours!

      Unfortunately I was unable to compile a stable x86_64 toolchain to compile a x86_64 Linux
  • Well? (Score:5, Funny)

    by Anonymous Coward on Friday April 22 2005, @05:03AM (#12311431)
    Does this mean my Windows XP machine won't pause when I put in a floppy or CD-ROM? Wow, sign me up.
    • Re:Well? (Score:5, Funny)

      by EpsCylonB (307640) <eps@epsc y l o nb.com> on Friday April 22 2005, @05:07AM (#12311444) Homepage
      no but it will make your internet faster.
    • Does this mean my Windows XP machine won't pause when I put in a floppy or CD-ROM? Wow, sign me up.

      Nope, that feature isn't scheduled until after we have 16 cores on chip, 32 GB of RAM, 10 Terabytes of HD storage, and optical media is at 1 Terabyte per disc. They said it was a weird hardware limit and it would require at least that much processing power for Windows XP to read a floppy or CD-ROM and do anything else. You don't even want to know what it will take for Longhorn to do that.
  • Something missing (Score:5, Interesting)

    by FidelCatsro (861135) <fidelcatsro@NosPAm.gmail.com> on Friday April 22 2005, @05:04AM (#12311438) Journal
    What this test really was missing was a direct comparison to SMP systems, which for me makes the results entirely boring and expected.
    If he had shoved in a dual Opteron set-up and a dual Xeon set-up then it may have been a little more interesting, though as it stands it's like stating the obvious.
  • Anandtech (Score:5, Informative)

    by iamthemoog (410374) on Friday April 22 2005, @05:20AM (#12311488) Homepage
    Has the new dual core opteron up against a quad Xeon with 8MB cache, amongst many others.

    Well worth a read:

    http://www.anandtech.com/cpuchipsets/showdoc.aspx?i=2397 [anandtech.com]
    • Absolutely it is worth a read. A dual dual-core AMD performs substantially better than a quad-Xeon, especially in the database tests. The AMD duallies also creamed the Intel duallies in every test.

      For the first time, AMD indisputably has the upper hand in the mid-high end server market. If AMD can ship in volume, Intel will be very worried.

  • One thought I had... (Score:5, Interesting)

    by Kjella (173770) on Friday April 22 2005, @05:40AM (#12311548) Homepage
    ...and I'm not quite sure if it's a good one, but for desktops:

    The foreground program has a dedicated core. If you switch programs, put the old one on the "other" core. The new one is moved over from the "other" core. Essentially, your current program has full responsiveness (assuming you don't do things that lock up the application itself), no context switches, no other programs that can run some weird blocking call (on a single core machine, it certainly looks that way at least, especially CD-ROM operations).

    Granted you could end up with your fg processor being idle most of the time. But the way many people work with the computer, the foreground program is the ONLY time-critical application.

    Kjella
  • by Anonymous Coward
    Or do I have to wait for Service Pack 3?

    Yours,

    Gator Fan.
  • Sluuuurp..... (Score:3, Interesting)

    by Diakoneo (853127) on Friday April 22 2005, @06:00AM (#12311613)
    That last page raised my eyebrows. 291 Watts under load, that's some serious power draw compared to what I'm used to. And that had to be kicking out some serious heat, too.
    Anybody know what is the draw for a 4x Xeon system? I'd be interested in seeing how they compare.
    I wonder at what point the facilities people will want to use the server farm to heat the building, too. A weird convergence, the PC world is becoming more like the old mainframe world.
  • by Aqua OS X (458522) on Friday April 22 2005, @06:01AM (#12311616) Homepage
    Who the hell runs benchmarks with FireFox and iTunes?

    If you ask me, the people that desperately need the ability to multitask are folks in the creative industry. Every 5 minutes we bounce back and forth between massive applications rendering huge files.

    Nothing sucks more than opening a 400dpi Photoshop document and not having InDesign respond since your single core CPU is being bogarted.

    SMP is probably the only reason I still find my crusty old Dual 450 G4 useful. It does things slowly, but it doesn't "feel" slow. If something is taking its sweet ass time, I can usually do something else without waiting years for windows and menus to draw.
    • Surely those kinds of apps are more memory bound than CPU bound?
    • by Jameth (664111) on Friday April 22 2005, @07:30AM (#12311921)
      You're dead on accurate with that one. I want a benchmark that will tell me what kind of performance I can expect if I have a logo I am editing in Illustrator that I open in Photoshop to clean up a bit and then insert into a document in InDesign while I'm trying to make it look similar in the webpage I'm putting together in TextPad, viewing both final documents through Acrobat, IE, FireFox, and Safari, all at the same time. (While listening to music.)

      And no, I'm not being sarcastic. Although I rarely do all of that at once, it has been known to happen. And don't even get me started about what happens when I have something compiling behind all of that. I'm just thankful, in a way, that since I don't do 3D work I'm not tossing Maya into that mix.
  • by pmadden (209229) on Friday April 22 2005, @06:17AM (#12311658) Homepage Journal
    I'll probably get flamed for this....

    Increased performance in CPUs has normally come from faster clock rates and more complex circuitry. As we all know, Intel (and the others) have bailed out on faster clocks. If you add more complex circuitry, the logic delay increases--to keep the clock rate up, you have to burn power.

    What does this mean? The old-fashioned ways of getting more performance are dead--if you try it, the chip will burn up. It's easier to build two 1X MIP cores than one 2X MIP core. Like it or not, dual cores are the only solution; with transistor scaling, we'll have to go to 4, 8, and 16 cores in the next few years. IBM went dual-core with the PowerPC in 2001. Intel, AMD, and Sun are just following suit.

    Not bummed out yet? Massive parallelism works well for people doing scientific computing, but for the average joe, it's useless. I don't care how fast a processor is--I usually have one task that will crush it--but rarely do I have two time-critical things to worry about at the same time. In the article referenced, they had to work hard to find things that would test the dual-core features. Parallel computing and multiple cores sound great. History buffs will know about Thinking Machines, Meiko, Kendall Square, MasPar, NCUBE, Sequent, Transputer, Parsytec, Cray, and so on.... Not a happy ending.

    So.... we can't get more single processor performance without bursting into flames. And parallel machines are only useful to a small market. IMO, it's gonna get grim. (And before anyone says new paradigm of computing to take advantage of the parallel resource, put down the crack pipe and think about it--we've been waiting for that paradigm for about 40 years. Remember occam? I thought not.)
    • If they can't ramp up hardware any more, the next revolution in computing will not be faster hardware, it will be cleaner, more efficient code. Personally, I think that there's a lot of potential left, if not with silicon, then with diamond wafer chips, or optical computing.
    • by ciroknight (601098) on Friday April 22 2005, @10:33AM (#12313657)
      Well you're right about what you were saying, those words would attract a good deal of flames. But everything has its place and there's a place for everything. Lemme explain.

      Clockspeed is the easiest race, if you want to think of the CPU industry as a continuous race. All you have to do to crank out a faster CPU is continually shrink the die (because smaller gates flip faster), and make sure that everything is arranged neatly on the chip. When you hit thermal walls like we are now, it's simply time to reduce the voltage, and shrink the die again.

      The only problem is, Intel's flagship for doing this now, happens to be one with a lot of baggage. The Netburst core design pretty much dictates there is to be at least two of everything, and both of them should be running all the time, especially if Hyperthreading is on. This effectively doubles your transistor count (though in reality it is less than that; there's only a single copy of bus administration, micro-op decode, etc). Keeping them on all of the time also helps jump the heat production.

      But here's a truth; their CPU clock game could still be running if they would like it to. The Pentium-M is still running extremely cool. Shrink it to a 90 nm core, use SOI, strained silicon, more of their substrata magic, and a healthy dose of SpeedStep, and you could see a Pentium-M hitting 3.5GHz clockspeeds that would put both the Athlon 64 and the Pentium 4 to shame. Sadly, to build this processor is to admit defeat with the Netburst core, and Intel's being very stubborn.

      On the other hand, I believe AMD's got some magic they haven't used yet up their sleeve. Though honestly I couldn't tell you what it is. There has to be a reason they aren't playing up the Turion more other than the fact it isn't scaling down as far as the Pentium-M can. I'm also surprised they're being so slow about ramping their clockspeeds, but this is probably just so their thermal profiles look superior to Intel's. A 3GHz Opteron could easily decimate a dual Xeon setup, but at the same time would probably produce just as much heat, and I think AMD would see that as a defeat.
    • Interesting issues, but I disagree with you on some points.

      First, there's really nothing new about multiple cores IMO. Instead of "core" try the term "execution unit". CPUs used to have a single execution unit. Then something like an Integer Unit was added. Remember the days of math coprocessors? They soon got moved into the CPU by adding more, and more specialized, Integer Units and a Floating Point Execution Unit. Now you have CPUs that have multiple execution units, to the point of having multiple Integ
      • Yeah, actually I think this might become a big boon for gamers (such as myself). The stuff that makes games interesting to me is AI (which is a very wide field, certainly, but I think of such things as finally being able to use reasoning engines (F.E.A.R. is the first game I know of that uses one) and better pathfinding (AI can now use focussed D* instead of cheating with A*, etc.)), all of which will finally get some love and tender care.

        With only one CPU, AI was always the ugly step child. "Yeah, sure.. we ca

  • I feel a good use for Dual-core systems is to put the OS on one core, including all explorer.exe instances and threads.

    The operating system should employ a smart system of monitoring CPU usage per thread and move the high-usage threads to the other core.

    I wonder though, on a slightly different topic - heat dispersion: nobody seems to talk about it - but two cores mean twice as much heat. How the hell do they do away with the heat? It's disappointing but they might be speedstepping/downclocking the cores
  • It's worth noting that each scenario consists of only desktop applications, and it'd still be interesting to see some common server benchmarks, such as a database or web server.


    Except that this is a desktop processor, that won't be shipping in server systems. So in actual fact it's worth noting that the entire point is that each scenario consists of only desktop applications.
    • In general and assuming a non-broken cache architecture, a 2 CPU/core solution will feel faster than a single CPU solution with twice the CPU frequency.
      The total number of CPU cycles is the same, but the average queue length for a process waiting for the CPU is half, i.e. the latency before your process is scheduled is lower, making it "feel" faster.
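
A rough way to test the waiting-time claim is a textbook queueing model. This is an assumption layered on top of the comment, not something the poster specified: jobs arrive at random (Poisson arrivals), run times are exponentially distributed, and the two setups are modelled as an M/M/1 queue with one double-speed server versus an M/M/2 queue with two normal-speed servers. The arrival and service rates are arbitrary example values.

```python
from math import factorial


def erlang_c(servers: int, offered_load: float) -> float:
    """Probability that an arriving job has to wait in an M/M/c queue."""
    utilisation = offered_load / servers
    if not 0.0 <= utilisation < 1.0:
        raise ValueError("system must be stable: offered_load < servers")
    busy_term = offered_load ** servers / (factorial(servers) * (1.0 - utilisation))
    idle_terms = sum(offered_load ** k / factorial(k) for k in range(servers))
    return busy_term / (idle_terms + busy_term)


def mean_wait(servers: int, arrival_rate: float, service_rate: float) -> float:
    """Mean time a job sits in the queue before it first gets a CPU."""
    offered_load = arrival_rate / service_rate
    return erlang_c(servers, offered_load) / (servers * service_rate - arrival_rate)


if __name__ == "__main__":
    arrival_rate = 1.0  # jobs per unit time (example value)
    slow_cpu = 1.0      # jobs per unit time each slow core can finish
    print("one fast CPU, wait before running: ", mean_wait(1, arrival_rate, 2 * slow_cpu))
    print("two slow CPUs, wait before running:", mean_wait(2, arrival_rate, slow_cpu))
```

With these example rates (50% load) the two-core wait is two thirds of the single-CPU wait, so lower but not exactly half. The same model also says the total time to finish a job, once its run time on the slower core is included, favours the single fast CPU, so "feels faster" here is specifically about how long a job waits before it starts.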
    • If the article is correct, Intel could build in a coffee pot for those long nights of full load modeling.

      Ah, this would explain why Asus [asus.com] released a barebones solution called S-presso. Just add a couple of dual cores, water cooling, a fine Italian pump, and poof, the next generation in computers: the e-s-presso. As the water travels over each dual core chip it's superheated quickly and the Italian pump takes over. Through the grounds to your demitasse mugs.

      Warning, not drinking enough espresso may