
NVidia Accused of Inflating Benchmarks 440

Junky191 writes "With the NVidia GeForce FX 5900 recently released, this new high-end card seems to beat out ATI's 9800 pro, yet things are not as they appear. NVidia seems to be cheating on their drivers, inflating benchmark scores by cutting corners and causing scenes to be rendered improperly. Check out the ExtremeTech test results (especially their screenshots of garbled frames)."
  • by binaryDigit ( 557647 ) on Thursday May 15, 2003 @09:08AM (#5963235)
    Isn't this SOP for the entire video card industry? Every few years someone gets caught targeting some aspect of performance to the prevailing benchmarks. I guess that's what happens when people wax on about "my video card does 45300 fps in quake and yours only does 45292, your card sucks, my experience is soooo much better". For a while now it's been the ultimate hype driven market wrt hardware.
  • I don't know (Score:5, Insightful)

    by tedgyz ( 515156 ) on Thursday May 15, 2003 @09:10AM (#5963247) Homepage
    I have read two reviews on AnandTech [anandtech.com] and [H]ardOCP [hardocp.com]. Neither of them made any such accusations. They both said visual quality was fine.

    Targeting benchmarks is just part of the business. When I was on the compiler team at HP, we were always looking to boost our SPECint/fp numbers.

    In a performance driven business, you would be silly not to do it.
  • whatever (Score:2, Insightful)

    by JeffSh ( 71237 ) <jeffslashdot AT m0m0 DOT org> on Thursday May 15, 2003 @09:10AM (#5963251)
    I looked at the photos, and it seems to me to be just a driver fuckup on the 3dmark benchmarks.

    Since when did rendering errors caused by driver problems become "proof" of a vendor inflating benchmarks?

    And this story was composed by someone with the qualifications of "Website content creator, who likes video games a lot" -- not a driver writer, not anyone technically inclined beyond the typical geek who plays a lot of video games and writes for a website called "EXTREME tech" because, you know, their name makes them extreme!

    note: I'm not an Nvidia fanboy; I just bought an ATI Radeon 9500. I am just skeptical of far-fetched, idiotic derivations of "fact" when all he has are some screenshots of a driver screwing up the render of a scene.
  • by Anonymous Coward on Thursday May 15, 2003 @09:15AM (#5963289)
    "Because nVidia is not currently a member of FutureMark's beta program, it does not have access to the developer version of 3DMark2003 that we used to uncover these issues."

    Wow, some prerelease software is having issues with brand-new drivers? Who would have thought... Why not wait for the official release of the software and the drivers before drawing hasty conclusions?

    In addition, who really cares about 3DMark? Why not spend the time wasted on the 3DMark benchmark on benchmarking real games? After all, 60 fps in a real game tells a lot more about performance than 5784 3DMarks.
  • Re:Yeah well... (Score:3, Insightful)

    by truenoir ( 604083 ) on Thursday May 15, 2003 @09:15AM (#5963290)
    Same deal with Tom's Hardware. They did some pretty extensive benchmarking and comparison, and the 5900 did very well in real-world games (including the DOOM III preview benchmark). I'm inclined to believe the driver problem nVidia claims. Especially since it's nVidia and not ATI, they'll likely fix it quickly (not wait 3 months until a new card comes out... not that I'm still bitter about my Rage Fury).
  • by Surak ( 18578 ) * <(moc.skcolbliam) (ta) (karus)> on Thursday May 15, 2003 @09:16AM (#5963297) Homepage Journal
    Goodbye, karma. ;) And, realistically, what does it matter? If two cards are similar in performance, but one is just a little bit faster, in reality it's not going to make *that* much of a difference. You probably wouldn't even notice the difference in performance between the new nVidia card and the ATI 9800, so what all the fuss is about, I have no clue.
  • Sigh... (Score:3, Insightful)

    by Schezar ( 249629 ) on Thursday May 15, 2003 @09:18AM (#5963310) Homepage Journal
    Back in the day, Voodoo cards were the fastest (non-pro) cards around when they first came out. A significant subset of users became Voodoo fanboys, which was ok, since Voodoo was the best.

    Voodoo was beaten squarely by other, better video cards in short order. The fanboys kept buying Voodoo cards, and we all know what happened to them ;^)

    GeForce cards appeared. They were the best. They have their fanboys. Radeon cards are slowly becoming the "other, better" cards now.

    Interesting....

    (I'm not sure what point I was trying to make. I'm not saying that nVidia will suck, or that Radeon cards are the best-o. The moral of this story is: fanboys suck, no matter their orientation.)
  • by Anonymous Coward on Thursday May 15, 2003 @09:20AM (#5963324)
    They all do it. You just need a proper NT-style WHQL test, which tests each pixel of the output to make sure it's rendered according to spec. Would you believe this isn't done, so all the tests carried out by manufacturers tell you only how quickly `something` was rendered?
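
A minimal sketch of the per-pixel conformance check the comment above asks for, assuming reference frames from a known-good renderer are available on disk. The file names and tolerance are hypothetical; a real WHQL-style suite would check many frames and many pipeline states.

```python
# Hedged sketch: compare a card's rendered frame against a reference rendering,
# pixel by pixel. Paths and tolerance are illustrative, not part of any real test kit.
import numpy as np
from PIL import Image

def fraction_of_bad_pixels(rendered_path, reference_path, tolerance=2):
    """Return the fraction of pixels whose worst per-channel error exceeds `tolerance`."""
    rendered = np.asarray(Image.open(rendered_path).convert("RGB"), dtype=np.int16)
    reference = np.asarray(Image.open(reference_path).convert("RGB"), dtype=np.int16)
    if rendered.shape != reference.shape:
        raise ValueError("rendered and reference frames have different sizes")
    worst_channel_error = np.abs(rendered - reference).max(axis=2)
    return float((worst_channel_error > tolerance).mean())

if __name__ == "__main__":
    bad = fraction_of_bad_pixels("frame_card.png", "frame_reference.png")
    print(f"{bad:.2%} of pixels deviate from the reference beyond tolerance")
```

Allowing a small tolerance matters because rendering specs leave some rounding latitude to the hardware; an exact byte-for-byte match would flag perfectly legitimate implementations.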
  • by BenjyD ( 316700 ) on Thursday May 15, 2003 @09:21AM (#5963338)
    The problem is that people are buying cards based on these silly synthetic benchmarks. When performance in one arbitrary set of tests is so important to sales, naturally you're going to see drivers tailored to improving performance in those tests.

    Of course, if Nvidia's drivers were released under the GPL, none of the mud from this would stick as they could just point to the source code and say "look, no tricks". As it is, we just get a nasty combination of the murky world of benchmarks and the murky world of modern 3D graphics.
  • by JDevers ( 83155 ) on Thursday May 15, 2003 @09:26AM (#5963367)
    Well, to tell you the truth... I LIKE application-specific optimization as long as it is general-purpose enough to be applied across the board to that application. In this case, however, the corners are cut in a benchmark and are targeted SPECIFICALLY at the scene as rendered in the benchmark. If ATI had done that kind of thing in Quake, only the pre-recorded timedemos would have been faster, not actual gameplay; that wasn't the case, though: the game itself was rendered faster. The only poor choice they made was how they recognized that Quake was what was being run; optimizing a specific rendering path would have been more general-purpose and would have seemed a lot less like cheating.

    This on the other hand, if true, could be construed as NOTHING BUT cheating. Especially when coming from a company that said they didn't support 3Dmark 2003 because it was possible for companies to optimize their drivers specifically FOR such benchmarks...well, they proved their point.
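
To make the distinction in the comment above concrete, here is a purely illustrative sketch (in Python, which no real driver is written in) of the difference between keying an "optimization" on the application's name and keying it on the workload itself. The executable names and the workload flag are invented for the example.

```python
# Illustrative only: name-based special-casing vs. a general-purpose optimization.
# No real driver looks like this; the point is why the first approach reads as cheating.
import os
import sys

SPECIAL_CASED_APPS = {"quake3.exe", "3dmark03.exe"}  # hypothetical detection list

def pick_render_path(heavy_texturing: bool) -> str:
    exe_name = os.path.basename(sys.argv[0]).lower()
    if exe_name in SPECIAL_CASED_APPS:
        # Only helps the detected title, and only while its scenes match what was tuned for.
        return "hand-tuned shortcut for this one application"
    if heavy_texturing:
        # Applies to any application that presents this workload.
        return "general texture-heavy path"
    return "default path"

print(pick_render_path(heavy_texturing=True))
```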
  • by Pulzar ( 81031 ) on Thursday May 15, 2003 @09:33AM (#5963421)
    Why do you feel obligated to post the "I don't care about the zillion fps in quake"? Do you post a similar message to every story that you don't care about?

    This is a big deal to people who care -- it insults the reviewers who spent hours benchmarking their card, and it insults the users who bought/will buy their card. There are people who care, and people who do want the fastest card for a reason, and they are interested to hear from other people who care, and not the people who don't!
  • by newsdee ( 629448 ) on Thursday May 15, 2003 @09:35AM (#5963442) Homepage Journal
    Now, cards are tweaked towards improved performance within a particular benchmark

    This is always the case with any chosen performance measurement. Look at managers asked to bring quarterly profits. They tend to be extremely shortsighted...

    Moral of the story: be very wary of how you measure, and always add a qualitative side to your review (e.g. in this case, "driver readiness/completeness").

  • by Anonymous Coward on Thursday May 15, 2003 @09:35AM (#5963444)
    This is why all software and hardware should be open-source.

    Right, and why all your bank records should be public (just in case you are stealing and have illegal income). And all your phone records should be public, as well as details of your whereabouts (just in case you're cheating on your wife/skipping class). And of course, why the govt should have access to all your electronic transmissions (internet, cell, etc), just in case you're doing something that they don't like.
  • Random Rail (Score:1, Insightful)

    by Anonymous Coward on Thursday May 15, 2003 @09:40AM (#5963477)
    How difficult would it be to have a "random rail" generator? This would be fair for review purposes, just generate a "random rail" path for the specific review and run the benchmark with each card. This is essentially what they did to discover the "driver bug" anyway, so why not make that a 3D benchmark feature?

  • by SubtleNuance ( 184325 ) on Thursday May 15, 2003 @09:49AM (#5963562) Journal
    ATi does make better hardware but their software (drivers) are terrible and not very well supported.

    that is an old accusation that had a kernel of truth 24 months ago, but I've used ATI cards for years, and they have been rock solid since right around the time forums like this started accepting that schlock as 100% truth.

    Bottom line: don't believe the hype. This is just *not* true.
  • by Kegetys ( 659066 ) on Thursday May 15, 2003 @09:53AM (#5963599) Homepage
    I would suspect something like this too... I'm not a 3D card expert, but from what I understood, the way the "cheating" was found was by stopping the whole scene, freezing everything going on (including all processing of culling information). When you then start rotating the camera around, you are supposed to get rendering anomalies, since the scene is optimised to be viewed from a different angle. Why this happens only with the GeForce I don't know, but I would guess it's because nvidia and ati drivers and cards work very differently, since they are designed by very different people. Of course, it is possible that nvidia is "cheating" at the driver level, but before making that kind of accusation they should get solid proof, and especially let nvidia give their own explanation first. Then again, if this is a feature, happening because of some advanced optimization in the card/drivers, nvidia probably doesn't want to give an accurate explanation, since that would reveal the method for its competitors to use.
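
The "frozen scene, free camera" test this comment describes can be pictured with a toy culling example. This is a hedged sketch with made-up planes and points, not anything from an actual driver: culling against planes precomputed for the benchmark's fixed camera rail looks correct from the rail but silently drops geometry once the camera wanders, while culling against the current camera's frustum stays correct from any angle.

```python
# Toy illustration of path-specific culling vs. proper frustum culling.
# All planes, points, and names are invented for the example.
from dataclasses import dataclass

@dataclass
class Plane:
    # Plane defined so that n.p + d >= 0 means the point is on the visible side.
    nx: float
    ny: float
    nz: float
    d: float

    def in_front(self, p):
        return self.nx * p[0] + self.ny * p[1] + self.nz * p[2] + self.d >= 0.0

# Planes precomputed for the benchmark's fixed fly-path: "nothing behind z=0 is ever seen".
RAIL_CLIP_PLANES = [Plane(0.0, 0.0, 1.0, 0.0)]

def cull_with_rail_planes(points):
    # Fast, but only correct while the camera stays on the recorded rail.
    return [p for p in points if all(pl.in_front(p) for pl in RAIL_CLIP_PLANES)]

def cull_with_camera_frustum(points, frustum):
    # Correct for any camera: planes come from the current view, not a recorded path.
    return [p for p in points if all(pl.in_front(p) for pl in frustum)]

scene = [(0.0, 0.0, 5.0), (0.0, 0.0, -5.0)]          # one object in front of z=0, one behind
free_camera_frustum = [Plane(0.0, 0.0, -1.0, 10.0)]  # a free camera that can see behind z=0

print(cull_with_rail_planes(scene))                          # the object behind z=0 is dropped
print(cull_with_camera_frustum(scene, free_camera_frustum))  # both objects survive
```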
  • Yeah, but they all do it, and it isn't strictly video board manufacturers either. That '80 GB' hard drive you just bought isn't 80 GB; it's (depending on the manufacturer) either an 80,000,000,000-byte hard drive or an 80,000 MB hard drive... either way it isn't by any stretch of the imagination 80 GB. That Ultra DMA 133 hard drive, BTW, can't really do a sustained 133 MB/s transfer rate either; that's the burst speed, and you'll probably NEVER actually achieve that transfer rate in actual use. That 20" CRT you just bought isn't 20", it's 19.2 inches of viewable area. A 333 MHz FSB isn't 333 MHz, it's 332-point-something MHz, and even then it isn't really 333 MHz because it's really more like 166 MHz doubled, since DDR memory allows you to read and write on the high and low side of the clock. That 2400 DPI scanner you just bought is only 2400 DPI with software interpolation. Your 56K modem can really only do 53K due to FCC regulations that cap the downstream rate. The list goes on.
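
For the hard drive example in the comment above, the decimal-versus-binary discrepancy works out as follows; the arithmetic is the only thing being shown, and the drive itself is hypothetical.

```python
# An "80 GB" drive counted in decimal bytes, as reported in binary gigabytes (GiB).
advertised_bytes = 80_000_000_000
gib = advertised_bytes / 2**30          # 1 GiB = 1,073,741,824 bytes
print(f"{advertised_bytes:,} bytes = {gib:.1f} GiB")   # about 74.5 "GB" as the OS reports it
shortfall = 80 - gib
print(f"apparent shortfall: {shortfall:.1f} GB ({shortfall / 80:.1%})")
```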
  • by satch89450 ( 186046 ) on Thursday May 15, 2003 @10:00AM (#5963652) Homepage
    [Nvidia] used to be great.. but now i have my doubts

    Oh, c'mon. Benchmark fudging has been an ongoing tradition in the computer field. When I was doing computer testing for InfoWorld, I found some people in a vendor's organization would try to overclock computers so they would do better in the automated benchmarks. ZD Labs found some people who "played" the BAPCo graphics benchmarks to earn better scores by detecting that a benchmark was running and cutting corners.

    <Obligatory-Microsoft-bash>

    One of the early players was Microsoft, with its C compiler. I have it from a source at Microsoft that when the Byte C-compiler benchmark figures were published in the early 1980s, Microsoft didn't like being at the back of the pack. "It would take six months to fix the optimizer right." It would take two weeks, though, to put in recognizers for the common benchmarks of the time and insert hand-optimized "canned code" to better their score.

    </Obligatory-Microsoft-bash>

    Microsoft wasn't the only one. How about a certain three-letter company who fudged their software? You have multiple right answers to this one. :)

    When the SPECmark people first formed their benchmark committee, they knew of these practices and so they made the decision that SPECmarks were to be based on real programs, with known input and output, and the output was checked for correct answers before the execution times would be used.

    And now you know why reputable testing organizations who use artificial workloads check their work with real applications: to catch the cheaters.

    Let me reiterate an earlier comment by Alan Partridge: it's idiots who think that a less-than-one-percent difference in performance is significant. (Whether the shoe fits you is something you have to decide for yourself.) What benchmark articles don't tell you is the spread of results they obtain through multiple testing cycles. When I was doing benchmark testing at InfoWorld, it was common for me to see trial-to-trial spreads of three percent in CPU benchmarks, and broader spreads than that with hard-disk benchmarks. Editors were unwilling to admit to readers that the collected results formed a "cloud" -- they wanted a SINGLE number to put in print. ("Don't confuse the reader with facts, I want to make the point and move on.") I see that in the years since I was doing this full-time, editors are still insisting on "keep it simple" even when it's wrong.

    Another observation: when I would trace back hardware and software that was played with, the response from upper management was universally astonishment. They would fall over backwards to ensure we got a production piece of equipment. To some extent, I believed their protestations, especially when bearded during their visits to our Labs. One computer company (name withheld to protect the long-dead guilty) was amazed when we took them into the lab and opened up their box. We pointed out that someone had poured White-Out over the crystal can, and that when we carefully removed the layer of gunk the crystal was 20% faster than usual. Talk about over-clocking!

    So when someone says "Nvidia is guilty of lying" I say "prove it", further saying that you have to show with positive proof that the benchmark fudging was authorized by top management. I can't tell from the article, but I suspect someone pulled a fast one, and soon will be joining the very long high-technology bread line.

    Pray the benchmarkers will always check their work.

    And remember, the best benchmark is YOUR application.

  • by onion2k ( 203094 ) on Thursday May 15, 2003 @10:16AM (#5963794) Homepage
    So, because he isn't interested in this boring, repetitive, inane and stupid ego-massaging 'my computer is more 1337 than yours' willy-waving competition, his opinion is invalid?

    The trouble with free speech is that everyone has it.
  • by satch89450 ( 186046 ) on Thursday May 15, 2003 @10:19AM (#5963817) Homepage
    Nobody would test an FPU based on how many times per second it could take the square root of seven.

    Really? Do you write benchmarks?

    I used to write benchmarks. It was very common to include worst-case patterns in benchmark tests to try to find corner cases -- the same sort of things that QA people do to try to find errors. For example, given your example of a floating-point unit: I would include basic operations that have 1-bits sprinkled throughout the computation. If Intel's QA people had done this with the Pentium, they would have discovered the un-programmed quadrant of the divide look-up table long before the chip was committed to production.

    Why do we benchmark people do this? Because we are amazed (and amused) at what we catch. Hard disk benchmarks that catch disk drives that can't handle certain data patterns well at all, even to the point of completely being unable to read back what we just wrote. My personal favorite: how about modems from big-name companies that drop data when stressed to their fullest?

    The SPECmark group recognizes that the wrong answer is always bad, so they insist that in their benchmarks the unit under test get the right answer before they even talk of timing. This is from canned data, of course, not "generating random scenes." The problem with using random data is that you don't know if the results are right with random data -- or at least that you get the results you've gotten on other testbeds.

    Besides, how is the software supposed to know how the scene was rendered? Read back the graphics planes and try to interpret the image for "correctness"? First, is this possible with today's graphics cards, and, second, is it feasible to try? Picture analysis is an art unto itself, and I suspect that being able to check rendering adds a whole 'nuther dimension to the problem. I won't say it can't be done, but I will say that it would be expensive.

    For FPUs, it's easy: have a test vector with lots of test cases. Make sure you include as many corner cases as you can conceive. When you make a test run, mix up the test cases so that you don't execute them in the same order every pass. (This will catch problems in vector FPU implementations.) Check those results!

    Now, if you will tell me how to extend that philosophy to graphic cards, we will have something.
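
A hedged sketch of the corner-case test-vector approach described in this comment: awkward operands (including the pair that exposed the Pentium FDIV flaw), the execution order shuffled between passes, and every answer checked before any timing would be trusted. The vectors and harness are illustrative, not taken from any real suite.

```python
# Sketch of a correctness-first FPU division test: fixed vectors with known-good
# answers, run in a different order each pass. Vectors are illustrative only.
import random

# (a, b, expected a/b) -- expected values would come from a trusted reference machine.
TEST_VECTORS = [
    (4195835.0, 3145727.0, 4195835.0 / 3145727.0),  # the operand pair that exposed the Pentium FDIV bug
    (1.0, 3.0, 1.0 / 3.0),                           # non-terminating binary fraction
    (2.0 ** -1022, 2.0 ** 52, 2.0 ** -1074),         # exact result is the smallest subnormal double
]

def run_once(vectors):
    random.shuffle(vectors)                          # vary execution order between passes
    for a, b, expected in vectors:
        got = a / b
        if got != expected:
            raise AssertionError(f"wrong answer for {a}/{b}: got {got!r}, expected {expected!r}")

for _ in range(1000):
    run_once(list(TEST_VECTORS))
print("all divisions matched the expected results; timing could now be measured")
```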

  • by Maudib ( 223520 ) on Thursday May 15, 2003 @11:30AM (#5964522)
    So who cares? It matters little to me how fast something is in a synthetic benchmark if there is no correlation to real-world applications, and I am sure Nvidia isn't doing this in games, because who would buy a card that didn't properly render most scenes?

    I dunno, but synthetic benchmarks seem a bit irrelevant as does what Nvidia does in them. Show me how many FPS it gets in Q3A, that I care about.
  • by The Ego ( 244645 ) on Thursday May 15, 2003 @12:00PM (#5964828)
    What you are describing isn't benchmarking, it's stress testing.

    Benchmarks are meant to predict performance. While it is essential to check the validity of the answer (wrong answers can be computed infinitely fast), the role of a benchmark isn't to check never-seen-in-practice cases or so-rarely-seen-in-practice-that-running-100x-slower-won't-matter ones.

    That reminds me of the "graphic benchmark" used by some Mac websites that compares Quickdraw/Quartz performance when creating 10k windows. Guess what, Quartz is slower, because Quartz windows are a lot more powerful/heavyweight than Quickdraw ones. But who gives a fuck, how often do you need to create 10k windows in a hurry? No one, apart from those OS 9 zealots who are looking for ways to bash OS X. A realistic benchmark might check at most tens of windows, but the conclusion would probably be that the difference in speed isn't observable by humans.

    A good benchmark can only be judged by comparing its execution profile against what users will run. If it's not reflecting the reality, it's not an appropriate prediction of the performance for the user. And it's not a binary property. While Spec is by definition perfect for anyone that only runs Spec, it is known and accepted to be imperfect at anything else, and a completely useless predictor in some cases (as in very low statistical correlation between Spec scores and speed at running Foo). It's just a "best effort" suite of tests for workstation applications. I'm talking SpecINT / SpecFP here, other Spec benchmarks exist because (gasp!) SpecINT/FP don't cover the whole computing spectrum.

    You also don't seem to have much of a clue about how processors are really tested. Guess what, the processor people do all that you describe and more, much more. All day long, on many, many samples, for months on end, in good/bad conditions (thermal, electrical). It's just that no test suite can catch all the problems, so defects will always slip by. _Always_, even if the logic is formally proven correct, since processors aren't mathematical entities but are subject to electrical / manufacturing variations. Even if no problem exists today on a given CPU, take a hundred of them from various batches, power-cycle them a few million times, run them for a few years in marginal conditions and check again.
  • STFU - who cares? (Score:2, Insightful)

    by FreakerSFX ( 256894 ) on Thursday May 15, 2003 @12:00PM (#5964835)
    Did you see what they had to do to "prove" the cheat? Read the article. In other game tests the card beats the ATI 9800PRO so obviously it is faster. (see anandtech, hardocp, tom's hardware, etc if you really care).

    The things they're being accused of reduce the work sent to the graphics engine and don't affect image quality - it's called OPTIMIZATION: the fastest frame rate with the best image quality.

    Man someone must have spent hours in front of their computer coming up with a way to get a sensational story like this. ATI has done it, and so does everyone else but what sucks is that this "news" is being flogged everywhere like it's the most incredible piece of news ever.

    In this case it's not ANYWHERE NEAR as bad as changing the card's performance based on the name of the program that's being run - I think most people remember that one.

    In this case it's a non-story. And yes, we all pay too much attention to benchmarks. I am now one to two generations behind the leading edge and plan to stay there. It's far less expensive than driving a new car off the lot every four months.
  • Re:Random Rail (Score:1, Insightful)

    by Anonymous Coward on Thursday May 15, 2003 @12:09PM (#5964946)
    Actually the solution is far easier than you think.

    Run the random rails, not really random (they never are anyways) but with a "random seed" number -- so you can generate your "random" path, type in the seed number, and the benchmark runs THAT fly path.

    Want someone else to replicate your results? Give them the seed number and let them test on the same path.

    Want to run the test w/ multiple cards? Just remember to write down your seed number.

    This is already done with a lot of the old "random world map/dungeon map" generators for D&D players. I can give 2-3 numbers to a friend with the same program and "poof", he generates the same dungeon.

    The only thing we'd have to be sure on is that certain seed numbers didn't get to be "common" such that the graphics card makers specifically optimized for those seeds -- but that's a bit of a stretch given that you can test with the "common" seed as well as 2-3 other random seeds just to keep them honest.
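
A hedged sketch of the seeded "random rail" idea from this thread: the same seed always reproduces the same fly-path, so a reviewer can publish the seed for reproducibility, while a vendor can't pre-bake optimizations for every possible seed. The waypoint format and coordinate ranges are invented for illustration.

```python
# Deterministic "random rail": one seed -> one reproducible camera path.
import random

def generate_rail(seed, waypoints=8):
    rng = random.Random(seed)            # private generator, independent of global state
    return [
        (
            rng.uniform(-100.0, 100.0),  # x
            rng.uniform(0.0, 50.0),      # y (camera height)
            rng.uniform(-100.0, 100.0),  # z
        )
        for _ in range(waypoints)
    ]

# Same seed, same path, on any machine; a different seed gives a fresh path.
assert generate_rail(20030515) == generate_rail(20030515)
assert generate_rail(20030515) != generate_rail(42)
print(generate_rail(20030515)[:2])
```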
  • by aksansai ( 56788 ) <aksansai@gm a i l .com> on Thursday May 15, 2003 @12:11PM (#5964968)
    Video performance from my Radeon 7500 under Linux (using the ATI-optimized drivers for XFree86 4.3) is not nearly as good as with the ATI-provided drivers under Windows 2000. I think ATI gives the Linux driver developers the list of ingredients, but keeps the quantities to itself.

    nVidia could really follow this same philosophy, instead of hearing the massive complaints about their oft-buggy video driver.
  • One of the first courses in all college business curriculums I've seen is "Business Statistics" (BA154 here at GRCC [grcc.edu]).

    The course focuses on making decisions based on statistics. In the second week of class, we learned what a standard deviation was, and we never stopped using it throughout the semester.

    But perhaps ignorance would explain business tactics of the 90's.
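
Applying the standard-deviation point to benchmark reporting, per the earlier remark about three-percent trial-to-trial spreads: report the spread across runs rather than a single number. The frame-rate figures below are made up for illustration.

```python
# Report benchmark results as mean +/- standard deviation, not a single score.
import statistics

trials = [231.4, 224.9, 229.1, 236.2, 227.8]   # fps from repeated runs of the same test (made up)
mean = statistics.mean(trials)
spread = statistics.stdev(trials)              # sample standard deviation
print(f"{mean:.1f} fps +/- {spread:.1f} ({spread / mean:.1%} of the mean)")
# A sub-1% gap between two cards means little when run-to-run spread is this large.
```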
  • by Hellkitty ( 641842 ) on Thursday May 15, 2003 @12:35PM (#5965215) Journal
    It is possible to stay on topic while adding more variables to the argument. Next time I will use more complete sentences to keep everyone focused.

    The point I was making is simply this - whether or not they cheated on the benchmarks, does it really make a difference? For some, sure. But for me and probably a good chunk of people out there, the slight extra edge that NVIDIA may or may not have given themselves in this benchmark isn't going to be enough to make me run out and purchase the new GeForce over the Radeon, unless I wanted to participate in the "I have the fastest graphics card available as of 3:00 this afternoon" pissing contest. The few extra FPS NVIDIA can boast by rigging this benchmark will not help me become a better gamer, nor will it help most people become better gamers. So what's the point of becoming enraged over something like this? Even if you are one of the lucky few who can tell the difference between a great card and a slightly less great card, has this really altered your choice of video cards that much?

  • by Dehumanizer ( 31435 ) on Thursday May 15, 2003 @01:04PM (#5965522) Homepage
    That's like saying you need crime so cops have a job... :(
  • by Oswald ( 235719 ) on Thursday May 15, 2003 @01:53PM (#5965940)
    One of us doesn't understand the article. The way I read it, the "optimization" the card is performing would only work on the benchmark game--the performance increase it yields will never be manifested in any real game, so is useless.

    I gather you read it differently?

  • by MobyDisk ( 75490 ) on Thursday May 15, 2003 @04:20PM (#5967332) Homepage
    IANAL.

    How would a driver downloaded from a different web site cause a liability for nVidia? Since the source is open, it would be easy to determine that it was not NVidia's code that caused the problem. Seems like (5) is an ADVANTAGE for nVidia, not a disadvantage.

    6) Speaking algorithmically, it is probably impossible to get that much improvement from a driver. In case you've never worked directly with 3D hardware before, this type of optimization is TOUGH. Open source is great for some things, but it would be difficult for them to so seriously outpace nVidia's development in this specialized area.
  • by Pulzar ( 81031 ) on Thursday May 15, 2003 @04:48PM (#5967609)
    First, faster video cards are not designed to make you a better gamer, they are designed to make your gaming experience better. If they are not doing that for you, then you're not playing the games that need the improvement, and you don't need the card. Which, I'm sure, is true for a lot of people out there.

    On the other hand, ATI sold over 1 million Radeon 9700s in the first few months of it being out, so there are definitely a lot of people out there who do need and want the best card money can buy.

    So, that gets us to your question of whether nVidia cheating really makes a difference. Obviously, it doesn't make a difference to you, because you don't want to buy any of the high-end cards in the first place. It should be obvious in the same way, though, that it does make a big difference to somebody who will buy a high-end card.

    If the 9800 and FX 5900 have the same price, and speed is what you're after (and it should be, since you're buying these cards), then you want to buy the faster one. The only way to figure out which one is faster is to check the benchmark results (unless you buy both and try them yourself). If one of the companies cheated in a benchmark, they have tricked you into thinking that you're buying a faster card, while you're really buying a slower one.

    Imagine you're picking between two equally expensive cars, and you want to buy the faster of the two. One claims to do 0-60 in 5s, and the other claims to do it in 3s. You'll go ahead and buy the latter one, only to learn later that they were testing the car going downhill while the other was accelerating on level ground! I think enraged would only begin to describe your reaction to that.
