Tile Based Rendering and Accelerated 3D

ChickenHead writes "AnandTech has put together a review of the Hercules 3D Prophet 4500 based on the new Kyro II chip from STMicro. What's unique about this particular chip is that it uses a Tile-based Rendering Architecture which results in a much greater rendering efficiency than conventional 3D rendering techniques. It is so efficient in fact, that the $149 Kyro II card clocked at 175MHz is able to outperform a GeForce2 Ultra with considerably more power and around 3X the cost of the Kyro II card. With games not able to take advantage of the recently announced GeForce3's feature set, the Kyro II may be a cheap solution to tide you over until the programmable GeForce3 GPU becomes a necessity." A very readable and interesting summary, an interesting technology, and a potentially extremely cool video card.
  • increasing efficiency? shmuck! go read the article!!
  • The Windows drivers for the G200 improved dramatically - you can even play Half-Life (Counter-Strike) with a Matrox G200 (system specs: celeron 366@550mhz, 128mb, G200 AGP). Unfortunately they fixed their OpenGL drivers *WAY* too late as competition had passed them by (on the G200 at least). 4 fps sounds horrible - was this a G400?
  • If there is going to be support, it is pretty much always release date + ~6 months. Developers have to get the card, figure out how to write a driver for it (if the company won't release specs), and then write the driver... Unless this company is Linux-friendly, don't expect a driver right away.
  • Well, for a price-conscious environment, you can still pick up some old 2nd generation cards (such as the Voodoo 2's and 3's) for $20-80. They still run Quake (even Q2 with some tweaking) just fine, and for general computing they put out a good resolution at a good refresh rate. I picked up an old PCI V3 2000 for $40 at a computer show (which is more than I needed for a little server), and you can probably still get them at that price (if not lower) in quantity (even though 3dfx is out of business, there is no shortage of them).

    --
  • There is a serious problem with the memory bandwidth of current cards, but embedded memory promises to alleviate this situation.

    I don't think it's a hard wall by any means.
  • As we all know, lack of competition always leaves the consumers at a disadvantage. While this card won't be a hit among the Geforce3 target group, it could seriously cut into nVidia's market, along with the Radeon. And while tile rendering has some strengths and some weaknesses, who is to say who'll run into the biggest problems... I doubt RAM, even DDR SDRAM, will go all that much faster, so if they could create a tile rendering chip that needed only the current bandwidth, it could really be something.. Might Kyro be to Geforce what AMD is to Intel? Time will tell..
  • Blending on-chip is still overdraw, even if it is faster because it's on-chip. The transparent polygons also need to do all the texture lookups and lighting computations to generate the fragments that are blended into the framebuffer, so you're still hitting memory multiple times for the transparent layers. Cards that have less fill-rate bandwidth are going to do worse on scenes with more depth complexity.

    The PowerVR2 chips empirically choke on large transparent textures (House of the Dead 2 on the Naomi arcade hardware, which is PowerVR2-based, is a good example), so you can draw your own conclusions as to whether or not they implemented that optimization.

  • Very nice. I was dreading the cost of the Geforce3, but kind of resigned myself to buying it.

    This sort of thing could really scare nVidia if it takes off; it'd be interesting to see if they come out with a Geforce3 Lite, or something, in order to compete with it.

  • I've been using an old G200 for QuakeIII and Half-Life. 45fps with little tweaking--and on a PII-350!
  • It's not as simple as that. You can have partially overlapped polygons, amongst other things. Totally occluded polys can be culled without overdraw - partially occluded ones need some sort of clipping/culling done in one way or another to render correctly (or you end up with gaps in the objects, etc.). Usually what is applied is a "painter's algorithm", which determines which order in space the polys are in and paints them in that order on the screen. That translates into overdraw. Some engines strive to minimize overdraw (such as Quake III) and others (such as Serious Sam) don't, letting the card deal with the problem. This is why you see such a disparity, with Serious Sam logging such high scores for the Kyro II while elsewhere the Kyro weighs in as a mid-range card - Croteam apparently isn't concerning themselves as much with partially occluded polys, and as such the Kyro's not rendering all the excess, non-visible info to display memory like the GeForce and Radeon do.

    Jury's still out on this design, but it looks promising to say the least. There are several developers trying to sweet-talk STMicroelectronics or Imagination out of register info to make Linux drivers right now because of the potential of the cards.
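
    For reference, a minimal C++ sketch of the painter's algorithm described above (hypothetical Polygon type and rasterize step): back-to-front sorting means partially occluded polygons are simply drawn over, which is exactly the overdraw a deferred, tile-based design gets to skip.

        #include <algorithm>
        #include <vector>

        struct Polygon {
            float depth;   // representative eye-space depth (e.g. of the centroid)
            // ... vertices, texture coordinates, etc.
        };

        // Painter's algorithm: sort back to front, then draw in that order so nearer
        // polygons simply overwrite farther ones. Every overwritten pixel is overdraw,
        // which an immediate-mode card pays for in fill rate and memory bandwidth.
        void paintersDraw(std::vector<Polygon>& polys) {
            std::sort(polys.begin(), polys.end(),
                      [](const Polygon& a, const Polygon& b) { return a.depth > b.depth; });
            for (const Polygon& p : polys) {
                (void)p;   // rasterize(p) would go here; pixels may be written repeatedly
            }
        }
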
  • Actually, Videologic (I believe they were independent at this time. NEC invested in them first, then bought them outright) was going to implement a multi-way architecture using these things, but due to the DreamCast thing, it never really happened. As I remember it, the article said that it would have been VERY easy to do multiple chips for this, and they chastised 3dfx for taking such an inelegant approach when designing their own multichip solution (Voodoo2)
  • Is it just me, or does the name Hercules bring you back to the days of CGA?

    Have they really been making cards under that brand all this time?
  • yeah, a Dual Head G400 ... nasty stuff. Absolutely *horrible* OpenGL implementation, at 1600x1200x32 it was running 4fps. I promptly returned the card to the OEM that was offering it to my employer as a demo and declined. There was just no reason to put up with that... DirectX was *lightning* at the time, I mean, they put some REALLY GOOD work into the DirectX implementation, but the OpenGL stuff was just *vastly* inferior.

    ---
  • Actually, NVIDIA's cards are significantly faster at only a $40 or so price premium. The price/performance should actually be better for a 64MB GTS than a 64MB Radeon, but it depends on the benchmark and resolution.
  • Let's see: 17 million transistors. 4 of them would be around 68 million transistors. Given that a GeForce 3 has 57 million transistors, and there would probably be some overlap between the chips, it seems that this would be quite doable. Of course, at that point you'd probably need DDR memory to feed the 4 chips, but hey, RAM is cheap! (As in 1 GB (4 DIMMs) for $180 at pricewatch!)
  • "...worlds beyond anything in its price range..."

    Not necessarily... The Kyro II retails for $149.99; I just bought an eVGA GeForce2 GTS Pro from a local wholesaler for under $170 and I've seen Radeon DDR cards for as low as $130 locally.

    The Radeon performs slightly worse than the Kyro II, the GF2-Pro slightly better. Thus, I'd say that the Kyro II is right in line with other cards in its price range.

  • 3dfx (not the 3DFX we loved, but 3dfx) never used tile-rendering. They didn't support T&L because they were an opportunistic crappy company that screwed its users, tried to maintain a monopoly on Glide games (and used tricks such that if MS had used them, /. would be collectively frothing at the mouth) and deserved the pissing on that it got from NVIDIA. Ever since the Voodoo2 didn't make the jump to 32bit color (and the TNT did) 3dfx was on the way down for not being a leader in technology.
  • I had the sinking sensation that it's simply doing things the way old software renderers used to do it (especially the good old demos by Future Crew and friends).

    I had the exact same feeling while reading through AnandTech's write-up.
    And seeing such a huge performance difference between Q3 and Serious Sam, I wondered if it didn't come from the fact that Serious Sam's 3D engine "wastes" more bandwidth. Which would again explain the huge difference in the Fill Rate measured with Serious Sam.
    While Quake3's engine would be more efficient, sending fewer hidden polys to the card.

    I remember from the days when I fooled around making a couple of Q2 levels that a well-designed level (mostly id's) was very optimised in a BSP-tree kind of way (I sure hope what I'm saying makes any sense, because I'm really far from a 3D guru).

    I for one would sure LOVE to hear John Carmack's point of view on such a technique, as he is probably the most thorough graphics card analyst I've ever read. And his points are from the other side of the fence, on the consumer side.

    Murphy(c)
  • How did this post get graded so highly? It is so full of mistakes!

    Saying that this is just a "4 year old architecture" simply because PowerVR has been implementing tile based rendering for some number of years would be like saying that the Geforce3 is nothing more than an overclocked TNT!
    The Kyros (i.e. series 3 PowerVR chips) contain many new features, and so can't be considered to be "sped up" versions of their parents.

    Simon
    insert standard employee disclaimer
  • Hmm, let's just look at the benchmarks, shall we? Lines marked with ***'s are the ones that the Kyro II came top in. In all other benchmarks, the winner was the GeForce2 Ultra.

    Quake III Arena Performance
    'Normal' Settings - 640x480x32
    'Normal' Settings - 1024x768x32
    'Normal' Settings - 1600x1200x32

    MDK2 Performance
    Default Settings (T&L enabled) - 640x480x32
    Default Settings (T&L enabled) - 1024x768x32
    Default Settings (T&L enabled) - 1600x1200x32

    UnrealTournament Performance
    Minimum Frame Rate - 640x480x32 ***
    Average Frame Rate - 640x480x32 ***
    Minimum Frame Rate - 1024x768x32
    Average Frame Rate - 1024x768x32
    Minimum Frame Rate - 1600x1200x16
    Average Frame Rate - 1600x1200x16

    Serious Sam Performance - Fill Rates
    Serious Sam Test 2 Single Texture Fillrate
    Serious Sam Test 2 Multitexture Fillrate

    Serious Sam Performance - Game Play
    Serious Sam Test 2 640x480x32
    Serious Sam Test 2 1024x768x32 ***
    Serious Sam Test 2 1600x1200x32 ***

    Mercedes-Benz Truck Racing Performance
    All options enabled - 640x480x32
    All options enabled - 1024x768x32
    All options enabled - 1600x1200x32

    FSAA Image Quality and Performance
    Serious Sam Test 2 640x480x32 (4 Sample FSAA) ***
    Serious Sam Test 2 1024x768x32 (4 Sample FSAA) ***

    You can draw your own conclusions, but I think I'll keep saving for that GeForce.
  • "Unique" presumably means that no-one other than the PowerVR series, which this new card, the Dreamcast's chip and the original PowerVR card all belong to, are the only ones to go down the tile-based route.

    This is because it's a phenomenally quick rendering method when designed for, but

    (a) it takes a big hit to do stuff the way every other 3d card on the market does things (and guess which method is going to get used by a developer writing for a platform where either might be in place), and

    (b) if you are used to doing things the 'normal' way it's a pain in the rear to try and re-jig your code into a tile-based format. You might as well rewrite the engine from the ground up.

    Of course, if (as with the Dreamcast) you're writing explicitly for a tile-based platform then it kicks arse for the money.
  • Hercules' [hercules.com] assets and brand have been bought by French PC devices maker Guillemot [guillemot.com], who recently "reactivated" the brand.
  • Several of the GeForce 3 features require games to be rewritten to take full advantage of it - particularly the programmable parts, I'd assume.

    And if you'd read the article, you'd see that this card does achieve FSAA at a decent resolution with very good performance, and that the quality of the memory architecture is what really makes it compare well, by massively reducing the number of memory accesses.

  • If you'd read the article you'd have seen that they are releasing a lower power version based on the same architecture as well, and suggested a price of around $79 for it. But if you're looking for cheap machines for office applications, you should be looking at something with integrated chipsets instead. It's not like you'd normally put a 3D accelerated graphics card in a machine that is only intended for word processing or similar.
  • You're at least partly wrong. You have to draw the pixel several times, but only to the on-die tile cache, before writing it out once over the external bus.

    So you should still see a significant benefit - not as much as for opaque areas, though, as it can't just throw away the partially obscured pixels as it can with the totally hidden ones.

  • Indeed - times, they are a-changin'. Or something.

    --

  • This is the Voodoo 3 2000, right? I thought the Voodoo was highly dependent on CPU speed. This will be going into an old PPro 200. I will look into it, thanks.

  • by BadBlood ( 134525 ) on Tuesday March 13, 2001 @11:11AM (#366060)
    ..I will wait for the obligatory Mr. Carmack response modded to +5. I'm hoping he's busy writing it now :)

  • On the flip side of this, could tile-based rendering be implemented for the very lowest segment of the video card market: PCI cards for legacy desktops? Wouldn't the tile-based rendering at least partially minimize the performance hit from using PCI as opposed to AGP?

    I'd like to find an inexpensive PCI card to replace the 2MB Mystique in my old PPro200... I guess there wouldn't be much of a profit margin, however.

  • by Anonymous Coward

    This design is very similar (if not the same) to NEC's PowerVR and PowerVR2 chipsets.

    Kyro IS a PowerVR chip. Read before you comment.
  • by taniwha ( 70410 ) on Tuesday March 13, 2001 @11:20AM (#366063) Homepage Journal
    the chip is actually achieving its theoretical fillrate. This has never happened before in the entire history of the graphics chip industry except perhaps in their previous chips.

    Nah - people have designed graphics chips that hit 'perfect' fill rates before - I know I did one (for the Mac 7-8 years back) that hit 1.2Gb/sec into VRAM (then state of the art DRAM) exactly as it was designed to.

    Graphics chips have a relatively long history that is at least in part driven by the commodity memory technologies available to them. These days we're particularly troubled - system costs are going down, and DRAM speeds haven't kept pace with CPU/GPU speed increases (CPUs have maybe gone from 100MHz to 1GHz in the time that memory has gone from 66MHz to 266MHz [transfer rate - latencies have only halved]).

    'Tricks' like ISS (aka tiled frame buffers) work because they basically cache the problem - at the expense of keeping an ordered polygon list (which means that you are more sensitive to scene complexity - too many more polys than pixels and you might be in big trouble) and latency (because you have to finish the poly sort stage before you can start rendering - so you have to render a complete screen at once - while maybe buffering the next scene's polys in parallel) - note I'm oversimplifying the problems here to explain some of the issues - there's lots of scope for smart people to do smart things in a space like this (before all the patents are granted - then without competition innovation will probably cease :-( )

  • Since tile based rendering eliminates overdraw, the effective fill rate of a tile based renderer can actually surpass the effective fill rate.

    Wow! They can make the effective fill rate surpass the effective fill rate?! Maybe they can make my bank account balance surpass my bank account balance!

  • hehe, dude...slow down a bit. Before they get X drivers going, they first need to figure out all the bugs in the Windows drivers. I read a review over at Tom's about a tile based renderer, and they tend to have issues drawing some things due to programming style, etc. (someone said something about the z-buffer, etc)
  • I would LOVE to see this with T&L and high-bandwidth memory. If they can do well with these and fund further development to get a DDR version with T&L, we might have some competition for the GF3 next year.

    Of course, The Carmack has spoken and does not agree with tile based rendering right now; at its core it is kind of a kludge.. hrm..
    I wonder what he thinks of that Anandtech article.

    Oh great and powerful Carmack, we ask that you can grace us with your knowledge and wisdom in this time of confusion and shed light on the validity of tile based rendering. Hear us!
  • by jovlinger ( 55075 ) on Tuesday March 13, 2001 @11:34AM (#366067) Homepage

    This is an instance of the old ATM vs IP or CISC vs RISC debates. It's the old engineering tradeoff: work smart but slow, or work quick and dirty. Tile based rendering is an instance of smart and slow, i.e. they do no more work than they have to, and thus get away with slower clocks and memory. The NVIDIA card is quick and dirty.

    Historically, it is almost always the case that quick and dirty is the cheaper way to go, as it allows economies of scale to come into play. However, it is seeming more and more like the memory bandwidth bottleneck is here to stay, so the smart and slow approach is looking pretty good. Likewise as we run into physical limitations for network bandwidth, IP is going to have a harder and harder time to provide acceptable QoS and multicast solutions and ATM-like technologies will start becoming more prevalent.
  • Does anyone actually still buy complete PC's?

    I mean you get ripped off, with non upgradeable junk unless you build it yourself.

    And unless you build it yourself, when it breaks you usually have to take it somewhere to fix it.

    Build it yourself, buy the cards, mb, cpu, ram, hd. and enjoy.
  • Two things:

    The benchmarks show 350M pixels/s rendered on a 175MHz chip with two pipelines. I don't think anyone in the PC graphics industry has ever accomplished that. (I believe the Voodoo and other really early cards were held back by the time to set up all the polys on the CPU)

    Second, the point stands that this is quite new to the scene and that more bandwidth won't help.

    BTW, thanks for the info.

  • Er, they were making cards up until the time of the TNT2 ultra. Then, for some reason the company went belly up, and Guillemot bought the rights to the name.

    I remember seeing ads for high-end Hercules boards in CAD magazines in the mid-90s, also.

    zsazs
  • masked jackal
  • Try a V3-2000. Should be dirt cheap these days, and the PCI version is as fast (er, slow?) as the AGP version...

    (I get a solid 60fps in UT, on a Duron 750 machine)

    --
  • by Anonymous Coward
    You're forgetting that this card does per-pixel sorting, so all your alpha effects will work correctly, which is non-trivial to achieve on traditional architectures. It's much more fun to work with alpha blending on a PowerVR than on a 3Dfx/nVidia/Matrox/PS2. But some multipass effects are harder to achieve since you do not have full control over the order in which your triangles are rendered; remember multi-pass != alpha blending.
  • This is an impressive card, no matter how you look at it. It makes me wonder, what would occur if this came into play in the laptop/integrated market. The card is obviously cheap, when you look at the Geforce2go which is basically a Geforce 2MX with a lower power consumption, it is in the same price bracket. It also only dissipates 4 watts. For a card this powerful, cheap and cool, why isn't anyone thinking of these markets to push the chip?
  • fuckhead, it's 60 70 80 KILOhertz, how many thousands of times a line is drawn.
    Um, vertical refresh rates are usually from 60-100 Hertz. That means the screen is redrawn around 80 times a second. The guy was saying there's no point in having higher framerates if the monitor can only display around 80 of them a second (which is why you turn vsync off for gaming). Of course there are other issues involving fewer dropped frames in graphically intensive maps, multiplayer gaming, etc., that make these high framerates desirable.
  • You've got to be shitting me. ATI? They are the lamest, nonexistent-driver, non-3d 3d card manufacturer. I just wasted a weekend trying to get Half-Life running under W2k. Their OpenGL implementation is so poor the game performs best in SOFTWARE RENDERING.
  • An ex-colleague of mine left to go work in ST Micro's drivers department about a year ago. I looked him up not long after, when the original Kyro was released; at the time, he was doing driver development for the KyroII. He mentioned then that ST Micro were working on a Kyro-with-T&L part, but didn't mention any ship dates. I did get the impression that it wasn't going to be that far behind the KyroII though. So, we might have a T&L enabled card sooner rather than later, which will be pretty sweet. In the meantime, I think the KyroII will be the perfect stopgap between my Geforce 1 DDR, which is starting to look a little long in the tooth, and a Geforce 3, which will be an almighty card once DirectX 8 has some proper software support.
  • According to the anandtech benchmarks:

    QIII Arena 1024x768 @32bpp
    GeForce2 GTS 64MB: 95.6fps
    Radeon DDR 64MB: 80.6fps
    That's a quite significant 15fps.

    Q3 at 16x12 is unplayable on everything except the Ultra, but the GTS2 still wins.

    MDK 1024x768 @32bpp
    GeForce2 GTS 64MB: 105.9fps
    Radeon DDR 64MB: 86.8fps
    Again, about 18 more fps at this res.

    MDK 1600x1200 @32bpp
    GeForce2 GTS 64MB 43.3fps
    Radeon DDR 64MB: 38.2fps
    Only 5fps faster, but that's around 12% faster.

    Unreal Tournament 1024x768 @32bpp (avg)
    GeForce2 GTS 64MB: 84.5fps
    Radeon DDR 64MB: 87.8fps.
    Here the DDR wins, but only by 3fps.

    Unreal Tournament 1600x1200 @32bpp (min)
    GeForce2 GTS 64MB: 34.3fps
    Radeon DDR 64MB: 18.8fps
    Ouch. What were you saying about high resolutions?
    The GTS is playable, the Radeon is not.

    Unreal Tournament 1600x1200 @32bpp (avg)
    GeForce2 GTS 64MB: 68.9fps
    Radeon DDR 64MB: 56.9fps
    The GTS is 12fps faster here.

    Serious Sam 1024x768 @32bpp
    GeForce2 GTS 64MB: 47.2fps
    Radeon DDR 64MB: 50.1fps
    The Radeon wins, but it's only 3fps faster.

    Serious Sam 1600x1200 @32bpp
    GeForce2 GTS 64MB: 22.5fps
    Radeon DDR 64MB: 24.7fps
    A hair over 2fps faster.

    Mercedes-Benz 1600x1200 @32bpp
    GeForce2 GTS 64MB: 20.9fps
    Radeon DDR 64MB: 24.2fps
    The only decisive victory for the Radeon. Still, at the only playable resolution (640x480) the GTS wins 64.7 to 57.8.

    So overall, the Radeon is a good card, but NVIDIA still has a significant speed advantage, and for only a little bit more, is worth it, in my opinion. (Not to mention the fact that they have better drivers and pro-caliber OpenGL!)
  • by Webmonger ( 24302 ) on Tuesday March 13, 2001 @06:28PM (#366079) Homepage
    Interesting. See, the article kept saying "It's great value for the price--sometimes it even beats a GF Ultra". No one said it was superior to the most expensive consumer 3d hardware. . .
  • by SpanishInquisition ( 127269 ) on Tuesday March 13, 2001 @10:23AM (#366080) Homepage Journal
    Space Invaders runs *SO* fast on this card, like 23000 FPS
    --
  • I'm looking forward to a version of this card with T&L on it. It managed to keep up in most tests...except the ones where T&L was actually used. Anything used in my DC is good in my comp too :)
  • Can you say Dreamcast? I knew that you could.
  • This is PowerVR's third generation. PowerVR's second-generation chip with a similar architecture was in the Dreamcast (and a few videocards in Europe, I think). Their tile-based system was actually very clever and functional on the Dreamcast and the chip in general was reasonable to work with despite its special features. The very efficient PowerVR 2 is still holding its own against many first-generation PS2 games. The Dreamcast in general was a very easy-to-work-with system.

    By the way, did you know you can use the Dreamcast Broadband Adapter to connect to your PC for some do-it-yourself development [julesdcdev.com]? Very cool...

  • Their clock is much, much faster than any clock I've seen today.
  • by evanbd ( 210358 ) on Tuesday March 13, 2001 @10:31AM (#366085)
    If the poster had read the benchmarks, it would be obvious that the case is not so cut and dried. The card wins at some things, loses at others. It loses to the GF2 GTS in some benchmarks, and beats the GF2 Ultra in others. A very cool card, and worlds beyond anything in its price range, however. This should do very good things for the low-price performance market as a whole, by pushing down other prices and by providing a cool new technology.
  • I realize that the Kyro offers a very good price/performance ratio, but why don't they offer a model (for a higher price, obviously) that had higher memory clocks? This way, those who wanted to pay for more performance could do so, and they could continue to sell their current cards at their competitive price.

    Isn't this why the GeForce 2 Ultras even exist? Some people always want the fastest cards, and are willing to pay premiums to be on the bleeding edge... my guess is that the "bleeding edgers" will reap a higher percentage profit on each unit...
  • This actually sounds pretty damn cool, and with a little luck will provide some nice competition for nVidia. Since 3dfx went bye-bye, I have been a little worried that nVidia would be the only real gaming card supplier (well, I guess that depends on if you count ATi)

  • My next upgrade will be the video card. I've been interested in AA ever since I heard it was available on a video card. If you check out the article, this new card has better AA performance than the Geforce 2 Ultra.

    Very interesting.

    Good thing I have to wait a few months anyways.

    Later
    ErikZ
  • why don't they offer a model (for a higher price, obviously) that had higher memory clocks?

    Probably because, with a first product release, they want to enter a space they could dominate (much lower price, much better performance) rather than one where they would have less of a price/performance advantage. The kind of gamers that spend for a top of the line video card will just stick to a brand out of "loyalty", or will skew benchmarks to make their choice look better.

    This kind of thing goes on less in the middle range, IMO.

    Of course, this assumes that this card delivers, and has not skewed its own benchmark too far.

    --
    Evan

  • by evanbd ( 210358 ) on Tuesday March 13, 2001 @10:35AM (#366090)
    the answer is very simple: the chip doesn't need it. Read the article, look at the later benchmarks -- the chip is actually achieving its theoretical fillrate. This has never happened before in the entire history of the graphics chip industry except perhaps in their previous chips. This is amazingly new. If they gave it more memory, guess what -- the numbers would be the same. The whole point is the chip is so good at what it does it doesn't need the bandwidth. Now, if they went to four pipelines and a DDR interface, that would be cool. But, the tiling architecture may not be that fast.
  • by Mumbly_Joe ( 302720 ) <krolco@hotm a i l . com> on Tuesday March 13, 2001 @10:40AM (#366091)
    Tile-based rendering only outperforms other types of rendering on certain types of tests.

    Tile-based rendering's big benefit is that it reduces overdraw to 0; that is, each opaque pixel on the screen is drawn exactly once. Performance for certain types of scenes is spectacular.

    Dreamcast uses this, as well as many of Sega's arcade systems (HOTD2, for instance), which use the same PowerVR2 rendering system.

    Where tile-based rendering falls down, however, is for scenes that contain a large amount of alpha-blended areas. Alpha-blended areas in today's hardware are necessarily drawn multiple times, from back-to-front, to accomplish transparency effects. Having to draw the pixel several times nullifies the zero-overdraw benefit of tile rendering. Since most tile-rendering systems trade fill-rate for zero overdraw, cards with insufficient fill rate for large alpha areas (read: all of them) fall down on large, alpha blended polygons. You can see this in House of the Dead 2 when fighting the Hierophant; if you get enough water splash effects on the screen, the frame rate chokes.

    Tile rendering works extremely well for areas that are opaque, or use only small alpha-blended areas. It's getting better; it's just not perfect yet.

    Mumbly Joe
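
    A minimal sketch of that zero-overdraw idea for opaque geometry (C++, hypothetical Fragment/shade names): within a tile, depth is resolved for every pixel first, and only the winning fragment gets textured and lit, exactly once. Alpha-blended fragments can't be discarded this way, which is where the extra cost comes from.

        #include <cfloat>
        #include <cstdint>
        #include <vector>

        constexpr int TILE = 32;

        struct Fragment { float depth; uint32_t triangleId; };  // one candidate per covering triangle

        static uint32_t shade(uint32_t triangleId, int x, int y) {
            // Stand-in for the real texture lookup + lighting; returns a fake colour.
            return triangleId * 2654435761u + uint32_t(x * 31 + y);
        }

        // For one 32x32 tile: pick the nearest opaque fragment per pixel, then shade it once.
        void resolveOpaqueTile(const std::vector<Fragment> frags[TILE][TILE],
                               uint32_t tileColor[TILE][TILE]) {
            for (int y = 0; y < TILE; ++y)
                for (int x = 0; x < TILE; ++x) {
                    float nearest = FLT_MAX;
                    uint32_t visible = 0;
                    for (const Fragment& f : frags[y][x])
                        if (f.depth < nearest) { nearest = f.depth; visible = f.triangleId; }
                    // Texture lookups and lighting happen only here, once per pixel,
                    // instead of once per overlapping triangle.
                    tileColor[y][x] = shade(visible, x, y);
                }
        }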

  • .. find ethernet support in the latest CVS snapshot of the LinuxDC [linuxdc.org] kernel. I'm writing up docs now (for NFS mounting and initrd), and I'll post to the site as soon as they're in CVS.

    I know this is a shameless plug, but I spent all weekend working on ethernet, and I sent my friends a couple of e-mails via a telnet session (under a BusyBox filled initrd) from my Dreamcast :). But seriously, we need more kernel hackers in there so we can spit out more drivers....

    Back on topic, the LinuxDC framebuffer writes from CPU RAM directly to PVR2 RAM, which is about as slow as you can get. I ran a simple SDL parallax scrolling example, and the results were, shall we say, CRAP :). I've started thinking about how to accelerate the FB using the PVR2's Tile Accelerator, but I'm not that familiar with its internals or how tile-based rendering would work (yet). If anyone there can point to some TA-based resources in general - there are a few good docs linked from julesdcdev, but I was thinking more general TA docs (e.g. not Dreamcast-specific).

    We *need* interested developers, testers, and authors to stop by LinuxDC (we're also in the process of restructuring our site), as we're finally starting to get the ball rolling...

    M. R.

  • Not for the serious gamer perhaps, but this is just another card that is completely overpowered, and therefore overpriced, for development and office purposes.

    Personally, I would like to see an emphasis on increasing any given video adapter's efficiency and decreasing its price before increasing its power.
  • This is NOT true at all actually.

    While it may be that the PowerVR2 did not implement it correctly, there is nothing that prevents alpha blending performance much better than that of immediate mode style rasterizers. Consider it this way:

    A game needs to draw 5 opaque polygons, with 3 alpha polygons on top.

    An immediate mode rasterizer would have to write all five polygons to memory, including all of the associated texture lookups and lighting calculations. Then, for each alpha polygon, it would have to reread bits from the framebuffer and combine it with the shaded textured alpha polygon. This is a lot of memory traffic.

    A tile based renderer, otoh, would not need to do all of this. Obviously it would be able to eliminate all of the overdraw on the opaque polygons, but it would also be able to do the blending in the ON CHIP 24bit tile framebuffer, which is much much much faster than going to off chip memory. This means that instead of having to do read-modify-write off chip memory cycles for each of those alpha blended polygons, it stays on chip.

    Now like I said before, I am not familiar with the PowerVR2 chip, and it may be that they do not implement this obvious optimization... I would assume their newer chip would.

    My big question is "why not a T&L unit?" It seems like a severe handicap to an otherwise stellar chip. Although somewhat addressed in the article, they didn't really justify it well, and the benchmarks prove it would be handy. Maybe the 175MHz clock is what prevents an effective T&L unit from being added...

    -Chris [nondot.org]
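
    A rough sketch of that on-chip blending argument (C++, hypothetical tile-buffer layout): each alpha layer's read-modify-write happens against the on-die tile buffer, and external memory sees only one write per pixel when the finished tile is flushed.

        #include <cstdint>
        #include <vector>

        constexpr int TILE = 32;

        struct Rgba { float r, g, b, a; };

        static Rgba tileBuf[TILE][TILE];     // on-chip tile colour buffer in the real hardware

        // Classic "over" blend: the read-modify-write stays on-die, however many
        // alpha layers pile up on the same pixel.
        static void blendIntoTile(int x, int y, const Rgba& src) {
            Rgba& dst = tileBuf[y][x];
            dst.r = src.r * src.a + dst.r * (1.0f - src.a);
            dst.g = src.g * src.a + dst.g * (1.0f - src.a);
            dst.b = src.b * src.a + dst.b * (1.0f - src.a);
        }

        // Only this step touches external framebuffer memory: one write per pixel,
        // no matter how many opaque or alpha fragments were composited above.
        static void flushTile(std::vector<uint32_t>& framebuffer, int pitch, int tx, int ty) {
            for (int y = 0; y < TILE; ++y)
                for (int x = 0; x < TILE; ++x) {
                    const Rgba& c = tileBuf[y][x];
                    framebuffer[(ty * TILE + y) * pitch + tx * TILE + x] =
                        (uint32_t(c.r * 255) << 16) | (uint32_t(c.g * 255) << 8) | uint32_t(c.b * 255);
                }
        }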

  • It seems to me that this is an entry level product that won't blow the industry leader out of the water, but a product designed to prove that the company is good/worthy of venture capital/a place other companies' top designers should leave to work at. It reminds me a bit of nVidia's riva 128 four or five years back; not the best thing out there (3dfx) but established nVidia as the second best video card maker, garnering support and helping it get as big as it is today.*

    All in all, this is bad news for ATI. They're losing their OEM business to nVidia not only in low cost PC's but in Macs as well. They decided to reinvent themselves with the Radeon's swank environmental bump-mapping and stuff, a high-end 2d card for graphic designers who fired up Quake on the office LAN after hours. This would (they hoped) put them in the #2 spot and help ATI move into the 3d gamer market. But looking at the benchmarks for the Kyro II, the new chip beats the DDR Radeon in several benchmarks, impressive considering the newcomer's lack of T&L rendering. Unless the Kyro has horrible image quality, I would guess ATI is not pleased.

    * I realize that Power VR et al have been around for years making chips for consoles and arcade games. So was nVidia before the riva 128; I'm talking about entry into the PC graphics card market.

  • I understood that the goal of tile based rendering would be that the tiles could be divided between multiple chips so the tiles could be rendered in parallel. Or is this just the future of tile based rendering? Graphics chip designers really have an advantage over CPU designers: they can easily provide enough registers on their GPUs as well as very small instruction sets. Lucky bastards.
  • Does Matrox make *anything* for someone that is more than just a casual gamer?
    Same for ATI. I've got the impression that both of those manufacturers made cards which are more suited for their A/V capabilities than their 3d acceleration capabilities..


    -since when did 'MTV' stand for Real World Television instead of MUSIC television?
  • I've /had/ a voodoo2 and a voodoo3. They will run *much* more than just quake1 or quake2 (no tweaking required..)
    In fact, I used my voodoo3 up until a month ago when I bought a new geforce2 gts. Yeah, I get double (or more) fps.. but it's almost overkill for what I'm using it for at the moment.. Frontline Force :P

    But seriously.. if it's an fps, and it came out sometime in the past 4 years.. I've run it on a voodoo2 or a 3 (2k).

    -since when did 'MTV' stand for Real World Television instead of MUSIC television?
  • The ATI Radeon is a serious gamer's card; it's not quite up to par with the high-end nVidia stuff, but it doesn't come with the high-end nVidia price tag either.
  • ... occulted triangles...

    ... releasing the bandwitch...


    So what are you saying? Tile-based rendering is the work of Satan?
  • Maybe a little hyped; I thought the article seemed a little bit optimistic about the card, myself. However, I also thought about it this way: the KyroII may win some and lose some....but so does the GeForce2 Ultra. From this perspective, it's not so obvious that the GeForce2 Ultra is the ultimate 3D accelerator. It actually gets beaten in more than a couple of these tests! However, I will have to see it to believe it myself. How about image quality? Availability? Most importantly, is it glitchy with Counter-Strike and HalfLife?
  • This article is mirrored on the freenet. [freenetproject.org]

    Check out

    freenet:CHK@qANifG8baVSFWd-ZsW5kvFVjcwcOAwE,ZXRUspPkxMFRzwRsJdrpqg

  • My understanding about the GeForce 3 is that any old game can take advantage of it out of the box - for example, actually being able to use FSAA at a decent resolution, and of course, faster frame rates through the use of massive quantities of transistors and a more efficient memory architecture.
  • Will this render porn more clearly?


    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    ~~ the real world is much simpler ~~
  • Traditionally, it has been more processing-expensive to work out if a polygon is obscured than it has been to draw it, and then get on with the others.

    --

  • "PowerVR's second-generation chip with a similar architecture was in the Dreamcast (and a few videocards in Europe, I think)."

    Apocalypse 5D by Videologic. Nice card for its time. However, there was ZERO support for the 3D part of the card under Linux. The Windows driver hasn't been updated in over two years. What good is the "better" solution if you can't use it, or use it effectively?

    My GeForce at least has drivers under Linux and updated drivers under windows. That's more than NEC and Videologic can ever claim.

    And yes I'm bitter...
  • Interesting. See the article said "It is so efficient in fact, that the $149 Kyro II card clocked at 175MHz is able to outperform a GeForce2 Ultra with considerably more power and around 3X the cost of the Kyro II card."

    I don't call slightly faster results in a distinct minority of the benchmarks vs. much slower results in the rest 'outperforming'.
  • by Anonymous Coward
    Matrox cards are excellent at the oft-ignored 2D - better than the latest nVidia blah, ATI ner and Kyro guff respectively. One design aspect of Matrox is to do all the scene blends just before they're sent to your screen (unlike nVidia, who realise this card structure is more expensive and most people won't notice the colour loss). A friend of mine who sits all day doing CAD (I'm not sure whether it's his actual job) swears by Matrox cards and says that other cards look dull by comparison. It's one of those odd forgettable claims but I've heard the same from several people. Matrox cards are an excellent work machine card for general desktop use.

    Although 3D - nah, they suck.

  • There is such a thing as quad data rate SDRAM; the memory clock is at an offset to the processor clock or something like that. I am not sure where I read it, it might have been here.
  • According to the article, the 32bpp image quality is good, and the 16bpp image quality is better than most. So I would guess that ATI is definitely not pleased. And for that matter, neither should nVidia be. Hercules WAS an exclusively nVidia shop. No longer, and who knows how many converts the Kyro II just won with the glowing review (and unreal benchmarks). I know that I'm certainly no longer going to be considering an MX...
  • That's actually one of the big benefits of the PowerVR architecture. I believe there are some Sega arcade boards that use two or more PowerVR chips.
  • by Cryptnotic ( 154382 ) on Tuesday March 13, 2001 @10:42AM (#366112)
    This design is very similar (if not the same) to NEC's PowerVR and PowerVR2 chipsets.

    Here's how it works:

    • Instead of using regular SDR or DDR RAM for a Z-buffer of the entire screen, it uses a very high-speed on-chip "tile" Z-buffer, usually 32x32 pixels or so.
    • To render a frame, the system breaks the frame up into "tiles" of 32x32.
    • Each "tile" is rendered using its own clipping volume and camera matrix, etc.

    Anyway, because the system uses ZERO memory bandwidth for Z-buffer calculations, the system is far more efficient, even though it is essentially traversing the scene dozens of times for each frame.

    This is why the Sega Dreamcast is often able to have better performance than the Playstation 2.

    Cryptnotic
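
    A very rough sketch of that flow (C++, hypothetical types, with a simplistic bounding-box test standing in for real binning): the per-tile Z-buffer lives in fast on-chip storage, so depth traffic never touches external RAM, and the finished tile is written out exactly once.

        #include <cfloat>
        #include <cstdint>
        #include <vector>

        constexpr int TILE = 32;

        struct Triangle { int minX, minY, maxX, maxY; /* plus vertex/shading data */ };

        struct TileBuffers {          // lives in fast on-chip storage on the real hardware
            float    depth[TILE][TILE];
            uint32_t color[TILE][TILE];
        };

        static bool overlapsTile(const Triangle& t, int tx, int ty) {
            return t.maxX >= tx * TILE && t.minX < (tx + 1) * TILE &&
                   t.maxY >= ty * TILE && t.minY < (ty + 1) * TILE;
        }

        void renderFrame(const std::vector<Triangle>& tris,
                         std::vector<uint32_t>& framebuffer, int width, int height) {
            for (int ty = 0; ty < height / TILE; ++ty)
                for (int tx = 0; tx < width / TILE; ++tx) {
                    TileBuffers tile;
                    for (int y = 0; y < TILE; ++y)
                        for (int x = 0; x < TILE; ++x) { tile.depth[y][x] = FLT_MAX; tile.color[y][x] = 0; }

                    // Depth test and shading for every triangle touching this tile happen
                    // against the on-chip buffers; no external memory traffic for Z.
                    for (const Triangle& t : tris)
                        if (overlapsTile(t, tx, ty)) {
                            // rasterizeIntoTile(t, tile);   // placeholder for the per-pixel work
                        }

                    // One write per pixel to external memory once the tile is finished.
                    for (int y = 0; y < TILE; ++y)
                        for (int x = 0; x < TILE; ++x)
                            framebuffer[(ty * TILE + y) * width + tx * TILE + x] = tile.color[y][x];
                }
        }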

  • by Olivier Galibert ( 774 ) on Tuesday March 13, 2001 @10:50AM (#366113)
    The simple idea behind tile-based rendering is to divide the screen into square patches (8x8 or 16x16 usually) and, for each patch, find which of the triangles intersect the patch, do a quick depth sort to detect complete occlusions, and draw.

    There is a good article on it, as applied to the powervr (which is using the same kind of architecture) at http://www.ping.be/powervr/PVRSGRendMain.htm [www.ping.be]. As others already said, you can see the results on the Dreamcast, or on the arcade version, the Naomi.

    The strengths are obvious:

    • Lower fillrate required because of the per-patch occulted triangles elimination
    • The currently-rendered tile memory can be on-die, L1-grade, releasing the bandwitch for texture reading

    The weaknesses are a little less obvious:

    • Rendering start is delayed because it requires having all the triangles available. Can be somewhat hidden by multi-buffering
    • Alpha-blending slows things down hard, because it increases the required fillrate very fast, and these cards are designed with a lower fillrate in mind
    • There is no Z-buffer anymore (at least at peak speed, it's not copied back to the main memory), and we know that the 3D programmers love to do tricks with the Z-buffer

    As a result, these cards are nice, but mostly represent another set of tradeoffs, not necessarily a revolution.

    OG.

  • Sounds like a neat low-end solution, but I'm always suspicious when the evangelists have to spread FUD like:

    "Also included in the Kyro II is 8-layer multisampling that allows for up to 8 textures to be applied in a single pass. Other cards are forced to re-send triangle data for the scene being rendered when multitexturing, eating up precious memory bandwidth. Since the Kyro II features 8-layer multisampling, the chip can process the textures without having to re-send the triangle information."

    Guys, if the chip is all that, let it stand out on its virtues alone. Your competition has been multitexturing since the Voodoo II.

    And of course:

    "Missing from the Kyro II feature set is a T&L engine. Claiming that the current generation of CPUs are far superior at T&L calculations than any graphics part can be, STMicroelectronics choose to leave T&L off the Kyro II."

    I could sneeze at this point and mutter the appropriate profanity under my breath. However, I'd much rather see the chip succeed or fail because of its feature set, instead of the ability of Imagination/STMicroelectronics at slinging mud at the competition.

    Those benchmarks are really interesting. It would be fantastic to have a successor to 3Dfx, if only to keep Nvidia and ATI on their toes. My chief worry regarding their commercial acceptance would be how much of DirectX 8 these guys support. It's not a fair worry, but I think it's a realistic one. I wish them the best of luck.

  • If I remember correctly GigaPixel's architecture was also Tile based, and I believe they had spent quite some time trying to head off the known issues with Tile architectures (though I honestly don't know how successful they were - the demos I saw were a while ago and looked good but things have changed since then).

    Of course GigaPixel was acquired by 3dfx for approx. 300 million US$ after initially winning the XBox graphics contract and then having it pulled from beneath them. And of course 3dfx was in turn acquired (though for only 150-160 million US$?) by nVidia. So if tile based rendering has a future (and GigaPixel's is good), perhaps we can expect to see it from nVidia too before long.
  • *looks out the window*

    Hmm... if the OpenGL support was a little bit better, I might be able to discern that it was actually Matrox out there, instead of the 4fps which made me miss the billboard as I rode by on my snail.

    Note to moderators: Have you actually checked on the OpenGL driver from Matrox to see its performance? It really _IS_ that bad. Go ahead. Mod me down, its still the truth.



    ---
  • Erm.. the whole point of this is that it doesn't *need* DDR. Adding DDR to it would *not* increase its performance whatsoever.

    That said, adding four of these inline and jumping to DDR would be decidedly sweet. The chips are fairly small, which would facilitate this, but I'm not sure if they are capable of that... since they just work on tiles, I can't see why you couldn't assign each a section of the scene but ... who knows.

    It will be quite a while before hardware T&L comes out on these, I think, considering that this iteration is only just being released.

    ---
  • by composer777 ( 175489 ) on Tuesday March 13, 2001 @10:58AM (#366118)
    If you want to find out what is amazing about this card, read on: This card is based on NEC's PowerVR architecture, and is really nothing more than the PowerVR2 clocked up to 175 MHz. What's funny is, I remember getting excited about this card over 3 years ago!! If you want to do more research on the architecture, dig up some old articles on Tom's Hardware, where he benches it with Quake 1. At the time, the card was supposed to clean up the market, and it was going to debut at 125 MHz core/memory speed. (This was at the time when the Voodoo1 was the standard, and the Voodoo2 had just entered the scene; I remember holding out for this card, and simply settled on a TNT when I found out that NEC decided to drop out of the PC market). Then NEC made a deal with Sega, and put the chip in the Dreamcast. What's even more amazing about the chip is that ST simply had to change the clock to 175 MHz to make it competitive with nVidia's GeForce2 Ultra. What I think will be scary is when they revamp this 4 year old chip design and add T&L. Imagine what a chip like this could do with DDR RAM instead of SDRAM. This current chip only supports SDRAM, which is why they didn't put DDR RAM on the card. I think nVidia has their work cut out for them. Hopefully they will be able to license tile based rendering for their next card. I was really hoping that they would put it in the GeForce 3; it would have made quite a bit greater difference than a crossbar memory architecture.
  • Is anyone actually still buying PC cards?
  • You might have to wait a while... 3dfx never supported T&L because tile based rendering is incredibly inefficient when the number of polygons increases.
  • This design is very similar (if not the same) to NEC's PowerVR and PowerVR2 chipsets.

    That's because the Kyro/Kyro II use the PowerVR3 architecture. NEC used to partner with Imagination to produce those older chips.
    -----
    #o#

  • I read somewhere that DirectX 8 is going to further abstract the z-buffer out of the programmer's hands. The article from which I gleaned this tidbit was exceptionally poorly written and I've not backed this up with real research. Perhaps this was a bit of forward-thinking on NEC's part?
  • As I understand it, the FSAA on the GeForce 2's really sucked. As if they weren't designed with AA in mind at all. They use a method called supersampling, where the image is rendered at a higher resolution and then downscaled. It's a pretty bad method; I have to turn it off on my GF2 GTS because I can't stand the drop in framerate. The Voodoo 5 and the GF3 are a lot better at antialiasing than the GF2's.
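
    For reference, a minimal sketch of the supersampling method described above (C++, hypothetical buffers): render at twice the resolution in each dimension, 4x the pixels and hence the fill-rate hit, then average each 2x2 block down to one output pixel.

        #include <cstdint>
        #include <vector>

        // 2x2 ordered-grid supersampling: the scene is rendered at double resolution
        // (4x the pixels, hence roughly 4x the fill-rate cost), then box-filtered down.
        std::vector<uint32_t> downsample2x(const std::vector<uint32_t>& hi, int outW, int outH) {
            std::vector<uint32_t> out(size_t(outW) * outH);
            const int hiW = outW * 2;
            for (int y = 0; y < outH; ++y)
                for (int x = 0; x < outW; ++x) {
                    uint32_t r = 0, g = 0, b = 0;
                    for (int dy = 0; dy < 2; ++dy)
                        for (int dx = 0; dx < 2; ++dx) {
                            uint32_t p = hi[size_t(y * 2 + dy) * hiW + (x * 2 + dx)];
                            r += (p >> 16) & 0xFF; g += (p >> 8) & 0xFF; b += p & 0xFF;
                        }
                    out[size_t(y) * outW + x] = ((r / 4) << 16) | ((g / 4) << 8) | (b / 4);
                }
            return out;
        }
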
  • Performance hit from PCI?? Huh, go look at the numbers. Until recently - like the past 3-4 months - there wasn't enough data being transferred to exceed what the PCI bus could handle; the AGP slot is pretty much useless until we get bigger textures and stuff.
  • This actually sounds pretty damn cool, and with a little luck will provide some nice competition for nVidia. Since 3dfx went bye-bye, I have been a little worried that nVidia would be the only real gaming card supplier (well, I guess that depends on if you count ATi)

    And why would you not count ATI?
    -----
    #o#

  • Sorry, I completely forgot about it being shared. However, if I can point out from my experience, I've not noticed a difference when running the Voodoo 3000 PCI card or AGP card; I happened to have a 100Mbit NIC and a SoundBlaster Live card on the PCI bus too. But I suppose the difference could be explained because on that system the bus was overclocked a bit.
  • It's always amusing how people will COMMENT on a story without reading it. Maybe if it was open-source, slashdotters would be more interested, I dunno. I don't know why slashdot doesn't link you directly to the article before letting you post a comment or read comments... i.e. you guys could just open a frame or make a local cache of the article perhaps? Regardless, this thing is damn exciting because you get huge BANG for the buck. It's the only video card where you'll get more than you paid for.
  • The article talks about the Windows drivers (complaining a bit about them - I assume they're still in development though). It does mention OpenGL support in the Windows drivers...

    Does anyone know if there will be DRI support for this chipset any time soon? One of these days I'll have to upgrade from my old Voodoo Banshee card...


    ---
    "They have strategic air commands, nuclear submarines, and John Wayne. We have this"
  • It sounds like a nice card and yadda yadda. Are they going to supply X drivers with GLX support? Otherwise the card isn't worth buying.
  • It also sounds very scalable. Use 4, 8, or 16 of these together and you may have a very good high end system.

    Of course, in the mean time, no specs -> no free/open XFree drivers -> no sale for me
