Introduction
It’s now almost a month since NVIDIA finally lifted the curtain on the actual performance of the latest addition to their GPU line, the GeForce3. We covered the technical aspects of this new design in a rather extensive article, but several other projects kept us from releasing the performance update until now.
GeForce3 is NVIDIA’s latest contender for the position of fastest mainstream 3D-chip on the planet, and besides pure power it also comes with a big set of brand-new features. Some of these features may not have any major impact on game play right now, but others can already enhance your 3D-experience in current games. Here’s a short list of the most important new features:
- Vertex Shader
GeForce3 doesn’t just come with the ‘static’ transform and lighting engine known from the previous GeForce256 and GeForce2 chips, it adds a programmable vertex processor, called the ‘Vertex Shader’. Game developers can include amazing new effects in their titles by adding such vertex programs to their games. The list of new possibilities is very long.
- Pixel Shader
To complement the vertex shader, NVIDIA also made the texture combiners programmable, so that texture operations can be programmed as well. This functionality works hand in hand with the vertex shader and adds even more to the portfolio of new effects that GeForce3 is able to display.
- Light Speed Memory Architecture
GeForce3 uses a memory controller concept that is revolutionary for 3D-cards and tackles the permanent memory bandwidth and latency problems of today’s 3D-cards. The crossbar memory controller uses the bandwidth provided by the card’s onboard memory much more efficiently than previous 3D-chip designs. It lets GeForce3 score significantly better frame rates than previous NVIDIA cards that come with the same memory bandwidth and even higher theoretical fill rates. GeForce3 also tries to minimize the rendering of hidden objects (“Z Occlusion Culling”), which adds to the chip’s efficiency.
- Multi Sampling Full Scene Anti Aliasing
GeForce3 was designed with FSAA in mind and uses a new anti-aliasing algorithm called ‘Quincunx’. This special version of multi-sampling anti-aliasing enables GeForce3 to run FSAA at very high resolutions with good frame rates. Here are two sample pictures with quincunx-FSAA enabled and disabled (1.1 MB each).
If you require more details of GeForce3’s new features, please refer to my very detailed article “High-Tech and Vertex Juggling – GeForce3 GPU“.
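To give an idea of what a vertex program actually does, here is a minimal Python sketch of the kind of per-vertex work the shader replaces or extends: transforming a position by a 4x4 matrix and computing a diffuse lighting term. The real thing is written in DirectX 8's vertex shader assembly language; all function names here are illustrative, not NVIDIA's API.

```python
# Conceptual sketch of a vertex program: it runs once per vertex,
# reads the input attributes, and writes a transformed position
# plus a computed color.

def transform(vertex, matrix):
    # 4x4 matrix times homogeneous position (x, y, z, 1)
    x, y, z = vertex
    return tuple(sum(row[i] * v for i, v in enumerate((x, y, z, 1.0)))
                 for row in matrix)

def diffuse(normal, light_dir):
    # Classic per-vertex diffuse term: max(N . L, 0)
    return max(sum(n * l for n, l in zip(normal, light_dir)), 0.0)

identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
pos = transform((1.0, 2.0, 3.0), identity)           # (1.0, 2.0, 3.0, 1.0)
light = diffuse((0.0, 0.0, 1.0), (0.0, 0.0, 1.0))    # 1.0, light hits head-on
```

The point of the programmable unit is that a developer can replace this fixed recipe with arbitrary per-vertex math: skinning, morphing, procedural deformation, custom lighting models and so on.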
There is no doubt that GeForce3 is the most advanced 3D-chip right now, but what we need to know is if all those nifty features are also performing well enough. How does GeForce3 stick up against previous high-end 3D-chips? After all GeForce3 doesn’t come cheap. If you want to equip your system with this high-tech design, you need to pay some hefty $400. Is GeForce3 worth that amount of money? Do all those new features really benefit the user right now?
Driver
On March 22 NVIDIA released driver 11.01, the first to fully support GeForce3. It basically looks like what we are used to from NVIDIA’s drivers, with only a few differences.
In the anti aliasing tab you can find the new setting ‘Quincunx’, which is the best FSAA-setting in terms of performance/quality. The 2x-setting is hardly any faster, but the quality is worse, while the 4x-setting hardly looks any better than ‘quincunx’, but costs a lot more performance.
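NVIDIA has described Quincunx as two-sample multisampling combined with a five-tap reconstruction filter arranged in a quincunx pattern. The weights in this sketch (1/2 for the center sample, 1/8 for each of the four diagonal neighbors) are the commonly reported ones from NVIDIA's public material, not something I have measured:

```python
def quincunx_filter(samples, x, y):
    # Five-tap quincunx pattern: center sample weighted 1/2,
    # the four diagonal neighbors weighted 1/8 each (assumed weights).
    center = samples[y][x]
    diagonals = (samples[y - 1][x - 1] + samples[y - 1][x + 1] +
                 samples[y + 1][x - 1] + samples[y + 1][x + 1])
    return 0.5 * center + 0.125 * diagonals

# A hard black/white edge gets smoothed at the boundary:
grid = [[0.0, 0.0, 1.0, 1.0],
        [0.0, 0.0, 1.0, 1.0],
        [0.0, 0.0, 1.0, 1.0]]
print(quincunx_filter(grid, 1, 1))  # 0.5*0.0 + 0.125*(0+1+0+1) = 0.25
```

Because only two samples per pixel are rendered and the rest of the blurring comes from the filter, Quincunx costs roughly 2x-FSAA performance while approaching 4x-FSAA quality, which matches the driver behavior described above.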
The Direct3D tab offers the same functionality that we already know from other NVIDIA-drivers.
Driver, Continued
The same is valid for the OpenGL-tab. For benchmarking you should always turn off vertical sync, which is on by default.
The well-known cool-bits registry tweak enables this tab as well, which is mainly interesting to overclockers who are not satisfied with GeForce3’s normal performance and who don’t mind risking the stability of the card.
Test Setup
Platform | |
Motherboard | MSI K7 Master-S, AMD760 Chipset |
Processor | Athlon 1333, overclocked to 1466 MHz |
Memory | 256 MB Micron PC2100 CL2 DDR-SDRAM |
Hard Drive | IBM DTLA307030 ATA100 IDE FAT32/NTFS |
Operating System | Windows 98 SE (game benchmarks) / Windows 2000 SP1 (SPECviewperf 6.1.2) |
Graphics Cards | |
NVIDIA GeForce3 | 64 MB, Driver 11.01, Chip Clock 200 MHz, Memory Clock 460 MHz (DDR) |
NVIDIA GeForce2 Ultra | 64 MB, Driver 11.01, Chip clock 250 MHz, Memory clock 460 MHz (DDR) |
NVIDIA GeForce2 Pro | 64 MB, Driver 11.01, Chip Clock 200 MHz, Memory Clock 400 MHz (DDR) |
NVIDIA GeForce2 GTS | 32 MB, Driver 11.01, Chip Clock 200 MHz, Memory Clock 366 MHz (DDR) |
ATi Radeon DDR | 64 MB, Driver, Chip Clock 183 MHz, Memory Clock 183 MHz |
Theoretical Numbers
It’s always good to have a look at the theoretical numbers provided by the chipmakers, just to get an idea of how far theory can be from reality and how pointless those number games really are.
Memory Bandwidth | Pixel Fill Rate | Texel Fill Rate | |
NVIDIA GeForce3 | 7360 MB/s | 800 MPixel/s | 1600 MTexel/s |
NVIDIA GeForce2 Ultra | 7360 MB/s | 1000 MPixel/s | 2000 MTexel/s |
NVIDIA GeForce2 Pro | 6400 MB/s | 800 MPixel/s | 1600 MTexel/s |
NVIDIA GeForce2 GTS | 5856 MB/s | 800 MPixel/s | 1600 MTexel/s |
ATi Radeon | 5856 MB/s | 366 MPixel/s | 1098 MTexel/s |
If you look at those numbers, GeForce3 should be inferior to GeForce2 Ultra. Let’s see if this is indeed the case.
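The table's numbers follow directly from the clock speeds and pipeline layouts of the chips. A quick sketch of the arithmetic (the 128-bit memory bus and the pipeline configurations are assumptions based on the chips' published specs, not anything measured here):

```python
def mem_bandwidth(effective_clock_mhz, bus_bits=128):
    # Effective (DDR) memory clock in MHz times bus width in bytes -> MB/s
    return effective_clock_mhz * bus_bits // 8

def fill_rates(core_mhz, pipelines, textures_per_pipe):
    pixel = core_mhz * pipelines       # MPixel/s
    texel = pixel * textures_per_pipe  # MTexel/s
    return pixel, texel

# GeForce3: 200 MHz core, 4 pipelines x 2 texture units, 460 MHz DDR memory
print(mem_bandwidth(460))       # 7360 MB/s
print(fill_rates(200, 4, 2))    # (800, 1600)
# GeForce2 Ultra: 250 MHz core, same pipeline layout, same memory clock
print(fill_rates(250, 4, 2))    # (1000, 2000)
# ATi Radeon: 183 MHz core, 2 pipelines x 3 texture units, 366 MHz effective DDR
print(mem_bandwidth(366))       # 5856 MB/s
print(fill_rates(183, 2, 3))    # (366, 1098)
```

So on paper GeForce3 and GeForce2 Ultra share identical memory bandwidth while the Ultra has a 25% fill rate advantage, which is exactly what the benchmarks below put to the test.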
Quake 3 Arena
Id’s Quake 3 Arena may be a cult game, but dating back to 1999 it has become a bit old these days. Unless the highest screen resolutions and true color are required, every upper-class 3D-card today can provide respectable frame rates and good quality in this game. Still, Quake 3 Arena remains one of the most important tests a 3D-card has to pass.
I don’t know if anyone interested in a high-end 3D-card still cares about a screen resolution of only 640x480x16, but I provided benchmark results anyway. At this low resolution the local memory bandwidth doesn’t constrain the fill rate of today’s 3D-cards, which is why GeForce3 is not really able to show its muscles in comparison to GeForce2 cards. However, you can also see how small the performance impact of the quincunx-FSAA is on GeForce3’s frame rate scores.
The polygon- and feature-rich NV15-level demo shows all cards scoring pretty much on par. 640x480x16 is no challenge to GeForce3, GeForce2 or Radeon.
Quake 3 Arena, Continued
At 1024x768x32 the picture changes. GeForce3 can easily outscore the rest, including the previous champion GeForce2 Ultra. At quincunx-FSAA, GeForce3 is still able to provide sufficient frame rates that aren’t even that much worse than the scores of Radeon DDR w/64 MB.
This benchmark is more of a processor test than a graphics card test, which is why the field is rather close together.
Quake 3 Arena, Continued
At the top resolution of 1600x1200x32 GeForce3 is once more able to leave everything else quite far behind. Only the quincunx-FSAA score of GeForce3 is too low to allow realistic playability.
At 1600x1200x32 the bottleneck moves from the CPU to the graphics card and here rather to its memory bandwidth than its fill rate. That’s why GeForce3 can get the lead here, while GeForce3 with quincunx-FSAA is still too slow to run at this high resolution, though not much slower than a normal GeForce2 GTS 32 MB.
Evolva
Evolva is a DirectX7 game that uses per-pixel lighting and dot3 bump mapping. The benchmark supplies you with the average as well as the minimal frame rate.
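Dot3 bump mapping computes the diffuse lighting term per pixel, taking the normal from a texture (the normal map) instead of interpolating it from the vertices. A rough Python sketch of the operation the texture combiners perform (the encoding convention and function names are illustrative):

```python
def decode_normal(rgb):
    # Normal maps store each vector component in [0, 255];
    # map back to the [-1, 1] range.
    return tuple(c / 127.5 - 1.0 for c in rgb)

def dot3_light(normal_rgb, light_dir):
    # Dot3 bump mapping: per-pixel diffuse = max(N . L, 0),
    # with N taken from the normal map instead of the vertex normal.
    n = decode_normal(normal_rgb)
    return max(sum(a * b for a, b in zip(n, light_dir)), 0.0)

# A texel encoding "straight up" (128, 128, 255), lit head-on:
print(dot3_light((128, 128, 255), (0.0, 0.0, 1.0)))
```

Since this runs once per pixel rather than once per vertex, it leans heavily on fill rate and memory bandwidth, which is why the card-to-card differences in Evolva grow at higher resolutions.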
It’s interesting to see that all the GeForce2 cards and the GeForce3 score the same average frame rates, while the more important minimal frame rates differ quite a bit. At this low resolution of 640x480x16, GeForce2 Ultra can take advantage of its higher theoretical fill rate compared to GeForce3. At higher resolutions, where memory bandwidth is the limiting factor, GeForce3 is in the lead due to its more efficient memory interface. You can also see that GeForce3 with quincunx-FSAA enabled scores exactly half the frame rate of GeForce3 without FSAA.
At 1024x768 with bump mapping enabled, GeForce3 can once more outclass the competition. However, quincunx-FSAA does not provide frame rates good enough to ensure proper playability.
Looking at the average frame rate scores shows GeForce3 as the only card that’s reasonably running Evolva at 1600x1200x32 with bump mapping enabled. However, the minimal score of only 16 fps makes it questionable if even GeForce3 is good enough for this challenge. You can obviously forget the quincunx-FSAA at this resolution.
MDK2
MDK2 is, like Quake 3 Arena, an OpenGL game, though one that doesn’t benefit from the Pentium 4 as much as Quake 3 Arena does.
The scores are similar to the Q3A-scores. Even with quincunx-FSAA enabled, GeForce3 is still scoring extremely respectable.
We already know the picture. At 1024x768x32 GeForce3 is able to leave the rest behind, while it can even compete with Radeon DDR 64 MB if quincunx-FSAA is enabled.
There’s no doubt that GeForce3 is a 3D-chip that finally makes 1600x1200x32 reality, although not quite with quincunx-FSAA enabled.
AquaNox AquaMark
AquaNox is an amazingly good-looking DX8 underwater 3D-action game from the German game developer Massive Development. The ‘Krass 3D Engine’ (what a kewl name! ‘Voll krass’ is German slang along the lines of ‘totally awesome’) uses a lot of GeForce3’s new features and is therefore a perfect benchmark to show GeForce3’s advantages over previous 3D-chip designs. Here’s a short excerpt from the AquaNox website:
“Massive Development has been working closely with NVIDIA to expose the awesome power of ‘GeForce3’ in their forthcoming title ‘AquaNox,'” said Sanford Russell, Senior Director of Partner Management at NVIDIA.
“Geforce3’s nfiniteFX(tm) Engine has enabled Massive Development to bring real time Hollywood quality to their game. Only by using NVIDIA’s hardware vertex shaders and pixel shaders, are they able to achieve their unique lighting solution at great frame rates.
‘AquaNox’ and its technology, the krass(tm)engine impressed us!
NVIDIA look forward to working with Massive Development again in the future to take games and gamers to the next level.”
Here you can download some beautiful screenshots of AquaNox (7.4 MB).
AquaNox AquaMark, Continued
We ran the benchmark version “AquaMark” with default settings.
You can clearly see how GeForce3 is leaving all the rest behind, even at this low-resolution and even with quincunx-FSAA enabled.
At 1024x768x32 GeForce3 is still leading, but even the new high-tech 3D-chip is having problems to provide respectable frame rates. Forget playing AquaNox at default settings with any of the non-GF3 cards.
While GeForce3 is still leading the pack at 1600x1200x32, none of the tested 3D-cards is able to run the game at playable frame rates, including GeForce3.
Dronez
Dronez is another great looking 3D action game that uses OpenGL and GeForce3’s new features. However, the results look a lot different to the AquaNox-scores.
Have a look at those Dronez screen shots(28 MB).
There is a wealth of different settings in Dronez:
I used the four different settings above for GeForce3, always keeping bump mapping, texture cube map, register combiners and texture compression enabled. Unfortunately, bump mapping could not be enabled on Radeon, which obviously improved the frame rates that Radeon scored.
The results of the different settings with GeForce3 are a bit confusing:
You can see that at 640x480x16, the Dronez frame rates actually benefit from the enabled vertex shader, while at higher resolutions the frame rates are higher if the vertex shader is disabled. The texture shader improves frame rates when enabled, but not at high resolutions if the vertex shader is enabled as well. I hope NVIDIA can give me an explanation for this strange behavior.
Dronez, Continued
To be fair, I compared GeForce3 to the other cards with texture shader enabled and vertex shader disabled. Please keep in mind that the scores of ATi’s Radeon DDR 64 MB are overly high, because Radeon did not support bump mapping in Dronez.
At this low resolution, all cards are actually scoring the same results. Only Radeon benefits from the missing bump mapping work.
At 1024x768x32, the GeForce2 GTS and GeForce2 Pro cards are falling behind, but GeForce3 has only a very tiny lead ahead of GeForce2 Ultra.
At the super-high resolution GeForce3 is still providing playable frame rates, while GeForce2 Ultra falls behind a bit further. However, GeForce3’s lead over the high-end GeForce2 Ultra is not that impressive.
Vulpine GLMark
The German game developer Vulpine is providing a very professional and good-looking OpenGL benchmark, based on their game engine, that supports the new features of GeForce3. The name of this benchmark is ‘GLMark’.
The benchmark doesn’t look half as good on a non-GeForce3 card. This is a screen shot of the lake scene on a GeForce2:
You can easily spot the difference.
Vulpine GLMark, Continued
At 640x480x16 GeForce2 Ultra is a tiny bit ahead of GeForce3. ATi’s Radeon doesn’t look good in this benchmark at all.
At 1024x768x32 GeForce3 is able to separate from the rest of the field. Even the quincunx-FSAA enabled GeForce3 is able to score respectable frame rates.
The super-high resolution of 1600x1200x32 shows GeForce3 as the clear leader. Even with enabled quincunx-FSAA GeForce3 is able to provide frame rates as good as GeForce2 Pro. GeForce2 GTS is clearly suffering from its only 32 MB of onboard memory.
3DMark2001
I was never much of a fan of MadOnion, but I have to admit that their new 3DMark2001 is an impressive advance over the previous 3DMark2000. The four gaming scenes in the first part of the benchmark are stunning. Especially the fourth demo (picture above), which only runs on GeForce3 right now, looks simply drop-dead gorgeous and proves how close we have come to virtual reality.
We ran the benchmark with its default settings (1024x768x32) for each card.
GeForce3 is clearly winning this benchmark, but it’s even more remarkable that GeForce3 with quincunx-FSAA enabled is scoring the same numbers as Radeon and GeForce2 GTS.
3DMark2001, Continued
3DMark2001 allows a closer look at the 3D-cards with its sub-results. I focused on the fill rates to show GeForce3’s real advantages:
We have heard a lot about the theoretical fill rates of 3D-chips, but what is always forgotten is that memory bandwidth never allows those high theoretical numbers, especially not at high resolutions and color depths. The above results show the REAL pixel fill rate at 1024x768x32. You can see how GeForce3 outscores all the other cards: even though GeForce2 Ultra has a higher theoretical fill rate, it falls clearly behind GeForce3 thanks to GeForce3’s new crossbar memory architecture and Z occlusion culling. Even GeForce3 with quincunx-FSAA enabled is able to compete against the other cards, simply because GeForce3 utilizes its memory bandwidth (whose peak is identical to GeForce2 Ultra’s) much more efficiently.
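A back-of-the-envelope model shows why the theoretical fill rate is unreachable. If every rendered 32-bit pixel has to move roughly a color write plus a Z read and a Z write over the memory bus, bandwidth caps the achievable fill rate well below the rasterizer's peak. The 12-byte-per-pixel cost is my simplifying assumption; it ignores texture fetches and overdraw savings:

```python
def bandwidth_limited_fill(bandwidth_mb_s, bytes_per_pixel):
    # MPixel/s achievable if every pixel must move bytes_per_pixel
    # over the memory bus
    return bandwidth_mb_s / bytes_per_pixel

# Assumed cost at 32 bit: 4 B color write + 4 B Z read + 4 B Z write
cost = 4 + 4 + 4
print(bandwidth_limited_fill(7360, cost))   # ~613 MPixel/s
```

By this crude estimate, GeForce2 Ultra's 1000 MPixel/s peak can never be sustained at 32-bit color on a 7360 MB/s bus; GeForce3's crossbar controller and Z occlusion culling simply get the measured number closer to that bandwidth ceiling.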
The texel fill rate numbers are similar to the pixel fill rate scores, but you can see how GeForce3 benefits from the architecture of its texture shader (four textures per pass, two textures per cycle) when quincunx-FSAA is enabled, while Radeon benefits from its three texture units per pipeline.
SPECviewperf Scores Under Windows2000
GeForce3 is not exactly meant for professional OpenGL-stuff. NVIDIA’s current product for this area is Quadro2, based on the GeForce2-architecture, and there will certainly soon be a Quadro3 based on GeForce3. We still thought it would be interesting to compare GeForce3 to the other cards under SPECviewperf 6.1.2.
GeForce3 is far from a clear leader in this benchmark. In fact, GeForce2 Ultra is able to beat GeForce3 in 4 of 6 occasions. We will have to see what NVIDIA will do to make Quadro3 a worthy successor of Quadro2.
Conclusion
You have seen the scores and looked at the screen shots. GeForce3 is certainly an impressive product that provides excellent frame rates at high resolutions and partly even at full scene anti aliasing. However, you’ve got to pay some hefty $400 for it, which will leave it out of reach for the majority of PC-users.
Because this review is rather late, I had the opportunity to read several other GeForce3 reviews at other websites. Most of my colleagues don’t consider GeForce3 worth its money right now, because there aren’t enough games available that support its fancy new features. I can only partly agree. Although the chip is still brand new and not quite available to the public yet, I already found four different game engines that support its features. It seems quite obvious that game developers are itching to use GeForce3’s new features in their upcoming games. It is certainly correct that 3D-games will remain playable without GeForce3’s new features, but I am sure that you will be able to benefit from GeForce3’s stunning graphics in games very soon.
It is also pointless to remind you of the next NVIDIA chip that will succeed GeForce3 in six months. NVIDIA’s product cycle is half-yearly, so there will always be a better chip around the corner.
I personally recommend GeForce3 to those of you who can really appreciate the new effects it provides. There is no point in buying GeForce3 just for its higher frame rates in today’s games. Those of you who don’t want to spend 400 bucks on a graphics card have no reason to feel bad though. Your current 3D-card will most certainly be able to run the 3D-games of the next 6-12 months just fine, especially if it is based on a GeForce, GeForce2 or Radeon architecture and thus has T&L. Without T&L you might be able to play today’s games, but I doubt that any of the new game engines will be kind to 3D-cards without T&L anymore. Keep that in mind if you are considering a Kyro2 card.
GeForce3 is anything but a must-buy, but those of you who can afford it won’t be disappointed.