Following on from the success of the Riva TNT chipset, nVidia has released its latest foray into the highly competitive world of 3D graphics: the GeForce 256, previously known by the code name NV10. While 3dfx, ATI and S3 are all working on next-generation chipsets, none of them is expected to be released until at least early 2000, giving nVidia a jump of several months on the competition. The GeForce incorporates several new features that make it a significant advance over previous chipsets rather than just an evolutionary step forward. The most noticeable thing about the GeForce is that nVidia is not referring to it as a graphics chipset but as a GPU, or Graphics Processing Unit, likening it to a specialised CPU. The justification for this title is that the GeForce is built from 22 million transistors on a 0.22 micron process, which is as many transistors as the latest Athlon and Coppermine CPUs. It also features dedicated 'transform' and 'lighting' processors on the chip, meaning the chip takes over a significant portion of the 3D graphics pipeline from the CPU. Previously, graphics cards have handled only triangle setup (where the co-ordinate information from the CPU is converted into the polygons on screen) and rendering (where textures and special effects are applied to the scene).
With the computationally intensive transform and lighting processes now performed by the graphics chip, the CPU is freed up for other tasks such as better game physics or artificial intelligence (see 'Transform and Lighting'). This also makes the chip less CPU-dependent: if the software utilises the transform and lighting engines, it should run at a similar speed on a Celeron as it does on an Athlon. Unfortunately, while the GeForce supports the DirectX 7 and OpenGL transform and lighting engines, software needs to be written specifically for these engines, or for the chip itself, in order to utilise the hardware acceleration. While several upcoming titles, such as Quake III: Arena and Experience from WXP, will get some performance increase out of the chip, current games will see little or no benefit from the integrated transform and lighting engines.
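To make the two stages concrete, here is a minimal sketch of the kind of per-vertex arithmetic that 'transform' and 'lighting' refer to. It is an illustration only, not nVidia's implementation: the function names, the translation matrix and the simple diffuse lighting model are all assumptions chosen for the example.

```python
def transform(matrix, vertex):
    """Transform step: multiply a 4x4 row-major matrix by an (x, y, z) vertex.

    The vertex is treated as (x, y, z, 1). The result is divided by w,
    which stays at 1 for the simple translation used below.
    """
    x, y, z = vertex
    v = (x, y, z, 1.0)
    out = [sum(row[i] * v[i] for i in range(4)) for row in matrix]
    w = out[3]
    return (out[0] / w, out[1] / w, out[2] / w)


def diffuse(normal, light_dir):
    """Lighting step: simple diffuse term, the dot product of the unit
    surface normal and unit light direction, clamped to the range 0..1."""
    dot = sum(n * l for n, l in zip(normal, light_dir))
    return max(0.0, min(1.0, dot))


# Hypothetical matrix that moves every vertex 2 units along the x axis.
TRANSLATE_X2 = [
    [1.0, 0.0, 0.0, 2.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

moved = transform(TRANSLATE_X2, (1.0, 1.0, 1.0))
facing_light = diffuse((0.0, 0.0, 1.0), (0.0, 0.0, 1.0))
facing_away = diffuse((0.0, 0.0, 1.0), (0.0, 0.0, -1.0))
```

A scene with tens of thousands of vertices repeats this work for every vertex in every frame, which is why moving it off the CPU matters.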
Another interesting feature of the GeForce is the number '256' after its name. Contrary to first appearances, it does not mean that the memory bus has been widened from 128-bit to 256-bit; it refers to the fact that the GeForce has four 64-bit rendering pipelines, making a total of 256 bits. This means that the GeForce can potentially output four pixels per clock cycle, compared to the two pixels delivered by the 3dfx Voodoo3 or by the TwiNTexel engine on the TNT. This increases the maximum fill rate from the 300Mpixels/sec of the TNT2 Ultra to 480Mpixels/sec for the GeForce. Maximum polygon throughput has also been increased to 15 million polygons/sec, compared to the 9 million polygons/sec of the TNT2 Ultra. Other technological improvements include support for AGP 4X and a new feature called Fast Writes, which lets the CPU write data directly to the graphics card through the AGP bus without having to go through main memory first. Currently no motherboard chipsets support AGP 4X, although support will arrive with the Intel 820 chipset and future AMD chipsets. Other new features include Cube Environment Mapping, for vastly improved environmental reflection effects; vertex blending, which joins polygons together in flexible objects like 3D polygon characters; and particle systems, which can render fancy water and explosion effects.
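The fill rate figures follow directly from pixels per clock multiplied by core clock speed, as this small sketch shows. The 120MHz GeForce core clock is the figure quoted in this article; the 150MHz TNT2 Ultra core clock is not stated here and is an assumption, chosen because it is the clock implied by 300Mpixels/sec at two pixels per clock. The function name is hypothetical.

```python
def fill_rate_mpixels(pixels_per_clock, core_mhz):
    """Theoretical peak fill rate in Mpixels/sec.

    One MHz is a million clock cycles per second, so pixels per clock
    times MHz gives millions of pixels per second.
    """
    return pixels_per_clock * core_mhz


# GeForce 256: four pixels per clock at the 120MHz core clock.
geforce = fill_rate_mpixels(4, 120)

# TNT2 Ultra: two pixels per clock; 150MHz is an assumed core clock.
tnt2_ultra = fill_rate_mpixels(2, 150)

# Ratio of the two peak figures.
speedup = geforce / tnt2_ultra
```

These are theoretical peaks; real games rarely sustain them, since memory bandwidth and drivers also limit how quickly pixels can be drawn.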
There will ultimately be two versions of the GeForce, differing in the type of RAM that they use. Current cards sport conventional 5.5ns or 5ns SDRAM, while future cards will feature the new DDR (Double Data Rate) SDRAM, which sends information on both the up and the down stroke of each clock cycle, resulting in double the effective bandwidth. The RAM frequency is only 166MHz, compared with the maximum of 183MHz on the TNT2 Ultra, which means that until the DDR SDRAM versions are available, the GeForce will actually have slightly lower memory bandwidth than the TNT2 Ultra. Both of the cards on test here are 32MB SDRAM versions of the GeForce, although the Asus AGP-V6600 uses 5.5ns SDRAM while the Leadtek WinFast uses the faster 5ns memory. This means that the Leadtek RAM can run at a theoretical maximum of 200MHz while the 5.5ns RAM on the Asus is limited to a maximum of 183MHz, although both are capable of higher speeds than the set frequency of 166MHz. Incidentally, the clock speed of the GeForce core is only 120MHz, although with the architectural improvements and new features, the chip can still deliver significant levels of performance compared to current higher-frequency chips.
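The relationship between these memory figures can be sketched with a little arithmetic. The function names are hypothetical, and the 128-bit bus width is taken from the article's remark that the memory bus has not been widened beyond 128-bit. Note that a 5.5ns cycle time works out to roughly 182MHz, which is usually rounded up to the 183MHz rating.

```python
def max_clock_mhz(cycle_time_ns):
    """Theoretical maximum clock for RAM with the given cycle time.

    One cycle of N nanoseconds means 1000 / N million cycles per second.
    """
    return 1000.0 / cycle_time_ns


def bandwidth_gb_per_sec(bus_bits, clock_mhz, transfers_per_clock=1):
    """Peak memory bandwidth in GB/sec (1GB taken as 10**9 bytes).

    DDR memory moves data twice per clock, so pass transfers_per_clock=2.
    """
    bytes_per_transfer = bus_bits / 8
    return bytes_per_transfer * clock_mhz * 1e6 * transfers_per_clock / 1e9


leadtek_max = max_clock_mhz(5.0)   # 5ns RAM
asus_max = max_clock_mhz(5.5)      # 5.5ns RAM, roughly 182MHz

geforce_sdr = bandwidth_gb_per_sec(128, 166)       # about 2.66GB/sec
tnt2_ultra = bandwidth_gb_per_sec(128, 183)        # about 2.93GB/sec
geforce_ddr = bandwidth_gb_per_sec(128, 166, 2)    # about 5.31GB/sec
```

On these assumptions the SDRAM GeForce trails the TNT2 Ultra by around 9 per cent in peak memory bandwidth, while a DDR version at the same clock would have nearly double the TNT2 Ultra's figure.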
Besides the RAM speed and the inclusion of a TV output on the Leadtek, the two cards on test are very similar in terms of features and performance. Both cards use drivers based on the latest nVidia reference drivers, although the Leadtek drivers do include some nifty features such as an overclocking utility. Since the Leadtek card uses 5ns RAM capable of 200MHz, it is possible to push the frequency of the core and the RAM up in order to squeeze a little more performance out of the card. I was able to get it to run stably in 3DMark overclocked to a 125MHz core and 200MHz RAM, with a corresponding 5 per cent increase in performance. Other than these features, the only difference is in the bundled software. The Asus card comes with two games, Turok and XG2 from Acclaim, while the Leadtek comes with its usual pack of graphics and Web development tools and DVDMagic.
In order to test the GeForce, I ran the two cards through the two most gruelling 3D benchmarks available for Direct3D and OpenGL, as well as one from nVidia designed specifically to utilise the transform and lighting engines on the chip. For Direct3D, I used 3DMark99 Max from Futuremark (www.3dmark.com). While it doesn't yet support the new DirectX 7 features like transform and lighting, it does give a good indication of the performance of the GeForce in current DirectX 6.1 applications and games. I tested the two GeForce cards at 1,024x768 in 16-bit colour and ran them against a top-of-the-line 3dfx Voodoo3 3500 and a Diamond Viper V770 Ultra, based on the Riva TNT2 Ultra chipset, on a Pentium III/500 and on an Athlon/650, both with 128MB of RAM.
Interestingly, the GeForce cards performed considerably slower in Direct3D than both the Voodoo3 and the TNT2 Ultra on the Pentium III and the Athlon. The margins on both systems were the same, with the GeForce cards getting only around 80 per cent of the score of the TNT2 and Voodoo3. The GeForce cards were tested using both the latest manufacturer's drivers and the latest nVidia reference drivers, version 3.53, and the scores were not significantly different. This surprising result can be accounted for by the relatively immature level of the drivers, the slightly lower memory bandwidth and the fact that 3DMark doesn't utilise any of the new features of DirectX 7 that the GeForce supports. This indicates that until the drivers are a little more mature, the GeForce will most likely perform slightly slower than Voodoo3 and TNT2 cards in current games using DirectX 6.1. When software is released that utilises the transform and lighting engines in DirectX 7, the results should be significantly improved.
For OpenGL, I used Quake3 Arena Test 1.08. While it is not the final code, it is one of the only games to utilise some of the OpenGL transform and lighting features, and it also has a very high polygon count compared to many other contemporary games. This test was run at the Normal and High Quality settings and at a custom setting of High Quality at 1,024x768, and I used the built-in benchmark utilising the q3demo1 map. In this test the GeForce performed noticeably b
This article appeared in the January, 2000 issue of PC Authority.