Nvidia’s latest graphics cards and chips throw down the gauntlet to Intel and AMD. Go behind the scenes with this guide to the big technologies including PhysX, CUDA, and games like Far Cry 2 and S.T.A.L.K.E.R. Clear Sky.
To hear Nvidia tell it, integrated graphics just aren’t going to cut it, and a discrete GPU is still vital. The company points to the 87% of top PC games with a recommended spec above the Intel integrated graphics specification to support its claim.
More than ever, a CPU and GPU work in concert, so an optimised configuration of a 256MB GeForce card and a dual-core processor will outperform a quad-core processor with a 128MB GeForce card. In other words, the GPU doesn’t have to limit itself to gaming, and that’s where a whole raft of new initiatives from Nvidia steps in to do a polished song and dance number.
A simple example of how the GPU can go beyond gaming is a little app called PicLens, by Cooliris, that displays Google, Flickr, YouTube and DeviantArt image searches as an interactive 3D wall. You can visually skim the wall, pause, play a video thumbnail, or flick back and forth within it, Cover Flow-style. A GPU adds motion blur and antialiasing to PicLens, as well as much more power.
GeForce GTX200 and beyond
The GeForce GTX200 series, launched in mid-June, incorporates the second-generation unified architecture from Nvidia, but the chips are also parallel processors with 1.4 billion transistors, providing just under a teraflop of power from 240 processor cores. The first two cards to be released, the GeForce GTX260 and GTX280, won’t be cheap, but from what we’ve seen, they’re immensely powerful.
|Far Cry 2 uses a new engine called Dunia, designed to take advantage of the new GTX200 cards.|
Tony Tamasi, Nvidia’s Vice President of Technical Marketing, says it’s the largest, most powerful and most complex GPU ever made by chip manufacturer TSMC. Its complexity is exemplified by its two distinct modes: one dedicated to computation, and the other to graphics processing.
Around 80% of the GPU is dedicated to parallel computation, and the processor is designed to maximise throughput. Each of the 240 single-instruction, multiple-thread (SIMT) cores is scalable and can communicate on-die, rather than having to go out to the memory system. Eight cores are grouped into a streaming multiprocessor with 16KB of shared memory. That shared local memory is available to the programmer, so that the GPU can be optimised for different tasks. Three of those multiprocessors, together with an L1 cache, create an array, and 10 arrays make up the GPU, along with a thread scheduler to manage the threads and a 512-bit memory subsystem.
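The layout described above can be checked with some quick arithmetic. The short Python sketch below uses only the figures quoted in this article; the constant and function names are ours, chosen for illustration, and it is not Nvidia code. It confirms that eight cores per multiprocessor, three multiprocessors per array and 10 arrays add up to the 240 cores quoted, and it totals the shared memory across the chip.

```python
# Figures quoted in the article; the names are illustrative only.
CORES_PER_MULTIPROCESSOR = 8
MULTIPROCESSORS_PER_ARRAY = 3
ARRAYS_PER_GPU = 10
SHARED_MEMORY_KB_PER_MULTIPROCESSOR = 16


def gpu_totals():
    """Return chip-wide totals derived from the per-unit figures."""
    multiprocessors = MULTIPROCESSORS_PER_ARRAY * ARRAYS_PER_GPU
    cores = multiprocessors * CORES_PER_MULTIPROCESSOR
    shared_memory_kb = multiprocessors * SHARED_MEMORY_KB_PER_MULTIPROCESSOR
    return {
        "multiprocessors": multiprocessors,
        "cores": cores,
        "shared_memory_kb": shared_memory_kb,
    }


if __name__ == "__main__":
    for name, value in gpu_totals().items():
        print(f"{name}: {value}")
```

The shared memory total is a simple sum of the per-multiprocessor figure. Each 16KB block is local to its own multiprocessor, so it is not one pool that every core can reach.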
Curtis Beeson, an engineer at Nvidia, demonstrated the second personality, graphics processing, via a graphics showcase. The latest iteration is a story-based demo, featuring a warrior facing down a Medusa (and coming to a stony end). The key features for the GTX200 series processors are new lighting effects, more photorealism, more than three million triangles per frame, improved DirectX 10 features such as geometry shading, and, in the demo we saw, hardware-generated petrification and transformation effects.
|The GeForce GTX280 may look unassuming, but it packs a powerful punch|
Tony Tamasi says that for graphics processing, the same basic elements that make up the parallel processor are supplemented by a variety of specialised shaders, improved texture performance, a 1GB frame buffer and an increased shader-to-texture ratio, all of which should make for cinematic-quality gaming. Tamasi says Nvidia is aiming to balance shading and textures with floating point detection: “Focusing on one without the other can lead to awesomely fast DirectX 9 performance, but no real improvement for DirectX 10, so we balance it.”
Another thing Nvidia has been working on is power efficiency, trying to ensure that when a feature isn’t needed by the GPU, it uses no power at all. The GTX200 series, as a result, has more gradations of power available, so that the cards consume about 25W when idle, 32W while playing a Blu-ray disc, and 147W while running an intensive benchmark such as 3DMark06. For comparison, the GeForce 9800GTX uses around 45W while idle, 50W for Blu-ray and 80W for 3DMark06.
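To put those figures side by side, here is a small Python sketch that works out the difference in power draw between the two cards for each workload. The wattages are the ones quoted above; the dictionary layout and function name are ours, for illustration only.

```python
# Power draw in watts, as quoted in the article.
POWER_WATTS = {
    "idle": {"GTX200 series": 25, "9800GTX": 45},
    "Blu-ray playback": {"GTX200 series": 32, "9800GTX": 50},
    "3DMark06": {"GTX200 series": 147, "9800GTX": 80},
}


def power_delta(workload):
    """Watts drawn by the GTX200 series minus watts drawn by the 9800GTX."""
    row = POWER_WATTS[workload]
    return row["GTX200 series"] - row["9800GTX"]


if __name__ == "__main__":
    for workload in POWER_WATTS:
        delta = power_delta(workload)
        direction = "less" if delta < 0 else "more"
        print(f"{workload}: GTX200 series uses {abs(delta)}W {direction}")
```

A negative result means the GTX200 series draws less than the 9800GTX. On these numbers that holds for idle and Blu-ray playback, while the intensive benchmark is where the new cards draw more.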
Tamasi also points out that 25W usage while idle isn’t much more than the motherboard GPU generally uses. “If we can get our power low enough,” he says, “then you’ll get to a point where the discrete GPU uses less power than the motherboard GPU.”
The games to come
Nvidia acquired Ageia, the company behind PhysX, only four months ago, but within a month PhysX was running on GeForce, and it’s now incorporated into the new GTX200 series GPUs.
PhysX is currently the only API that runs on both CPUs and GPUs, and it’s programmable using CUDA. For PhysX, being part of Nvidia has meant a massive increase in the number of games signing up: more in a single month than in the previous two years as Ageia. For Nvidia, it means the company can offer more to game designers and level designers. In the works are tools to increase the consistency between the modelling environment and the final game engine, and to help with the creation of in-game objects and behaviours. This should all lead to richer games, even from smaller studios without massive design budgets. The first drivers porting across to the GeForce will be for the Unreal Engine, so if you run games based on that engine, you should see the influence of PhysX straight away on GeForce GTX200 series graphics cards.
The goal for the team behind Far Cry 2 is not just to have great static screenshots, but also to have the best-looking game in motion. The new instalment is set in Africa, with lots of exterior environments and, unlike most games, you really can go anywhere. Everywhere within the game is high resolution as you step up close to it, not just the plot-related areas.