In recent years graphics processing has become more and more important on the PC. Processors that were once used only to make games look pretty are being utilised in ever more varied ways, carving out a role that could become as central as that of the CPU.
This has come about because Graphics Processing Units (GPUs) are incredibly good at certain types of processing workload. GPUs are effectively massive arrays of floating point processors, which complement the CPU’s integer focus. It hasn’t always been this way, though – GPU development reached this point through a series of evolutionary steps over the past 15 years.
15 years of rapid evolution
3dfx’s Voodoo Graphics from 1996
It was indeed only 15 years ago that 3dfx kickstarted 3D gaming with the launch of the Voodoo Graphics chip. Computers in 1996 had graphics cards (usually an S3 Trio), but these were purely designed to let the PC output to a monitor. The CPU was where the processing happened, and any attempts at 3D graphics were done using CPU-friendly techniques like voxel rendering (made famous by Novalogic’s early game Comanche). With the rising popularity of first-person shooter titles, a bunch of companies came out with ‘3D accelerator’ add-in cards, which sat alongside the 2D card and were designed solely to output rudimentary 3D environments.
At the time, quality 3D rendering was the domain of companies like Pixar and Silicon Graphics. It was done on massive computing clusters, using techniques like ray tracing to simulate the effects of lighting on objects. This type of rendering could take hours or even days per frame, making it completely infeasible for gaming, where tens of frames need to be rendered every second.
Several ‘real-time’ rendering techniques were promoted in the early days. STMicroelectronics pushed tile-based rendering and Bitboys had its own quad-based renderer, but it was triangle-based rendering that won out. This technique involves breaking a 3D scene into triangles, which are then rasterised into pixels for lighting and other calculations, and it still forms the basis of real-time 3D graphics today.
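To make the triangle approach concrete, here is a minimal sketch of the idea written in CUDA (an API that arrived a decade later, so this is purely illustrative rather than how 1990s hardware worked; the image size, triangle coordinates and all names are invented). Each GPU thread tests one pixel against a triangle using ‘edge functions’ – a pixel is inside the triangle if it sits on the same side of all three edges:

// One thread per pixel decides whether that pixel is covered by a triangle.
#include <cstdio>
#include <cuda_runtime.h>

#define W 64
#define H 64

// Signed area test: result >= 0 means point (px, py) lies to one
// consistent side of the edge running from (ax, ay) to (bx, by).
__device__ float edge(float ax, float ay, float bx, float by,
                      float px, float py)
{
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax);
}

__global__ void rasterise(unsigned char *coverage)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= W || y >= H) return;

    float px = x + 0.5f, py = y + 0.5f;   // sample at the pixel centre

    // A single hard-coded triangle in screen space (invented coordinates).
    bool inside = edge( 8,  8, 56, 16, px, py) >= 0 &&
                  edge(56, 16, 24, 56, px, py) >= 0 &&
                  edge(24, 56,  8,  8, px, py) >= 0;

    coverage[y * W + x] = inside ? 1 : 0;
}

int main()
{
    unsigned char *d_cov, h_cov[W * H];
    cudaMalloc(&d_cov, W * H);

    dim3 block(16, 16), grid(W / 16, H / 16);
    rasterise<<<grid, block>>>(d_cov);          // 4,096 pixels in parallel
    cudaMemcpy(h_cov, d_cov, W * H, cudaMemcpyDeviceToHost);
    cudaFree(d_cov);

    for (int y = 0; y < H; y++, putchar('\n'))  // print the coverage mask
        for (int x = 0; x < W; x++)
            putchar(h_cov[y * W + x] ? '#' : '.');
    return 0;
}

Real hardware does this with dedicated rasterisation circuitry, interpolating colours and texture coordinates for every covered pixel, but the core test is the same.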
At first the various 3D accelerator players tried to push their own software development APIs. Unfortunately this meant that in order to support a certain brand of hardware, game developers had to either double up development time or ignore other brands. This wasn’t sustainable, and eventually the industry settled upon OpenGL as a standard API, thanks largely to the efforts of id Software’s John Carmack, who chose OpenGL for the 3D accelerated version of Quake.
OpenGL provided a great common ground for the early phases of 3D graphics. During this time a lot of the pioneers of 3D fell by the wayside, leaving Nvidia and 3dfx to battle it out for the gamer dollar. Thanks to some bad business decisions 3dfx went bust in late 2000, with its assets sold to Nvidia. This was around the time that Nvidia’s hardware began to outpace the slow advances being made in OpenGL – the GeForce 2 series of graphics cards implemented essentially the entire OpenGL fixed-function pipeline in hardware, which left Nvidia with no clear way forward if it wanted to advance its technology.
The rise of GPGPU
This led to a watershed moment in the history of GPU computing. Firstly, Nvidia shifted its focus to Microsoft’s DirectX graphics API, leaving OpenGL behind. Then it moved from an architecture designed to accelerate a fixed set of functions (known as hardware transform and lighting) to one designed to be programmable. It also marked the moment when ATI went from relative 3D unknown to major player with its first Radeon cards.
In the ‘programmable shader’ world developers were given the freedom to write their own functions, and graphics processors evolved into collections of processing units. The rendering pipeline started with a vertex unit, which transformed the triangles making up a scene and calculated basic information about them. This triangle information was then fed to a series of floating point processors, known as pixel shaders, which calculated how each pixel on the screen should look. With this design the 3D accelerator evolved into the Graphics Processing Unit (GPU was originally an Nvidia marketing term, but it has since become the commonly accepted name for graphics processors).
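To give a flavour of what a pixel shader does, here is a hedged sketch expressed as a CUDA kernel (shaders of this era were actually written in dedicated, assembly-like shader languages; every name and number below is invented for illustration). Each thread independently computes the colour of one pixel from values that would normally be interpolated across a triangle:

#include <cuda_runtime.h>

// A toy ‘pixel shader’: each thread colours exactly one pixel using
// classic diffuse (Lambertian) lighting, where brightness depends on the
// angle between the surface normal and the light direction.
__global__ void pixel_shade(float3 *framebuffer, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    // Stand-ins for values interpolated from the triangle's vertices.
    float u = x / (float)width, v = y / (float)height;
    float3 n = make_float3(u, v, 1.0f - u - v);   // fake surface normal
    float3 l = make_float3(0.0f, 0.0f, 1.0f);     // light direction

    float len   = sqrtf(n.x * n.x + n.y * n.y + n.z * n.z);
    float ndotl = fmaxf(0.0f, (n.x * l.x + n.y * l.y + n.z * l.z) / len);

    framebuffer[y * width + x] = make_float3(ndotl, ndotl, ndotl);
}

int main()
{
    const int w = 256, h = 256;
    float3 *fb;
    cudaMalloc(&fb, w * h * sizeof(float3));

    dim3 block(16, 16), grid(w / 16, h / 16);
    pixel_shade<<<grid, block>>>(fb, w, h);   // shade all 65,536 pixels
    cudaDeviceSynchronize();

    cudaFree(fb);
    return 0;
}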
These first pixel shaders were fairly rudimentary by modern standards, but they got some academics thinking. These highly customised floating point processors were actually more efficient than CPUs when faced with certain workloads. As newer versions of DirectX were released the shader functionality grew, and the number of shaders inside a GPU began to increase. This led to more and more interest in what was becoming known as GPGPU (General Purpose GPU) computing.
Why not use the CPU?
A CPU core is designed to deal with sequential workloads. It completes an instruction and moves on to the next one. This has become more efficient over the years thanks to techniques like branch prediction, out-of-order execution and the addition of multiple cores, but sequential processing is still what a CPU does best. A GPU, on the other hand, excels at parallel processing, where large amounts of independent data need to be worked on.
Parallel tasks include large database queries, video conversion, physics modelling and scientific computing. These are workloads where there is a lot of number crunching to be done but the results are largely independent of each other.
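The contrast is easiest to see in code. Below is a small illustrative sketch (the task – brightening an array of pixel values – and all names are our own; the host-side setup needed to launch such a kernel appears in the full program a little further on). On the CPU one core walks the data in a loop; in the CUDA version the loop disappears and each of thousands of threads handles a single element:

#include <cuda_runtime.h>

// CPU version: one core visits each element in turn, sequentially.
void brighten_cpu(float *pixels, int n, float amount)
{
    for (int i = 0; i < n; i++)
        pixels[i] += amount;
}

// GPU version: no loop. Thousands of threads run this function at once,
// and each uses its unique index to pick out a single element.
__global__ void brighten_gpu(float *pixels, int n, float amount)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        pixels[i] += amount;
}

This only pays off because each pixel’s result is independent: no thread ever has to wait for another’s answer.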
This emergence of GPGPU computing has had a massive influence on the PC industry, even if GPGPU has yet to truly take off. It drove AMD’s 2006 acquisition of ATI and it has been the main design driver behind Nvidia’s latest Fermi graphics architecture. There are now two major GPGPU programming APIs out there – OpenCL from the Khronos Group and DirectCompute, which is part of Microsoft’s DirectX. Nvidia also has its own proprietary platform called CUDA, designed for GPGPU applications on its GeForce graphics cards.
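To give a taste of what GPGPU code actually looks like, here is a minimal, self-contained CUDA program (SAXPY is a standard teaching example rather than anything taken from the products mentioned here; all names and sizes are illustrative). The CPU allocates memory on the graphics card, copies the data across, launches a million threads and copies the result back:

#include <cstdio>
#include <cuda_runtime.h>

// Each thread scales one element and adds it to another: y = a*x + y.
__global__ void saxpy(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        y[i] = a * x[i] + y[i];
}

int main()
{
    const int n = 1 << 20;                    // a million elements
    size_t bytes = n * sizeof(float);

    // Prepare input data on the CPU.
    float *x = (float *)malloc(bytes);
    float *y = (float *)malloc(bytes);
    for (int i = 0; i < n; i++) { x[i] = 1.0f; y[i] = 2.0f; }

    // Copy it across to the graphics card's memory...
    float *dx, *dy;
    cudaMalloc(&dx, bytes);
    cudaMalloc(&dy, bytes);
    cudaMemcpy(dx, x, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dy, y, bytes, cudaMemcpyHostToDevice);

    // ...run the kernel across n threads, 256 per block...
    saxpy<<<(n + 255) / 256, 256>>>(n, 3.0f, dx, dy);

    // ...and copy the results back.
    cudaMemcpy(y, dy, bytes, cudaMemcpyDeviceToHost);
    printf("y[0] = %f (expected 5.0)\n", y[0]);

    cudaFree(dx); cudaFree(dy); free(x); free(y);
    return 0;
}

Note how much of the listing is bookkeeping: shuffling data between main memory and the card is often the real cost, which is part of AMD’s argument for putting the CPU and GPU on the same piece of silicon.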
GPGPU processing has already manifested in some mainstream programs. The most prominent of these is the latest release of Adobe’s Premiere Pro software, which is built around a CUDA-based rendering engine called Mercury. This allows PCs with recent GeForce hardware to play back multiple HD video streams in real time without the need for rendering. The difference between Premiere Pro running on CUDA and running on a CPU is like night and day.
Why does this matter?
Nvidia may have some design wins with CUDA, but AMD is betting on OpenCL and DirectCompute moving forward. Its Fusion APUs all contain graphics hardware capable of running these APIs and it is putting serious effort and money behind them, believing that open standards will win out eventually.
Intel, on the other hand, has largely ignored the GPU and focused on the CPU. Its new Sandy Bridge processors contain what it calls ‘processor graphics’, which doesn’t support the latest DirectX 11 standard. Intel has confirmed that future revisions will support DX11, but given past history this won’t necessarily match Nvidia’s and AMD’s implementations. Instead Intel is focusing heavily on CPU extensions called AVX (Advanced Vector Extensions), which allow for speedy, high-precision floating point processing, and on its impressive Quick Sync transcoding technology – approaches that tackle the same kinds of workload as GPGPU processing without relying on programmable GPU shaders.
In contrast, AMD’s Fusion APU design has a cut-down version of its DirectX 11 GPU built onto the silicon itself, so anything written in OpenCL or DirectCompute will run directly on this hardware. This means that, unlike traditional CPUs, APUs could well get ‘faster’ as time progresses and more software emerges to take advantage of the built-in GPU. The ideal end point in AMD’s mind is the emergence of software or drivers that can analyse a program and dynamically assign processing loads to the CPU or GPU cores based on which type of processing is most suitable – an idea sketched below. This is still a year or two away, but it is a goal being actively chased by AMD and software developers.
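No such scheduler exists yet, but a deliberately crude sketch shows the shape of the idea (everything here is hypothetical – the threshold, the decision rule and the function names are all invented for illustration). A dispatcher keeps small jobs on the CPU and ships large, data-parallel ones off to the GPU:

#include <cuda_runtime.h>

__global__ void scale_gpu(float *data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

void scale_cpu(float *data, int n, float factor)
{
    for (int i = 0; i < n; i++) data[i] *= factor;
}

// Hypothetical dispatcher: small jobs stay on the CPU; big, data-parallel
// ones are shipped to the GPU. A real scheduler would weigh transfer
// costs, occupancy and where the data lives, not just its size.
void scale(float *data, int n, float factor)
{
    const int GPU_WORTHWHILE = 100000;   // invented threshold
    if (n < GPU_WORTHWHILE) {
        scale_cpu(data, n, factor);
    } else {
        float *d;
        cudaMalloc(&d, n * sizeof(float));
        cudaMemcpy(d, data, n * sizeof(float), cudaMemcpyHostToDevice);
        scale_gpu<<<(n + 255) / 256, 256>>>(d, n, factor);
        cudaMemcpy(data, d, n * sizeof(float), cudaMemcpyDeviceToHost);
        cudaFree(d);
    }
}

int main()
{
    static float data[200000];
    scale(data, 200000, 2.0f);   // big enough to be routed to the GPU
    return 0;
}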
More than ever, the line between CPU and GPU is blurring. Where GPUs were once something only gamers wanted, they now open up a much wider range of experiences. With the advent of AMD’s APU you don’t even need to buy a graphics card to take advantage of GPGPU programs, and even Intel is moving towards DirectCompute support in future processors.