Intel has promised that Hyper-Threading (HT) will deliver increased efficiency and reduce the time taken to run multiple processes at once. But is it really all that it's cracked up to be?
Understanding how Hyper-Threading works in the real world requires a basic knowledge of how computations are done on modern PCs.
Computer applications use a series of sequential operations (or calculations) fed to the CPU to generate results. When you start a program, you start a 'thread' of calculations. As the vast majority of desktop and mobile computers only have a single CPU, most programs are optimised to feed a single thread of calculations to the processor.
CPUs crunch numbers. The more numbers they can crunch, the more efficient they are. Any computation done by a CPU moves through that processor's 'pipeline' and is spat out at the end complete. The P4 breaks down its processing into a particularly long 20-stage pipeline (whereas the Athlon XP, for example, has a 10-stage pipeline).
Each stage of a pipeline takes up one clock cycle, but with 20 stages, up to 20 individual tasks can be in progress at once, each at a different stage. It is not foolproof though: the pipeline often needs to be flushed, and all the calculations in it started again.
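To put rough numbers on this, here is a minimal sketch of an idealised pipeline. It is not a model of the real P4: the function name is made up for illustration, and it assumes one operation enters the pipeline per clock cycle and that every flush costs a full refill of the pipeline.

```python
def cycles_to_finish(num_ops, depth=20, flushes=0):
    """Clock cycles for an idealised pipeline to complete num_ops operations.

    The first operation needs `depth` cycles to pass through every stage;
    after that, one operation completes per cycle.
    ASSUMPTION: each flush wastes `depth` cycles while the pipeline refills.
    """
    if num_ops == 0:
        return 0
    return depth + (num_ops - 1) + flushes * depth


# 1,000 operations with no flushes: the pipeline depth barely matters.
print(cycles_to_finish(1000, depth=20))  # 20-stage pipeline
print(cycles_to_finish(1000, depth=10))  # 10-stage pipeline

# The same work with 10 flushes: the longer pipeline pays more per flush.
print(cycles_to_finish(1000, depth=20, flushes=10))
print(cycles_to_finish(1000, depth=10, flushes=10))
```

Under these assumptions the deeper pipeline loses twice as many cycles per flush, which is why keeping a long pipeline full matters so much.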
Hyper-Threading is a method to squeeze more efficiency out of the Pentium 4's long pipeline by filling it with more operations.
A Hyper-Threaded P4 has physically duplicated the part of its architecture used to initiate threads: certain registers and the APIC. This results in two 'logical processors' (LPs); the operating system effectively sees the Hyper-Threaded CPU as two separate CPUs and therefore willingly feeds each LP its own thread.
The logical processors then work in concert to bundle together operations from these different threads, so they can then be executed in the same pipeline.
It's important to note that Hyper-Threading's two logical processors don't actually split the entire CPU into two segments – it's only at the initial stage where this occurs. The back end and main 'number crunching' part of the CPU remains largely untouched.
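The idea of two threads sharing one back end can be illustrated with a toy model. This is only a sketch under heavy assumptions, not how the P4 actually schedules work: it assumes a single issue slot per clock cycle, gives the first thread priority whenever both are ready, and represents each thread as a made-up string in which 'o' is an operation ready to run and '.' is a cycle where that thread is stalled (waiting on memory, say).

```python
def run(threads):
    """Return (cycles, busy_cycles) for one issue slot shared by `threads`.

    Each thread is a string: 'o' = operation ready to issue,
    '.' = a cycle in which that thread is stalled.
    ASSUMPTION: one operation can issue per cycle, first thread wins ties.
    """
    pos = [0] * len(threads)
    cycles = busy = 0
    while any(p < len(t) for p, t in zip(pos, threads)):
        issued = False
        for i, t in enumerate(threads):
            if pos[i] >= len(t):
                continue              # this thread has finished
            if t[pos[i]] == '.':
                pos[i] += 1           # the stall cycle elapses
            elif not issued:
                pos[i] += 1           # issue this thread's operation
                issued = True
            # otherwise: operation ready, but the slot is taken this cycle
        cycles += 1
        busy += issued
    return cycles, busy


a = "o.o.o.o."   # hypothetical thread that stalls every other cycle
b = ".o.o.o.o"   # hypothetical thread that stalls on the opposite cycles

print(run([a]))      # one thread alone: half the cycles are idle
print(run([a, b]))   # both threads together: the idle cycles are filled
```

In this contrived case the two threads finish together in the time one would take alone. When both threads always have work ready (for example "oooo" and "oooo") the model shows no gain at all, since the single back end is already saturated.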
This is all well and good in theory. However, given that the performance of Hyper-Threaded Xeons (the P4 for servers, which has had HT enabled since early 2002) was shown to drop by up to 25% in some single-threaded applications, we were somewhat skeptical about the new P4.
We benchmarked the 3GHz P4 with HT both enabled and disabled, and ran single-threaded and multi-threaded tests to compare the results.
Performance varied significantly depending on the tasks being performed. With single-threaded applications, the difference between HT enabled and disabled was minuscule, and there was no clear winner in most benchmarks.
One of our primary tests was video encoding, as this makes good use of many system resources and stresses most parts of the computer.
While encoding a video using TMPGEnc (a popular shareware encoder) in the background, we opened a second instance of TMPGEnc and began encoding another video. Testing conditions were exactly duplicated for each CPU, and each run was timed using a stopwatch.
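The same kind of comparison can be approximated in software. The sketch below is not the procedure used for this review (that was TMPGEnc and a stopwatch); it is a hypothetical harness that starts one or more copies of a CPU-bound job at the same moment and measures the wall-clock time until all of them finish. The workload string is an arbitrary stand-in for an encode.

```python
import subprocess
import sys
import time

# ASSUMPTION: an arbitrary CPU-bound snippet standing in for one encode.
WORKLOAD = "sum(i * i for i in range(2_000_000))"


def time_concurrent(n_jobs, workload=WORKLOAD):
    """Wall-clock seconds to run n_jobs copies of `workload` at once."""
    start = time.perf_counter()
    procs = [
        subprocess.Popen([sys.executable, "-c", workload])
        for _ in range(n_jobs)
    ]
    for p in procs:
        p.wait()  # block until every copy has finished
    return time.perf_counter() - start
```

Calling time_concurrent(1) and then time_concurrent(2) and dividing the second result by the first gives a rough scaling figure: a ratio near 2 means running two jobs at once saved nothing, while a ratio closer to 1 means the second job came almost for free.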
In our single-thread tests, converting 10 minutes of video took almost exactly six minutes whether HT was on or off.
With a second instance running in the background, this increased to over 12 minutes with HT disabled, so no time at all was saved by attempting two encodes at once. With HT enabled, the overall processing time dropped significantly: it saved three minutes, or about 25%.
We also ran 3D benchmarks with a video encoding in the background. Although performance dropped in both instances when compared to our standard tests, the Hyper-Threaded P4 was significantly faster than its counterpart, by up to 35% in 3DMark2001 SE.
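The arithmetic behind that figure is worth spelling out. The snippet below uses the times quoted above, rounding "over 12 minutes" down to 12 as an assumption; the variable names are ours.

```python
two_ht_off = 12.0   # minutes for two encodes, HT off (text says "over 12")
saved = 3.0         # minutes saved with HT on (from the text)

two_ht_on = two_ht_off - saved        # time for two encodes with HT on
saving = saved / two_ht_off           # fraction of time saved
speedup = two_ht_off / two_ht_on      # throughput gain

print(f"{two_ht_on:.0f} min, {saving:.0%} saved, {speedup:.2f}x throughput")
# prints "9 min, 25% saved, 1.33x throughput"
```

Note that a 25% saving in time corresponds to roughly a one-third increase in work done per minute, since the same two encodes now fit into three-quarters of the time.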
One curious aspect of Hyper-Threading sprang from the fact that we initially tested the 3GHz under Windows XP without Service Pack 1 to see if it would make any difference (you need XP SP 1 installed to enable HT). With Hyper-Threading enabled, Direct3D performance consistently dropped by 30% in 3DMark2001. This glitch disappeared with SP 1 installed, showing that Microsoft has made significant changes to its multiprocessing kernel in order to accommodate Hyper-Threading.
Like many of Intel's past initiatives, if Hyper-Threading has one significant flaw, it's its exclusivity. It relies on the latest technology in order to work correctly, effectively cutting it off as an upgrade choice for most current P4 owners. To use it, you'll need one of the latest Intel chipset motherboards (third-party manufacturers like SiS and VIA haven't yet been privy to the instructions required to unlock HT), and this motherboard will likely also need to be flashed with the latest BIOS.
Although it's been shown to work well under Linux, the only OS that has official support for HT is Windows XP – which even cuts out Windows 2000. Moreover, as we discussed above, without Service Pack 1, Hyper-Threading causes serious issues with Direct3D performance as well.
For most day-to-day applications there won't be any appreciable difference between running a PC Hyper-Threaded and not Hyper-Threaded: downloading multiple files while watching a video is already a seamless (and flawless) experience on modern PCs, so the benefits of Hyper-Threading to most people won't be immediately apparent.
It's really the power users who will get the most out of it. 3D animation, desktop publishing, digital video editing and similar tasks will all see performance improvements. Gamers won't see as much of a boost, since a single game won't significantly utilise HT. That said, with the increase in the amount of multitasking that goes on these days, with things like background antivirus, all users should see at least a slight boost in performance. It's not without its bugs though, as you can read about below.