AMD vs Intel - here's the hammer blow

David Fearon | Feb 11, 2008 11:52 AM
We've put AMD and Intel’s new processors to the test, and there's one clear winner.

It’s no secret that for the whole of 2007, Intel – in particular, its desktop and mobile Core 2 processors – dominated the CPU scene. But now, the first of AMD’s quad-core processors have arrived, as have Intel’s next-generation parts.

The goalposts have shifted, too: where once the only thing anybody cared about was the outright speed of a processor, the past 12 months have seen a massive shift towards the green credentials of a CPU. Power consumption and performance per watt – the actual amount of work a processor can get done with a given amount of power – are now considered as important as performance.

Both companies are trying to convince anyone who’ll listen that their parts are king of the energy-saving hill, particularly when it comes to total-platform power consumption: the overall power drawn from the mains by a system based on the processor, chipset and supporting components of either company.

This month, we’ve cut through the hype and actually tested Intel and AMD’s newest parts for outright performance, plus performance per watt and overall value. We’ve moved heaven and earth to get cutting-edge parts to test from both companies, but as you’ll see, one of the two is in serious trouble when it comes to delivering on its promised new products.

This will hopefully change in the coming months, but for now there’s one clear winner and one that simply can’t cut the mustard.

Phenom
Phenom CPUs remain compatible with the AM2 socket
AMD’s first desktop quad-core CPU has had a difficult gestation. It was released almost a year late and, as yet, there are only two working models. The Phenom 9500 is a quad-core part running at 2.2GHz, the 9600 at 2.3GHz and the top-end 9700 gives the current top speed of 2.4GHz. It was originally slated to debut at 2.6GHz, but AMD’s production issues have caused problems even with the 2.4GHz part. AMD, however, managed to get us a 2.6GHz part that’s yet to be released. We also tested the mid-range Phenom 9500.

Power consumption
Being based on 65nm fabrication technology puts Phenom at a disadvantage. For a given clock speed, a 65nm transistor in a Phenom consumes more power than an Intel 45nm metal-gate transistor. That’s not the end of the story, though: power consumption of a transistor is proportional to frequency, so Phenom’s lower clocks are a bonus in that respect. And, all other things being equal, a CPU with a greater number of transistors consumes more power. A Phenom boasts a complement of 460 million, but an Intel Penryn quad-core part nearly doubles that, with an amazing 820 million transistors.

As with the Athlon 64 generation before it, Phenom processors are directly connected to main system memory by a dedicated on-chip interface. That means an AMD-based system doesn’t need a separate north-bridge MCH (memory controller hub) and all its associated supporting components. From a “total-platform” power-consumption point of view this is a good thing.

But it does limit your choice of memory technology. AMD hasn’t embraced DDR3 memory with the new part, and DDR3 support isn’t slated to appear until the next-generation platform – currently codenamed Stars – arrives in 2009. Intel-based motherboards already have DDR3 support, bringing power-consumption advantages since DDR3’s supply voltage is lower than DDR2, at 1.5V compared with 1.8V. That may not sound like much, but it means a power reduction of 30%. Again, though, DDR3’s higher clock speeds can offset the power savings.
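One way to arrive at a figure of that size is the common rule of thumb that, with everything else held constant, power scales with the square of the supply voltage. The sketch below assumes that simple model; the function name is purely illustrative, and real memory power also depends on clock speed and load, as noted above.

```python
def relative_power(v_new: float, v_old: float) -> float:
    """Power at v_new relative to v_old, assuming power scales with voltage squared."""
    return (v_new / v_old) ** 2


# DDR3 at 1.5V against DDR2 at 1.8V (voltages taken from the article)
ddr3_vs_ddr2 = relative_power(1.5, 1.8)
saving_percent = (1 - ddr3_vs_ddr2) * 100

print(f"DDR3 draws about {ddr3_vs_ddr2:.2f}x the power of DDR2")
print(f"Estimated saving: {saving_percent:.1f}%")
```

Under that assumption the saving comes out at just over 30%, in line with the figure quoted.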

How we tested
When testing power consumption for each platform, we’ve kept the playing field as level as possible. Power-supply efficiency can have a huge effect on total power consumption, so we used identical Antec 450W models for both AMD and Intel rigs. We also used identical Western Digital Raptor hard disks, and motherboards from the same manufacturer (Gigabyte) with very similar power-regulation circuitry design. Performance-per-watt scores are relative figures based on the benchmark scores compared with maximum power consumption.

Performance
As we saw with the 9500, the performance of Phenom leaves it trailing behind a similarly priced Core 2. In our test setup, using identical supporting components to the Intel rig, our overall benchmark result was 1.40; a good score, but far from stellar. A cursory glance at the specifications of the Phenom reveals why: just 4MB of cache in total and a clock speed for the Phenom 9500 of only 2.2GHz make it look a lot like a processor from a year ago as opposed to the cutting-edge design it’s supposed to be. Phenom’s novel three-tier cache structure means it’s well set up for the future, though. With 512KB of cache dedicated to each core, new multi-threaded apps should benefit.

Value
The street price for a Phenom is impressively low for a brand-new part, but then it has to be to compete with Intel’s price-slashing. The Phenom 9500, at about $260, is competing with the faster Core 2 Quad Q6600. Prices have yet to be announced for the engineering-sample 2.6GHz Phenom we managed to get hold of.

Prospects
Since the launch of Athlon 64 in September 2003, AMD has traditionally launched an enthusiast-level FX part, countered by Intel with its Extreme-edition series. But with all the production problems afflicting Barcelona, there’s no sign of them as yet. With its relatively low complement of cache making it unable to compete with Intel on that score, AMD desperately needs to improve its production to enable higher frequencies for better outright performance.

Phenom Benchmarks

Core 2
Intel's Core 2 range continues with the LGA 775
While AMD proceeds with its relatively new 65nm manufacturing process, Intel has already left 65nm behind in favour of 45nm: it’s currently the only semiconductor manufacturer in the world able to produce 45nm chips in volume. This puts Intel – on paper at least – a product generation ahead of Advanced Micro Devices.

Power consumption
Anyone with any experience of Intel-based PCs is likely to have not one but three pretty large heatsinks on their motherboard. The first is for the processor, of course, but the other two are for the supporting chipset, with one on the south bridge and often a larger one on the north bridge, which incorporates the MCH. If you’ve ever been foolhardy enough to touch a passive north-bridge heatsink on an Intel system, the resulting burn on your finger will tell you they consume a significant amount of power.

It isn’t at the same level as a CPU, but the combined power consumption of the chipset can reach 15W or so – more than the total for some complete VIA systems (page 88). However, that’s offset by the impressively low power consumption of Core 2 processors, particularly the 45nm-based QX9650, despite its hefty 126W rated TDP (thermal design power).

The QX9650 is the only 45nm desktop processor Intel currently offers. With its massive transistor count and 3GHz clock speed, power consumption when the processor is maxed out with all four cores at 100% utilisation should be very high. In practice, its raft of intelligent power-saving measures meant the average power consumption of our test system was an impressive 98W at idle, and 160W at full pelt. Although this is higher in absolute terms than Phenom, the much higher benchmark score gives a superior score in terms of performance per watt.

Performance
There’s little doubt that Intel still comprehensively rules the roost in terms of outright, money-no-object speed. Even disregarding the design finesse that Intel’s processor engineers have endowed the Core 2 range with, the company has pulled ahead by throwing lots of very tiny transistors at the problem. The result is that the QX9650 has three times the total cache complement of the Phenom 9000 series, with 12MB L2 cache in total split between the two separate pairs of cores in the physical package – 6MB per pair. And with up to 100% of each 6MB chunk available to one core at a time, single-threaded code – which still comprises the lion’s share of performance-hungry applications – gets a massive boost. That said, there are diminishing returns compared to the 8MB total (4MB per pair) in the previous-generation quad-core parts, including the Core 2 Quad Q6600.

The Extreme Edition QX9650 may cost more than $1200, but given its benchmark result of 2.27 overall, there are plenty of people who’d deem it well worth the money. Even the Core 2 Quad Q6600 – now only about $300 – still outperforms the Phenom 9500 in our benchmarks, giving Intel the edge on the desktop in every possible category: bang per buck, performance per watt and outright speed.
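To make the performance-per-watt comparison concrete, here is a rough sketch in Python. It assumes the metric is simply the benchmark score divided by maximum system power, as described under "How we tested"; the exact scaling behind the published relative figures isn't stated, so the function and variable names are illustrative only. The Phenom system's power draw isn't quoted, so the sketch works out the break-even draw rather than guessing at it.

```python
def perf_per_watt(score: float, max_watts: float) -> float:
    """Benchmark score per watt of maximum system power (assumed definition)."""
    return score / max_watts


# QX9650 test system: overall score 2.27, 160W at full load (from the article)
qx9650_ppw = perf_per_watt(2.27, 160)

# Hypothetical break-even: the full-load draw at which a system scoring 1.40
# (the Phenom result quoted earlier) would match the QX9650 per watt.
phenom_score = 1.40
break_even_watts = phenom_score / qx9650_ppw

print(f"QX9650: {qx9650_ppw:.4f} points per watt")
print(f"A 1.40 system matches that only below {break_even_watts:.1f}W at full load")
```

On these numbers, a system scoring 1.40 would need to stay below roughly 99W at full load just to draw level.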

Prospects
Intel didn’t let up when it pulled ahead of AMD with the release of the Core microarchitecture. It’s driven a massive wedge into the performance gap and carried on hammering, resorting to something akin to strutting where its newest processors are concerned. We’ve already seen its next-generation QX9770, which extends the lead even further, albeit at the expense of slightly lower performance-per-watt figures. The 9770 runs at 3.2GHz and sports the same 1600MHz FSB as the newer Xeon parts. It’s a cheeky move on Intel’s part to send samples out to press for testing without giving a release date or pricing, but it certainly serves to highlight – or give the impression of highlighting – the technology lead it holds.

Rumours have even surfaced that Intel will delay releasing its 45nm mainstream quad-core processors, simply because AMD isn’t able even to match the existing 65nm designs. That sort of rumour deserves to be taken with a pinch of salt, but there’s little doubt that the replacements for the current desktop 65nm quad-core parts are waiting in the wings and will be unleashed early this year.
On top of that, Intel has already publicly discussed the move to six- and eight-core CPUs in 2008-2009.

Core 2 Benchmarks

Opteron
Boston sells near-identical AMD and Intel rack servers.
In theory, testing the relative performance of Intel and AMD for enterprise-level applications should be simple. Server componentry tends to be standardised and to use similar designs. In practice, we weren’t able to do so.

Our test servers were provided by Boston in the form of two Supermicro 1U rack chassis: a 102IM-T2B quad-core Opteron-ready system, and a 6015B-TV for the Xeons. The two sport identical basic chassis designs, identical hard disks, identical power supplies and even identical CD-ROM drives; no problems there.

When it does finally arrive, the quad-core Opteron server and workstation range is slated to be broader than the limited number of Phenom models. As with the dual-core Opteron range and Xeon (below), the range is split into dual-processor and multiprocessor (up to eight-way) ranges, dubbed the 2300 and 8300 series respectively. Initially there will be five 2300s in the range, with two low-power HE variants with a TDP (thermal design power) of 55W as opposed to the standard 75W. Clock speeds vary from 1.7GHz to 2GHz. The 8300 series will comprise four models including two HE versions, ranging from 1.8GHz to 2GHz, with the same 75W TDP for standard and 55W for HE versions.

The design of the quad-core Opteron itself is more or less identical to that of the Phenom, with the same three-tier cache arrangement of a fixed 512KB L2 and shared 2MB L3. The only significant difference is an extra HyperTransport link for processor-to-processor communication. AMD claims this is a particular boost for a server system: with a multiprocessor Xeon server, traffic between processors has to travel out onto the front side bus, which increases latency and sucks up bandwidth that could be better used for unhindered memory-to-CPU traffic. Bandwidth of the HyperTransport links is impressive, with 8GB/s on each link making it very difficult to saturate the system.
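The cache figures quoted across the article can be cross-checked with a little arithmetic. The sketch below is only an illustration: it counts L2 and L3 alone (L1 caches aren't quoted anywhere in the article, so they are left out), and the variable names are made up for the example.

```python
KB_PER_MB = 1024

# Phenom / quad-core Opteron: 512KB of L2 per core plus a shared 2MB L3
cores = 4
l2_per_core_kb = 512
shared_l3_mb = 2
barcelona_cache_mb = cores * l2_per_core_kb / KB_PER_MB + shared_l3_mb

# QX9650: two pairs of cores, each pair sharing 6MB of L2
qx9650_cache_mb = 2 * 6

ratio = qx9650_cache_mb / barcelona_cache_mb

print(f"Phenom/Opteron L2 + L3: {barcelona_cache_mb:.0f}MB")
print(f"QX9650 L2: {qx9650_cache_mb}MB, or {ratio:.0f}x as much")
```

The result agrees with the 4MB total quoted for Phenom and with the QX9650 having three times the cache.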

As with Phenom, the memory support for quad-core Opteron is limited to the design and architecture of the CPU itself, with the directly integrated memory controller rather than an off-board MCH. But a final hardware difference between them is the type of memory supported. Although, like Phenom, you’re forced to stick with DDR2, Opteron allows for ECC (error-checking and correction) RAM for better resistance against single-bit memory corruption. Maximum memory speed is 667MHz. Because of the direct connection between main memory and the CPU, the total RAM complement needs to be split between two banks, with half accessed by each processor. The Boston server is fitted with two banks of four slots, allowing for a maximum total of 32GB.

Our issues arose from the fact that despite quad-core Barcelona Opterons nominally having launched in September last year, there were still no processors available. AMD itself told us that it would love to send us some but simply couldn’t get hold of samples. Calls to vendors met with a universal chorus explaining they hadn’t received any stock from AMD.

The explanation for the bizarre state of affairs – a processor company being unable to provide samples of its own processor – came with an official announcement on 14 December 2007. At an AMD financial analyst summit, the company’s president and chief operating officer Dirk Meyer admitted that the launch of Barcelona had been horribly botched, and revealed that a design error had meant production was temporarily stopped. The error in question is with the processor’s TLB (translation lookaside buffer). A TLB is nothing new – all modern processors use them to perform address-map translation. Errors in chip designs are nothing new either, but this one had managed to stall a processor that was already plagued by production delays. The fix AMD has now implemented is based around a BIOS update, and the underlying issue won’t be fully fixed until the next revision of the design, the B3 stepping.

In his opening statement, looking like a man who hasn’t had much sleep since AMD’s share price began sliding to 68% below its level at the beginning of January 2007, Meyer said: “Last month [November 2007], while we were in the final stages of system validation, we uncovered a design error... we have delayed general availability of our Barcelona server product until next quarter [Q1 2008]”. He added that the error is “sensitised under very obscure operating conditions”. The result of this is that quad-core Opterons, rather than being generally distributed to vendors, were being shipped to “large cluster installations where we can be sure that the error won’t be sensitised”.

According to AMD, full production of quad-core Opterons has now restarted. We’re going to keep hold of the server chassis and, as soon as we get production samples of the new processors, we’ll test them head-to-head against Intel to get the definitive picture as to which is best. Despite the lack of Barcelona this month, we’ve gone ahead and looked at the Intel-based server and assessed it on its own terms for performance and power consumption.

Xeon
Intel has been no stranger to chip errors and bugs in the past – the biggest being the infamous 1994 Pentium floating-point division bug, which, after a public outcry, forced the company to offer replacements for affected CPUs that had made it on to the market. But none of that has been a problem when it comes to the latest Xeon workstation and server CPUs. We obtained a pair of X5460 Xeons: 3.16GHz, 45nm quad-core parts that are based around the same design as the desktop QX9650 processor, with 12MB L2 cache and a 1333MHz FSB speed.

Boston tells us it will soon be supplying a version of the 6015B-TV chassis that’s able to use the upcoming X5482 and X5472 Xeons, which can run with an FSB speed of a whopping 1600MHz. In a desktop system, the FSB is almost never fully saturated, but server and workstation workloads that shovel huge amounts of data between main memory and the CPU should benefit.

Power consumption
As it is, the 6015B-TV fitted with a pair of X5460s is quick and, for a rack server, very frugal. Populated with just the one Raptor hard disk for the purposes of testing, it consumed only 185W at idle. This rose dramatically to 300W when the system was working hard, but bear in mind that’s for eight cores of processing power, 24MB of cache, and not far off 2 billion transistors. Real-world total power consumption is likely to be higher, since the chassis supports four hot-swappable hard drives, but total power consumption should still be below 400W. It’s a hell of a lot of computing performance for no more power than a desktop PC with a high-end graphics setup will consume.
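The headline totals for the dual-Xeon box follow from doubling the per-processor figures. The sketch below assumes each X5460 carries the 820 million transistors quoted earlier for a Penryn quad-core part; that is an assumption based on the shared design, not a figure stated for the X5460 itself, and the names are illustrative.

```python
SOCKETS = 2
CORES_PER_CPU = 4
L2_MB_PER_CPU = 12
TRANSISTORS_PER_CPU = 820_000_000  # assumed: Penryn quad-core figure from the article
LOAD_WATTS = 300                   # measured system draw when working hard

total_cores = SOCKETS * CORES_PER_CPU
total_l2_mb = SOCKETS * L2_MB_PER_CPU
total_transistors = SOCKETS * TRANSISTORS_PER_CPU
load_watts_per_core = LOAD_WATTS / total_cores

print(f"{total_cores} cores, {total_l2_mb}MB of L2")
print(f"{total_transistors / 1e9:.2f} billion transistors")
print(f"{load_watts_per_core:.1f}W of system power per core under load")
```

That works out to eight cores, 24MB of cache and about 1.64 billion transistors, with the whole system drawing under 40W per core at full load.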

One of the criticisms levelled by AMD at the fully buffered FB-DIMM memory required for the 5000P chipset is that its power consumption is unacceptably high, but any hit that the overall power consumption levels take from FB-DIMMs is more than made up for by the performance of the Xeons. In other words, performance per watt is massive even if the memory subsystem isn’t the ideal solution when it comes to absolute levels of energy consumption.

Performance
Our standard benchmark suite isn’t suitable for testing Xeon performance: the basic server graphics chipset in our test chassis means that benchmarks which intensively update the frame buffer will artificially suffer and skew the results.

Instead, we ran a subset of the tests, and for comparison – since quad-core Opteron processors are AWOL – looked at the scores relative to Phenom since it’s based on the same Barcelona architecture.

In CPU-intensive workloads such as rendering, the performance is on a similar level to that of the QX9650. But, of course, there are eight cores in total. In threaded applications, that can mean a near-doubling of computing performance. If, when they arrive, quad-core Opterons have performance on a par with Phenom, AMD is going to have a lot of catching up to do.

Xeon Benchmarks

AMD vs Intel: the verdict
We’d like to say that Intel’s dominance is coming under threat with the release of AMD’s new generation of chips, but that clearly isn’t the case right now. AMD’s newest parts look like old technology in the face of the competition. With 32nm production in Intel’s SRAM memory chips already well up to speed, and bearing in mind its manufacturing capability – it has 15 fabrication plants worldwide to AMD’s two – Intel’s astonishing turnaround after the debacle of Pentium 4 shows no signs of slowing.

That’s not to say AMD is dead in the water. Back in 2004, the whole industry was pouring scorn on Intel and no-one could see how the performance lead opened up by Athlon 64 could be beaten. Intel’s belated but dramatic response was to abandon the power-hungry dinosaur of Pentium 4 and its NetBurst architecture, and go all-out to design something new and better. AMD’s problem is more serious – all sectors of its business were loss-making in the second half of 2007, raising the spectre of cuts in R&D when exactly the opposite is needed if the company is to blossom once more.

On the server front, it’s looking even worse. Our industry sources – even those traditionally very loyal to AMD – are furious over the broken promises regarding Barcelona. Vendors that have remained loyal to AMD and geared up to supply Barcelona servers have been left with no processors to put in them. When quad-core Opteron does finally make it to market in volume, it may be to a market that’s finally bitten the bullet and moved over to Intel.

The good news is that, after the debacle and public shaming of AMD over Barcelona, the company clearly doesn’t want anything similar to happen again. One of the key things it needs is more manufacturing clout, and with AMD teaming up with other players in the semiconductor industry to develop 32nm parts, the transition to 32nm could get a boost. Like Intel, AMD’s future plans revolve to a large extent around multicored heterogeneous processors – in other words, processors with multiple cores specifically tailored for different tasks. This will be completely new ground and an opportunity for good basic engineering to trump the ultra-hi-tech fabrication that, for the moment, AMD simply can’t match.

For now, though, it’s pretty much a whitewash. Intel’s dominance is total in every area: value for money, performance per watt and all-out speed.

This article appeared in the March, 2008 issue of PC Authority.