Two terms that are thrown around a lot in the world of PC components are 'evolutionary' and 'revolutionary'. The hoo-ha and hyperbole surrounding the recent release of Intel's new 915 and 925 chipsets was no great exception, with Intel vainly stressing the huge impact its new products would have on the burgeoning digital home market. The two technologies at the centre of this future vision were PCI-Express and DDR2, although it's interesting to note that while PCI-Express comfortably falls into the revolutionary category, DDR2 is more on the evolutionary side of the fence.
Double Data Rate Synchronous Dynamic Random Access Memory (DDR SDRAM) has been increasing in speed steadily since its introduction to PCs with DDR-266. However, it's starting to reach its limits as a technology, owing to signalling, power consumption and associated heat dissipation issues. This gives manufacturers a harder time getting high efficiency and yields, and also makes it hard to increase frequency down the track.
This is where DDR2 steps in. The Joint Electron Device Engineering Council (JEDEC) - which is populated by most of the semiconductor companies that produce memory - began working on DDR2 quite a while ago with a view to fixing some of the fundamental issues that are preventing DDR from increasing in speed into the future. As such, DDR2 is not so much a brand new technology as an evolution of conventional DDR, which is itself built on single data rate SDRAM technology.
Chip off the old block
Right off the bat, it's worth stressing that DDR2 isn't twice as fast as DDR at the same clock speed. In fact, it has exactly the same theoretical bandwidth, and even worse latency - making it marginally slower in practice. So if that's the case, what's the big deal? In order to understand what DDR2 is, and what it represents for the future of PCs, it'll help to delve inside DDR to see how it works.
Before DDR, we had conventional single data rate SDRAM. A stick of SDRAM sports a number of memory chips along with an I/O chip that reads from them and sends that data on to the front side bus. With PC133 SDRAM, each memory chip ran at 133MHz, and each clock cycle (so every 7.5ns) it could deliver a bit of data to the I/O chip - this is called 1n-prefetch, as 1 bit was fetched at a time. The I/O chip also ran at 133MHz, so each cycle it could take that bit and send it on to the front side bus. As it was single data rate, it sent the bit on the rising edge of the signal, but did nothing on the falling edge. This all happened with 64 bits being accessed at a time, so PC133 SDRAM had a total bandwidth of 1064MB/s.
With DDR, it was decided to make the most of that unused falling edge of the signal on the front side bus. With DDR-266, each memory chip, which still ran at 133MHz, sent two bits of data to the I/O chip each clock cycle - this is called 2n-prefetch. The I/O chip, also at 133MHz, then spent a single clock cycle sending one of those bits to the front side bus on the rising edge of the signal, and the other on the falling edge. This effectively doubled the output of the memory, and gave DDR-266 a bandwidth of 2128MB/s, even though internally it ran at the same speed as PC133 SDRAM.
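The arithmetic behind those two bandwidth figures can be checked with a short Python sketch. The function name and parameters are made up for illustration, and it assumes the 64-bit bus described above, with 1MB counted as one million bytes.

```python
def peak_bandwidth_mb_s(io_clock_mhz, transfers_per_clock, bus_width_bits=64):
    """Theoretical peak bandwidth in MB/s (1MB = 10**6 bytes).

    io_clock_mhz        -- clock of the I/O chip in MHz
    transfers_per_clock -- 1 for single data rate, 2 for double data rate
    bus_width_bits      -- bits moved per transfer (64 in the article)
    """
    bytes_per_transfer = bus_width_bits // 8
    return io_clock_mhz * transfers_per_clock * bytes_per_transfer


# PC133: one transfer per clock (rising edge only)
pc133 = peak_bandwidth_mb_s(133, 1)

# DDR-266: same 133MHz clock, but both edges are used
ddr266 = peak_bandwidth_mb_s(133, 2)

print("PC133  :", pc133, "MB/s")   # 1064 MB/s
print("DDR-266:", ddr266, "MB/s")  # 2128 MB/s
```

The only thing that changes between the two calls is the number of transfers per clock, which is the whole trick behind DDR.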
The problem is that, as speeds increase and each clock cycle gets shorter, it becomes significantly harder to maintain a clean signal. High speed memory chips also require more power to run, and generate more heat. It's really these issues that DDR2 is targeting, with the aim of enabling higher frequencies in the future.
The most significant feature of DDR2 is that, unlike DDR, the memory chips run at half the speed of the I/O chip. So, for DDR2-400, the I/O chip runs at 200MHz, but the memory chips run at only 100MHz. In order to deliver the same amount of data at the lower speed, the memory chips use 4n-prefetch, so every clock cycle sees 4 bits sent to the I/O chip. As the I/O chip runs at twice the speed of the memory, it has two clock cycles to send these 4 bits on to the front side bus using the same double data rate techniques as DDR. As such, DDR2-400 has exactly the same bandwidth as conventional DDR-400: 3.2GB/s.
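To see why the two end up level, here is a rough Python sketch comparing DDR-400 and DDR2-400. The function and field names are invented for this example, the 200MHz memory clock for DDR-400 is inferred from the DDR description earlier (chips and I/O at the same speed), and the 64-bit bus is carried over from the SDRAM example.

```python
def describe(name, core_clock_mhz, prefetch_bits, bus_width_bits=64):
    """Derive the external figures from the memory chip clock and prefetch depth."""
    # Each memory clock cycle hands prefetch_bits to the I/O chip per data pin,
    # so this is the number of transfers per second, in millions.
    transfers_mt_s = core_clock_mhz * prefetch_bits
    # DDR and DDR2 both send two bits per I/O clock (rising and falling edge).
    io_clock_mhz = transfers_mt_s // 2
    bandwidth_mb_s = transfers_mt_s * bus_width_bits // 8
    return {
        "name": name,
        "core_clock_mhz": core_clock_mhz,
        "io_clock_mhz": io_clock_mhz,
        "bandwidth_mb_s": bandwidth_mb_s,
    }


ddr_400 = describe("DDR-400", core_clock_mhz=200, prefetch_bits=2)
ddr2_400 = describe("DDR2-400", core_clock_mhz=100, prefetch_bits=4)

for module in (ddr_400, ddr2_400):
    print(module)
```

Both modules come out with a 200MHz I/O clock and 3200MB/s of bandwidth. The difference is in the memory chip clock, which is 200MHz for DDR-400 but only 100MHz for DDR2-400.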
This is significant because it means the memory chips in DDR2 are clocked lower than those in conventional DDR. As such, DDR2 has more headroom to be ramped up in frequency over time than DDR, which is beginning to reach its peak. The other important factor is that it runs at only 1.8V, down from the 2.5V of DDR. Lower power consumption means less heat to dissipate, which also helps when it comes to pumping up the speed.
DDR2 also sports another couple of features that improve signalling. The first is off-chip driver calibration, which monitors the signal, and adjusts it to optimum levels. The second is on-die termination, which helps reduce interference and noise. Normal memory is terminated on the motherboard, which means there's a higher incidence of signals bouncing around the memory stick and interfering with legitimate signals. By placing the terminating resistor on each individual memory chip, it means the reflected signals don't seep into the memory stick as a whole. If you know your way around a Dick Smith electronic kit, though, you'll see a potential drawback to this approach: resistors generate heat, and that heat is now located next to the memory chip itself, which makes things like heat spreaders even more important to avoid overheating.
The last major feature of DDR2 is additive latency, which improves the way commands are handled internally in the memory, thus improving sustained data transfer and bandwidth. Essentially, additive latency prevents commands destined for different banks from conflicting with each other. This phenomenon can cause 'bubbles', or breaks in data flow, in conventional RAM.
This happens because a RAM chip can only handle one external command, such as ACTIVATE or READ, at a time. If the RAM is reading from multiple banks at the same time via bank interleave (a common feature on modern RAM), latencies can mean that one bank is due to receive a READ command at the same time as another is due to receive an ACTIVATE command. Because the READ command for the first bank corresponds to an earlier ACTIVATE command, it takes precedence, and the second bank's ACTIVATE command gets pushed back one cycle. This results in the banks being out of sync further down the track when they actually send their data to the I/O chip.
Additive latency fixes this by bringing some external commands, like READ, inside the chip, and making sure there are no two external commands being executed at the same time. So, long story short, it improves sustained memory throughput, and therefore bandwidth.
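The bubble effect can be illustrated with a toy Python model. It is a deliberately simplified sketch, not a description of how a real memory controller schedules commands: the function name, the ACTIVATE spacing of two cycles and the four-cycle ACTIVATE-to-READ delay (t_rcd) are all hypothetical values picked to make the collision visible. The one rule taken from the description above is that only one external command fits in each cycle, with an already-scheduled READ taking precedence over a new ACTIVATE.

```python
def schedule_reads(num_banks, act_interval, t_rcd, additive_latency=0):
    """Return the cycle at which each bank's READ takes effect inside the chip.

    num_banks        -- number of interleaved banks to access
    act_interval     -- cycles between the ACTIVATE commands we would like to send
    t_rcd            -- cycles a READ must trail its bank's ACTIVATE (toy value)
    additive_latency -- cycles the chip holds a READ internally before acting on it
    """
    busy = set()       # cycles in which the external command bus is already taken
    effective = []
    for bank in range(num_banks):
        # Place this bank's ACTIVATE, sliding later if the slot is occupied.
        act = bank * act_interval
        while act in busy:
            act += 1
        busy.add(act)
        # With additive latency the READ is sent early and held inside the chip.
        issue = act + t_rcd - additive_latency
        while issue in busy:
            issue += 1
        busy.add(issue)
        effective.append(issue + additive_latency)
    return effective


without_al = schedule_reads(num_banks=4, act_interval=2, t_rcd=4)
with_al = schedule_reads(num_banks=4, act_interval=2, t_rcd=4, additive_latency=3)

print("READs take effect without additive latency:", without_al)  # [4, 6, 9, 11]
print("READs take effect with additive latency   :", with_al)     # [4, 6, 8, 10]
```

Without additive latency, the third bank's ACTIVATE lands on the same cycle as the first bank's READ and gets pushed back, so the READs stop arriving at an even two-cycle spacing. With the READ sent immediately after each ACTIVATE and held internally, the external commands never collide and the spacing stays even.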