In what looks like a first wave of chip details, Nvidia has been praising Fermi's GPU-compute capabilities.
Fermi has a number of computing features never before seen in a GPU, which according to Tech Report should enable new applications for GPU computing and open up new horizons for Nvidia's GeForce and Tesla products.
Fermi has 16 streaming multiprocessors and 512 CUDA cores, more than double the 240 CUDA cores of the GT200.
Nvidia's next-generation GPU uses six 64-bit DRAM interfaces, which means that Fermi has a total path to memory that is 384 bits wide. This is narrower than the GT200's 512-bit path, but Fermi more than makes up for that by delivering nearly twice the bandwidth per pin via support for GDDR5 memory.
There is not yet enough information available to tell if Fermi will give ATI's Cypress a kicking. So far Nvidia has not revealed the details of its graphics resources or what clock speed the GPU will achieve.
Tech Report seems to think that the shader clock speed will be about 1500MHz, a reasonable frequency target for Fermi's stream processing core and about the same as that of the GeForce GTX 285. If we assume Fermi reaches that speed, its peak throughput for single-precision math would be 1536 GFLOPS, or a little over half of the peak single-precision floating-point speed of the ATI Radeon HD 5870.
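For readers who want to check the sums, the 1536 GFLOPS figure can be reproduced with a rough sketch like the one below. It assumes each core retires one fused multiply-add per clock, counted as two floating-point operations; that factor is an assumption and is not stated above. The Radeon HD 5870 figures used for comparison (1600 stream processors at 850MHz) are likewise brought in from outside this article, and the function name is purely illustrative.

```python
def peak_sp_gflops(cores, shader_clock_mhz, flops_per_core_per_clock=2):
    """Theoretical peak single-precision throughput in GFLOPS.

    flops_per_core_per_clock=2 assumes one multiply-add per core per clock
    (an assumption, not a figure confirmed in the article).
    """
    return cores * flops_per_core_per_clock * shader_clock_mhz / 1000.0


# Fermi: 512 cores at the speculated 1500MHz shader clock.
fermi_gflops = peak_sp_gflops(512, 1500)

# Radeon HD 5870: 1600 stream processors at 850MHz (assumed, not from the article).
cypress_gflops = peak_sp_gflops(1600, 850)

print(f"Fermi:   {fermi_gflops:.0f} GFLOPS")
print(f"Cypress: {cypress_gflops:.0f} GFLOPS")
print(f"Fermi / Cypress: {fermi_gflops / cypress_gflops:.2f}")
```

These are theoretical peaks only; they say nothing about how either chip behaves in real workloads, and the Fermi number moves in direct proportion to whatever clock speed Nvidia eventually ships.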
Tech Report also reckons that if Nvidia uses the same 4.8Gbps data rate for GDDR5 memory that AMD uses for Cypress, Fermi's peak memory bandwidth should be about 230GB/s, roughly 50 per cent higher than that of the Radeon HD 5870, which has a memory bus width of 256 bits.
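The bandwidth estimate follows from the bus width and the per-pin data rate quoted above. A minimal sketch of the arithmetic, assuming both chips run their GDDR5 at 4.8Gbps per pin as the estimate supposes (the function and variable names are illustrative only):

```python
def peak_bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    """Peak memory bandwidth in GB/s: pins times per-pin rate, over 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8


# Fermi: six 64-bit DRAM interfaces give a 384-bit path to memory.
fermi_bus_bits = 6 * 64
fermi_bw = peak_bandwidth_gb_s(fermi_bus_bits, 4.8)

# Radeon HD 5870: 256-bit bus at the same assumed data rate.
cypress_bw = peak_bandwidth_gb_s(256, 4.8)

print(f"Fermi bus width: {fermi_bus_bits} bits")
print(f"Fermi:   {fermi_bw:.1f} GB/s")
print(f"Cypress: {cypress_bw:.1f} GB/s")
print(f"Fermi advantage: {(fermi_bw / cypress_bw - 1) * 100:.0f} per cent")
```

Because the data rate is the same on both sides, the 50 per cent advantage comes entirely from the wider bus (384 bits against 256); a different shipping memory speed on Fermi would shift the result accordingly.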