Inside The G92 Graphics Core
65-nm process inside
G92 is built on TSMC’s smaller 65-nm manufacturing process. Previous NVIDIA GPUs, including the G84 chip used in the 8600 GT/GTS, utilized TSMC’s 80-nm or 90-nm process. The smaller process allows NVIDIA to cram more transistors into G92 (approximately 754 million in G92 versus 681 million in G80) without severely increasing the size of the GPU’s die. This makes the GPU cheaper for NVIDIA to produce, which is important given that the 8800 GT will sell for an MSRP starting at $199.
Another benefit of the smaller process is reduced power consumption. With lower power consumption, the chip generates less heat. This allows NVIDIA to cool the chip with just a single-slot heatsink/fan unit, versus the dual-slot cooling NVIDIA used previously for the GeForce 8800 GTS line.
112 stream processors running at 1.5GHz
If you recall the architecture of the G80 GPU used in the GeForce 8800 GTX, NVIDIA arranged the stream processors into groups of sixteen. Each group of stream processors had its own dedicated texture address and filtering units as well as L1 cache. For simplicity we’ll refer to each of these groups of sixteen as a “bank” of stream processors. With 16 stream processors per bank and 8 banks total, that adds up to 128 stream processors inside the GeForce 8800 GTX. In the GeForce 8800 GTS, two banks were deactivated, for a grand total of 96 stream processors. (As a side note, NVIDIA also disabled one ROP partition in GeForce 8800 GTS.) The following block diagram illustrates this nicely:
For GeForce 8800 GT, NVIDIA employs a similar layout. Instead of deactivating two banks of stream processors and leaving six active, as on the GeForce 8800 GTS, only one bank is deactivated, leaving a total of 112 functional stream processors:
One additional tweak that you can’t see very well in the above block diagram for G92 is the addition of four texture address units to each bank of stream processors, bringing the number of texture address units per bank to eight. G80 was limited to just four per bank. With the additional units, each G92 bank can address 8 textures and perform 8 texture filtering ops/clock; on G80 it was 4 and 8 respectively. As a result, the GeForce 8800 GT, with its seven active banks, can address 56 textures and perform 56 texture filtering ops/clock (versus 32 and 64 respectively in the 8800 GTX and 24 and 48 in the 8800 GTS).
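The totals quoted above follow directly from multiplying the per-bank counts by the number of active banks. Here is a minimal Python sketch of that arithmetic. The dictionary and function names are ours, purely for illustration, and the per-bank figures are simply the ones given in this article.

```python
# "Bank" is this article's shorthand for a group of 16 stream processors.
SP_PER_BANK = 16

# (texture address units per bank, texture filtering units per bank),
# as described in the text above.
TEX_UNITS_PER_BANK = {"G80": (4, 8), "G92": (8, 8)}

# (chip, number of active banks) for each card discussed in the article.
CARDS = {
    "GeForce 8800 GTX": ("G80", 8),
    "GeForce 8800 GTS": ("G80", 6),
    "GeForce 8800 GT": ("G92", 7),
}


def unit_totals(card):
    """Return (stream processors, textures addressed/clock, filter ops/clock)."""
    chip, banks = CARDS[card]
    addr_per_bank, filt_per_bank = TEX_UNITS_PER_BANK[chip]
    return banks * SP_PER_BANK, banks * addr_per_bank, banks * filt_per_bank


if __name__ == "__main__":
    for card in CARDS:
        sps, addr, filt = unit_totals(card)
        print(f"{card}: {sps} SPs, {addr} addressed/clock, {filt} filtered/clock")
```

For the GeForce 8800 GT this works out to 7 x 16 = 112 stream processors and 7 x 8 = 56 for both texture addressing and filtering, matching the figures above.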
This change has a profound effect on the GeForce 8800 GT’s texture fill rate; the 8800 GT can filter a peak of 33.6GTexels/sec, while the GeForce 8800 GTS can filter just 24GTexels/sec peak.
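Peak texture fill rate is just the number of texture filtering ops per clock multiplied by the core clock. The short Python sketch below reproduces the two figures quoted above. Note that the core clocks (600MHz for the 8800 GT, 500MHz for the 8800 GTS) are NVIDIA's reference clocks and are an assumption on our part, since they aren't listed in this section; the function name is ours as well.

```python
def peak_fill_rate_gtexels(filter_ops_per_clock, core_clock_mhz):
    """Peak texture fill rate in GTexels/sec (filter ops/clock x core clock)."""
    return filter_ops_per_clock * core_clock_mhz / 1000.0


# Filter ops/clock come from the article; core clocks are assumed
# reference clocks, not figures stated in this section.
gt_fill = peak_fill_rate_gtexels(56, 600)   # GeForce 8800 GT
gts_fill = peak_fill_rate_gtexels(48, 500)  # GeForce 8800 GTS

if __name__ == "__main__":
    print(f"8800 GT:  {gt_fill:.1f} GTexels/sec")
    print(f"8800 GTS: {gts_fill:.1f} GTexels/sec")
```

These are theoretical peaks only; real-world texturing throughput depends on the filtering mode and the workload.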
Comparing the block diagram of the GeForce 8800 GT with that of the 8800 GTX, you can also see that two ROP partitions are no longer present, leaving just four total. To help offset this, NVIDIA has come up with more efficient color and z-compression for G92. This enhanced compression should help at high resolutions, particularly once AA is applied, as available memory is used more efficiently.
The GeForce 8800 GT is also the world’s first PCI Express 2.0 graphics card. PCIe 2.0 offers double the bandwidth of PCIe 1.1: 8.0GB/sec in each direction over an x16 link, for a total of 16GB/sec of bus bandwidth. Of course, we still haven’t come close to saturating the bandwidth offered by PCIe 1.1, but presumably this could come in handy for lower-end value cards with less onboard memory.
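For the curious, the PCIe figures can be derived from the per-lane signaling rate. The Python sketch below assumes an x16 slot, the published signaling rates of 2.5GT/sec per lane for PCIe 1.1 and 5GT/sec for PCIe 2.0, and the 8b/10b line encoding both generations use. None of those details are spelled out in this article, and the function name is ours.

```python
def pcie_gb_per_sec(transfer_rate_gt_s, lanes=16):
    """Usable bandwidth per direction, in GB/sec, for an 8b/10b-encoded link."""
    # 8b/10b encoding carries 8 payload bits in every 10 bits on the wire.
    payload_bits_per_lane = transfer_rate_gt_s * 1e9 * 8 / 10
    # Sum over all lanes, then convert bits/sec to GB/sec.
    return payload_bits_per_lane * lanes / 8 / 1e9


pcie11_each_way = pcie_gb_per_sec(2.5)  # PCIe 1.1 x16
pcie20_each_way = pcie_gb_per_sec(5.0)  # PCIe 2.0 x16

if __name__ == "__main__":
    print(f"PCIe 1.1 x16: {pcie11_each_way:.1f} GB/sec each way")
    print(f"PCIe 2.0 x16: {pcie20_each_way:.1f} GB/sec each way, "
          f"{2 * pcie20_each_way:.1f} GB/sec combined")
```

The 16GB/sec figure is the sum of both directions, so a one-way transfer tops out at half that.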
256-bit GDDR3 memory interface
Yep, you read that right: the GeForce 8800 GT reverts to a 256-bit memory interface, narrower than the 320-bit interface on the GeForce 8800 GTS. To help offset this, NVIDIA has cranked up the memory speed to 900MHz, the same speed as the 8800 GTX, but the 8800 GT still gives up roughly 6.4GB/sec of memory bandwidth to the GeForce 8800 GTS (57.6GB/sec versus 64GB/sec). In theory, this could come back to haunt the GeForce 8800 GT at high resolutions with AA, but we’ll just have to wait and see how it all plays out in the benchmarks later in this article…
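If you want to check the bandwidth math yourself, peak memory bandwidth is the bus width in bytes times the memory clock times two (GDDR3 transfers data twice per clock). The Python sketch below works through it. The 8800 GTS figures used here (320-bit bus, 800MHz memory) are its reference specifications and are our assumption, as this section only quotes the resulting 64GB/sec; the function name is also ours.

```python
def memory_bandwidth_gb_s(bus_width_bits, memory_clock_mhz):
    """Peak memory bandwidth in GB/sec for double-data-rate memory."""
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_sec = memory_clock_mhz * 1e6 * 2  # DDR: two transfers/clock
    return bytes_per_transfer * transfers_per_sec / 1e9


gt_bw = memory_bandwidth_gb_s(256, 900)   # GeForce 8800 GT
gts_bw = memory_bandwidth_gb_s(320, 800)  # GeForce 8800 GTS (assumed ref. specs)

if __name__ == "__main__":
    print(f"8800 GT:  {gt_bw:.1f} GB/sec")
    print(f"8800 GTS: {gts_bw:.1f} GB/sec")
    print(f"Deficit:  {gts_bw - gt_bw:.1f} GB/sec")
```

So even with the faster memory clock, the narrower bus leaves the 8800 GT about 10 percent behind the 8800 GTS in raw bandwidth.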
Right now you’re probably wondering why the GeForce 8800 GT’s G92 GPU contains more transistors if it has fewer stream processors, aren’t you? We’ve been told that the bulk of the new transistors come from the display portion of the chip. If you recall, G80 used an external display chip for input and output. This chip has now been integrated directly into the G92 GPU and continues to support two dual-link displays with HDCP. NVIDIA has also taken the VP2 video processor used in the GeForce 8500/8600 and integrated it into G92.
Additional transistors were also required for PCIe 2.0, integrated HDMI support, and the additional texture address units we mentioned earlier. It’s important to note that while GeForce 8800 GT offers native support for HDMI, it’s up to NVIDIA’s board partners to actually utilize this feature. We wouldn’t be surprised to see some of NVIDIA’s board partners provide a line of HDMI Edition 8800 GT cards separate from the standard gaming boards. In the past, ASUS, Gigabyte, MSI, and XFX have really been pushing their lines of silent cards for the HTPC crowd.