AGP4X Fast Writes
Fast Writes?
Nvidia's press release states that their new GPU has support for "AGP 4X with Fast Writes - a unique feature in GeForce 256." Bitboys announced almost a month ago that AGP fast writes would be "fully supported" by the Glaze3D, but Nvidia is the first to implement the technology.
From the technical brief, "Fast Writes improves all writes from the CPU to the graphics chip including:
- All 2D operations
- Operations involving writing to the frame buffer or sending any data to the graphics chip.
- Loading textures in Direct3D into local memory.
- Writing push buffers to graphics local memory; this is where most of the performance boost is generated."
Fast writes allow the CPU to send data directly to the graphics system without having to go through system memory. Current systems don't have a need for AGP fast writes because there's enough memory bandwidth to support 2 or 3 million sustained triangles per second, but future systems will need to support upwards of 10 to 15 million triangles per second.
In Nvidia's technical brief, they compare the memory bandwidth needs for 2 million triangles per second on "today's system," an Intel BX chipset with 100MHz system memory, with 10 million triangles per second on a "next-generation system," an Intel 820 chipset with DRDRAM and a 133MHz FSB.
![Standard bandwidth](images/fastwrites0-s.jpg) Standard bandwidth

![AGP4X w/o Fast Writes](images/fastwrites1-s.jpg) AGP4X w/o Fast Writes
System Limitations
As you can see, the 2 million tri/s rate needs only 180MB/s of bandwidth between the CPU and the chipset, and between the chipset and the graphics bus, assuming each triangle takes only 90 bytes. It needs double that bandwidth between system memory and the chipset because the data goes both ways: into system memory and back out again. Even then, there's more than enough bandwidth to handle the data transfer rates.
Raise the rate to 10 million tri/s, and you start seeing internal bandwidth limitations on the next-generation system. The 1.06GB/s links from the CPU to the chipset and from the chipset to the graphics bus can still handle the hefty 900MB/s data transfer rate. The 1.6GB/s system memory, however, cannot handle 900MB/s going in and another 900MB/s coming out, 1.8GB/s in total, without bogging down the entire triangle pipeline.
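The arithmetic behind those figures is easy to check. The sketch below is only an illustration: it uses the 90 bytes per triangle and the bandwidth numbers quoted above, treats 1MB as 10^6 bytes (an assumption), and the function and constant names are made up for this example.

```python
# Figures quoted in the article; 1 MB is taken as 10**6 bytes (an assumption).
BYTES_PER_TRIANGLE = 90
SYSTEM_MEMORY_MB_S = 1600  # 1.6GB/s system memory on the next-generation system
BUS_MB_S = 1060            # 1.06GB/s CPU-to-chipset and chipset-to-graphics links


def one_way_mb_s(triangles_per_second, bytes_per_triangle=BYTES_PER_TRIANGLE):
    """Traffic in MB/s on a link that carries the triangle data once."""
    return triangles_per_second * bytes_per_triangle / 1e6


def memory_mb_s(triangles_per_second):
    """Traffic in MB/s at system memory, where the data goes in and comes back out."""
    return 2 * one_way_mb_s(triangles_per_second)


for rate in (2_000_000, 10_000_000):
    print(f"{rate:,} tri/s: {one_way_mb_s(rate):.0f}MB/s per link, "
          f"{memory_mb_s(rate):.0f}MB/s at system memory")
```

At 10 million tri/s the per-link figure stays under `BUS_MB_S`, while the system memory figure exceeds `SYSTEM_MEMORY_MB_S`, which is the bottleneck described above.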
![AGP4X with Fast Writes](images/fastwrites2-s.jpg) AGP4X with Fast Writes
As mentioned earlier, fast writes allow the CPU to send data directly to the graphics system without going through system memory. Taking the system memory out of the pipeline removes the bottleneck, and frees system memory to perform other functions.
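To see why removing system memory from the path matters, here is a rough model of the triangle traffic on each link of the next-generation system, with and without fast writes. It is a simplified sketch, not Nvidia's own analysis: the link names and helper functions are hypothetical, the capacities are the figures quoted in this article, and all other memory traffic is ignored.

```python
BYTES_PER_TRIANGLE = 90  # per-triangle size assumed in the article

# Link capacities in MB/s for the "next-generation system", as quoted in the article.
CAPACITY = {
    "cpu_to_chipset": 1060,
    "system_memory": 1600,
    "chipset_to_graphics": 1060,
}


def link_loads(triangles_per_second, fast_writes):
    """Return the triangle traffic in MB/s placed on each link (hypothetical model)."""
    one_way = triangles_per_second * BYTES_PER_TRIANGLE / 1e6
    return {
        "cpu_to_chipset": one_way,
        # Without fast writes the data is written to memory and read back out.
        "system_memory": 0.0 if fast_writes else 2 * one_way,
        "chipset_to_graphics": one_way,
    }


def overloaded_links(loads, capacity=CAPACITY):
    """Names of links whose load exceeds their capacity."""
    return [name for name, load in loads.items() if load > capacity[name]]


for fast_writes in (False, True):
    loads = link_loads(10_000_000, fast_writes)
    state = "on" if fast_writes else "off"
    print(f"fast writes {state}: overloaded = {overloaded_links(loads)}")
```

In this model, system memory is the only overloaded link at 10 million tri/s, and it drops out of the list once fast writes are enabled.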