Summary: With 240 stream processors, a 512-bit memory interface, and 1.4 billion transistors, NVIDIA's GeForce GTX 280 is built for the enthusiast who craves maximum performance. Just how fast is it? Find out inside!
The high-end graphics market, once known for its brutal competition, became brutally boring. Despite its jaw-dropping specs, AMD’s R600 GPU never lived up to its potential; instead, it competed one notch below the GeForce 8800 GTX and Ultra in the sub-$400 performance graphics segment. That left NVIDIA alone at the top of the enthusiast space, and the company did little with it, instead shoring up its position in the mainstream market with the launch of the GeForce 8800 GTS 512MB, 8800 GT, and 9600 GT. These GPUs were all terrific upgrades for the price segments they competed in, but enthusiasts looking for a GPU that would truly displace the GeForce 8800 GTX or Ultra were left out in the cold.
Then came the GeForce 9800s.
As their name implied, these GPUs were supposed to be the answer that so many enthusiasts wanted, but they each had their share of flaws.
The GeForce 9800 GTX was a complete miss. In 3D gaming performance, it was overall slightly slower than the 8800 GTX, let alone the 8800 Ultra (the lone exception was Company of Heroes). Fortunately, the 9800 GTX delivered lower power consumption than the 8800 GTX/Ultra, and its VP2 video processor delivered better video playback; but this card wasn’t an upgrade if you already owned an 8800 GTX or Ultra.
The 9800 GX2 was a little better – it actually outperformed a single GeForce 8800 Ultra by roughly 1.5× – but it needed two GPUs to accomplish this. Two GeForce 8800 GTS 512MB cards running in SLI can actually outrun the 9800 GX2, and cost less to boot. The GeForce 9800 GX2 is a fine card, especially if your motherboard only has one PCI Express graphics slot, but it just wasn’t the head turner that the GeForce 8800 GTX was when it launched.
This brings us to today’s launch of the GeForce GTX 200 family. The GTX 200 is NVIDIA’s second-generation unified shading architecture, building on the foundation first established with the GeForce 8800 GTX in November 2006. NVIDIA has incorporated a number of tweaks designed to improve performance over the GeForce 8800 GTX. Here it’s important to stress again that the bar NVIDIA was shooting for was the GeForce 8800 GTX, not the 9800 GX2 – work on this GPU began years before the 9800 GX2 came to market. In fact, NVIDIA’s design goal was to deliver up to twice the performance of the 8800 GTX.
But did they deliver? That’s what we’re here to find out!
Second Generation NVIDIA Unified Architecture:
Second-generation architecture delivers 50% more gaming performance over the first generation through 240 enhanced processing cores that provide incredible shading horsepower.
GeForce GPU support for NVIDIA PhysX technology enables a totally new class of physical gaming interaction for a more dynamic and realistic experience with GeForce.
NVIDIA SLI and 3-way SLI Technology:
Industry-leading, 3-way SLI technology offers amazing performance scaling by implementing 3-way alternate frame rendering (AFR) for the world’s fastest gaming solution under Windows Vista.
Microsoft DirectX 10 Support:
DirectX 10 with full Shader Model 4.0 support delivers unparalleled levels of graphics realism and film-quality effects for today’s hottest games.
NVIDIA CUDA Technology:
CUDA technology unlocks the power of the GPU’s processing cores to accelerate the most demanding system tasks—such as video transcoding—delivering up to 18× the performance of traditional CPUs.
PCI Express 2.0 Support:
Designed for the PCI Express 2.0 bus architecture offering the highest data transfer speeds for the most bandwidth-hungry games and 3D applications, while maintaining backwards compatibility with existing PCI Express motherboards for the broadest support.
Massively multi-threaded architecture supports thousands of independent, simultaneous threads, providing extreme processing efficiency in advanced, next-generation shader programs.
NVIDIA Lumenex Engine:
Delivers stunning image quality and floating-point accuracy at ultra-fast frame rates.
16× Antialiasing Technology:
Lightning fast, high-quality antialiasing at up to 16× sample rates obliterates jagged edges.
128-bit Floating Point High Dynamic-Range (HDR) Lighting:
Twice the precision of prior generations for incredibly realistic lighting effects—now with support for anti-aliasing.
OpenGL 2.1 Optimization and Support:
Provides top-notch compatibility and performance for OpenGL applications.
Dual Dual-link DVI Support:
Able to drive the industry’s largest and highest-resolution flat-panel displays at resolutions up to 2560 x 1600, with support for High-bandwidth Digital Content Protection (HDCP).
NVIDIA PureVideo HD Technology:
The combination of high-definition video decode acceleration and post-processing that delivers unprecedented picture clarity, smooth video, accurate color, and precise image scaling for movies and video.
Discrete, Programmable Video Processor:
NVIDIA PureVideo is a discrete programmable processing core in NVIDIA GPUs that provides superb picture quality and ultra-smooth movies with 100% offload of H.264 video decoding from the CPU and significantly reduced power consumption.
Dual-Stream Hardware Acceleration:
Supports picture-in-picture content for the ultimate interactive Blu-ray movie experience.
Dynamic Contrast Enhancement & Color Stretch:
Dynamically provides post-processing and optimization of high definition movies for spectacular picture clarity.
NVIDIA HybridPower Technology:
Lets you switch from the GeForce GTX 280/260 graphics card to the motherboard GeForce GPU when running non graphically-intensive applications for a quiet, low-power, PC experience.
NVIDIA has incorporated a number of improvements into the architecture of their GeForce GTX 200 GPUs. The most obvious addition most gamers will notice is the jump in stream processors, up from 128 in the GeForce 8800 GTX to 240 in NVIDIA’s flagship GeForce GTX 280. But NVIDIA has integrated a number of other architectural enhancements that aren’t as easy to quantify, such as improved geometry shading and stream out performance, larger register file sizes, and improved texturing performance. GeForce GTX 200 GPUs also support over twice the number of threads in flight compared to the GeForce 8800, and boast more efficient instruction scheduling and instruction issue. The GTX 200 is also NVIDIA’s first GPU to support double-precision floating point. We’ll go over all these changes in a little more detail on the next page.
As shader-intensive, DirectX 10 games become more pervasive, the need for shading horsepower becomes more paramount. To accomplish this task, NVIDIA has increased the number of stream processors from 128 in G80 to 240 in the GeForce GTX 280, while the GeForce GTX 260 has 192. The following is a block diagram of G80 followed by the GeForce GTX 280:
Each of the light green squares in the above diagram is a stream processor; if you’re patient enough to count them, you’ll find 240 in total. The stream processors are organized into streaming multiprocessors (SMs), with each SM consisting of eight individual stream processors. The SMs are in turn clustered in groups of three, with each group forming one texture processing cluster (TPC). There are ten TPCs inside GeForce GTX 280. Compared to G80, GeForce GTX 280 has two additional TPCs (10 versus 8), and each of its TPCs holds three SMs versus two in G80. Like G80, the stream processors run at their own clock, independent of the rest of the graphics core: in the GeForce GTX 280, the stream processors run at 1.296GHz, while the rest of the GPU runs at 602MHz.
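For readers who’d rather check the arithmetic than count squares, the hierarchy above works out like so. A quick sketch, using only the unit counts from the text:

```python
def total_stream_processors(tpcs: int, sms_per_tpc: int, sps_per_sm: int) -> int:
    """Total stream processors = TPCs x SMs per TPC x SPs per SM."""
    return tpcs * sms_per_tpc * sps_per_sm

# GTX 280: 10 TPCs, each with 3 SMs of 8 stream processors
gtx280 = total_stream_processors(tpcs=10, sms_per_tpc=3, sps_per_sm=8)

# G80: 8 TPCs, each with 2 SMs of 8 stream processors
g80 = total_stream_processors(tpcs=8, sms_per_tpc=2, sps_per_sm=8)

print(gtx280, g80)  # 240 128
```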
GeForce GTX 280 also boasts improved threading. Whereas the GeForce 8800 GTX was limited to a maximum of 12,288 threads, GeForce GTX 280 supports up to 30,720 concurrent threads in hardware. The thread scheduler dynamically load balances and is highly efficient: if a particular thread stalls waiting for data, the GPU can immediately switch to another thread with no overhead.
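Those totals also imply a per-SM thread capacity, which we can back out in a couple of lines. This is our inference from the numbers above, not an official NVIDIA figure:

```python
def threads_per_sm(total_threads: int, num_sms: int) -> int:
    """Hardware thread capacity per streaming multiprocessor."""
    return total_threads // num_sms

# GTX 280: 30,720 threads across 30 SMs (10 TPCs x 3 SMs)
# G80:     12,288 threads across 16 SMs (8 TPCs x 2 SMs)
print(threads_per_sm(30720, 30), threads_per_sm(12288, 16))  # 1024 768
```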
With games increasingly using longer, more complex shaders, NVIDIA has doubled the amount of register space in GeForce GTX 200. According to NVIDIA, GeForce 8 and 9 series GPUs were beginning to run into situations where these complex shaders would exhaust the registers, forcing the GPU to spill to memory. By doubling the size of the register file, these shaders can run without spilling, improving performance.
One tweak NVIDIA integrated into G92 versus G80 was the addition of four texture address units per texture processing cluster. This allowed G92 to address 8 textures and perform 8 texture filtering ops per clock per cluster, up from 4 and 8 respectively in G80. The end result was that GeForce 9800 GTX could address and filter 64 pixels per clock, whereas GeForce 8800 GTX was limited to 64 pixels per clock of texture filtering and 32 pixels per clock of texture addressing.
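A quick sketch of the per-clock texture rates implied by those unit counts. The per-cluster address/filter figures come from the text, and both G80 and G92 carry eight texture processing clusters:

```python
def texture_rates(tpcs: int, addr_per_tpc: int, filter_per_tpc: int) -> tuple:
    """Chip-wide texture addressing and filtering rates per clock."""
    return tpcs * addr_per_tpc, tpcs * filter_per_tpc

g80_addr, g80_filter = texture_rates(8, 4, 8)  # 8800 GTX: 32 addressed, 64 filtered
g92_addr, g92_filter = texture_rates(8, 8, 8)  # 9800 GTX: 64 addressed, 64 filtered
print((g80_addr, g80_filter), (g92_addr, g92_filter))
```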
Double-Precision Floating Point
As the GPU moves beyond its traditional 2D/3D/gaming workload to performing computationally-intensive scientific and financial computing functions, it’s very important that the GPU is capable of producing very accurate results. To achieve this NVIDIA has added double-precision floating point support to the GeForce GTX 200 series. Each streaming multiprocessor has its own double-precision, 64-bit floating point math unit, for a grand total of 30 FPUs on the GPU.
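As a back-of-the-envelope estimate, peak double-precision throughput follows from the unit count and shader clock. The 30-unit count is stated in the text; the assumption that each FPU retires one fused multiply-add (two FLOPs) per shader clock is ours, not NVIDIA’s:

```python
def peak_dp_gflops(num_fpus: int, shader_clock_ghz: float,
                   flops_per_clock: int = 2) -> float:
    """Peak double-precision GFLOPS, assuming one FMA (2 FLOPs) per FPU per clock."""
    return num_fpus * shader_clock_ghz * flops_per_clock

# GTX 280: 30 DP FPUs at the 1.296GHz shader clock
print(round(peak_dp_gflops(30, 1.296), 1))  # roughly 77.8 GFLOPS
```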
512-bit memory interface
The GeForce GTX 280 is NVIDIA’s first GPU to sport a 512-bit memory interface. In particular, eight 64-bit memory controllers are used. With a wider, 512-bit path to memory, memory bandwidth is double that of GeForce 9800 GTX at equal memory speeds, but NVIDIA clocks the GTX 280’s memory slightly higher than the 9800 GTX, running at 1,107MHz. This equates to 141.7GB/sec of peak memory bandwidth. NVIDIA continues to use GDDR3 memory due to its lower latency, and the GTX 280 is outfitted with 1GB onboard.
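The bandwidth figure is easy to verify: 512 bits per transfer, two transfers per clock for GDDR3, at a 1,107MHz memory clock. A quick sketch of the arithmetic:

```python
def peak_bandwidth_gb_s(bus_width_bits: int, mem_clock_mhz: float,
                        transfers_per_clock: int = 2) -> float:
    """Peak memory bandwidth in GB/s; GDDR3 transfers data twice per clock."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfers_per_clock * mem_clock_mhz * 1e6 / 1e9

# GTX 280: 512-bit bus at 1,107MHz GDDR3
print(round(peak_bandwidth_gb_s(512, 1107), 1))  # 141.7 GB/s
```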
As with the G80 launch, NVIDIA is providing two GeForce GTX 200 SKUs today: the flagship GeForce GTX 280, and its feature-reduced sibling, the GeForce GTX 260. Both SKUs are built from the same GPU, but NVIDIA disables two texture processing clusters on the GeForce GTX 260, leaving 192 active stream processors. As we mentioned on the previous page, the GTX 260 also has a narrower memory interface, as NVIDIA disables one ROP partition along with its memory controller.
The following chart sums up the differences between the GTX 280 and the GTX 260:
As you can see, the clock speeds are slightly slower for the GeForce GTX 260 as well. While the GTX 280 features a 602MHz core with its shaders running at nearly 1.3GHz and 1.1GHz memory, the graphics core on the GTX 260 runs at 576MHz, with 1.24GHz shaders and 1.0GHz memory. And of course, you’ll no doubt notice the difference in price: $400 versus $650.
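As a rough sanity check, multiplying stream processor count by shader clock gives the GTX 260’s theoretical shader throughput relative to the GTX 280. We’re assuming the GTX 260’s “1.24GHz” shader clock is exactly 1,242MHz, and real-world performance also depends on memory bandwidth, ROPs, and texturing, so treat this as a ballpark only:

```python
def shader_ratio(sps_a: int, clock_a: float, sps_b: int, clock_b: float) -> float:
    """Ratio of raw shader throughput: (SP count x clock) of A versus B."""
    return (sps_a * clock_a) / (sps_b * clock_b)

# GTX 260 (192 SPs @ 1242MHz) versus GTX 280 (240 SPs @ 1296MHz)
ratio = shader_ratio(192, 1242, 240, 1296)
print(f"{ratio:.0%}")  # roughly 77%
```

That ballpark lines up reasonably well with the 80-85% of GTX 280 performance we measured in games.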
The first GeForce GTX 280 boards will be hitting retail starting tomorrow, while we’ve been told that GeForce GTX 260 cards won’t be available until June 26th.
With 1.4 billion transistors inside, power management is crucial for the GTX 200 to remain viable inside today’s PCs. To achieve this, NVIDIA employs a number of power-saving techniques, including clock gating, which shuts down parts of the GPU that aren’t being used, and dynamic clock speed/voltage adjustment.
With NVIDIA’s acquisition of AGEIA, the company has been hard at work porting the PhysX API to its GPUs using the CUDA SDK. As a result of this effort, any CUDA-capable GPU (which includes every GeForce 8/9/GTX 200 graphics card) is compatible with AGEIA PhysX. We’ve been told that NVIDIA will have a beta PhysX driver for press evaluation sometime later this week, so hopefully we’ll be able to check out GPU-based physics for real very shortly.
Starting with GeForce 8 (and AMD’s own Radeon HD 2000 series), the GPU has evolved from a pure graphics processor into a more general-purpose processor that can handle a variety of tasks traditionally reserved for the CPU. What kind of applications are on the horizon for the GPU? How do audio and video encoding sound?
And as anyone who owns a GeForce 8800 GTX or Ultra can tell you, the bottom of the graphics card can get quite hot once it’s under full load. By placing a large aluminum plate on the bottom of the card, heat on the underside of the card is dramatically reduced, helping to keep overall board temps down as heat is dispersed across the entire thermal plate.
The upper portion of the cooler is largely composed of plastic. This plastic duct channels air from the card’s fan across the GPU and its accompanying heatsink. The air is then exhausted out the back of your system case.
Cooling the GPU itself are copper heatpipes. These heatpipes are responsible for drawing heat off the graphics processor. This heat is then transferred to a large, dual-slot aluminum heatsink. A blower-style fan supplies fresh, cool air to the entire system. The fan spins at variable RPMs that adjust based on temperature. Like previous GeForce coolers we’ve tested over the years, the fan runs fairly quiet, even during extended gaming sessions with the card overclocked. The loudest fan inside our testbed was actually the smallest: the chipset fan sitting on the nForce 790i’s North Bridge.
Both cards sport the same cooling and are outfitted with the same back plate design. On the back plate you’ll find two HDCP-enabled dual-link DVI ports and a 7-pin video-out port that supports S-Video as well as component video output (via dongle). Like the 9800 GX2, NVIDIA also places an LED on the back plate for troubleshooting problems during installation: if the LED shines red, you haven’t hooked up the board’s power correctly; green means everything is good to go.
Since both cards look so similar, spotting differences between the two is tough at first glance. The only real difference lies in the power connectors. Whereas the GeForce GTX 260 has two 6-pin PCIe connectors, the GTX 280 has one 6-pin connector and one 8-pin connector; both must be properly connected for the GTX 280 to operate. You’ll also notice the SPDIF connector located next to the power connections. It’s required for sending audio over HDMI: simply plug the cable provided with your GTX 200 graphics card into your sound card to get audio.
So what are our first impressions of the GTX 280 and GTX 260 boards? At 10.5” long, the boards are similar in size to NVIDIA’s 9800 GX2 and 9800 GTX. The aluminum plate on the underside measures just a hair over half a millimeter thick, so we don’t think it will interfere with most of today’s motherboards, although there may be a few exceptions. NVIDIA has also fixed the problems we discussed with the 9800 GX2’s power sockets: the sockets for the 8-pin and 6-pin PCIe connectors are large enough to accommodate every power supply we’ve seen.
One disturbing trait we did notice, though, is heat: these GPUs appear to be quite sensitive to it. If you don’t keep the system adequately cooled and the card begins to heat up, performance suffers dramatically. And we’re not talking excessively high temps either. The GPU will begin to throttle itself once it hits 105 degrees Celsius. We never saw temps anywhere near that during our testing, even when running three GTX 280 cards in 3-Way SLI, yet frame rates still suffered dramatically at times if we didn’t keep system temps to an absolute minimum. We run all of our testbeds in an open-air environment, so we added a floor-standing fan to help keep the cards cool. When running 3-Way SLI, we also manually cranked the fan speeds up to 100% for good measure.
Once we instituted these changes, the slowdowns largely went away. The bottom line: if you plan on running the GeForce GTX 280 or 260 inside a case, we definitely recommend that your chassis has a fan on or near the GPU.
Now let’s take a look at the first retail GeForce GTX 280 cards we’ve received…
EVGA e-GeForce GTX 280 FTW
EVGA’s e-GeForce GTX 280 FTW is definitely engineered for the win. The card is overclocked from the factory, running at 670MHz on the graphics core with the shaders at 1458MHz – improvements of 68MHz and 162MHz respectively over stock. The board’s memory is also overclocked, running at 1215MHz, 108MHz higher than stock.
But if you want to take the board even further, EVGA now provides the tools for you to accomplish this. The e-GeForce GTX 280 and GTX 280 FTW boards are EVGA’s first cards to ship with their “Precision” overclocking utility.
Precision supports clock speed adjustment for the graphics core, memory, and stream processors. The stream processors can be set to run at a fixed ratio with the graphics core, or they can be unlinked to run at whatever speed you need. The utility also provides a slider for manual fan speed adjustment as well. If you don’t know what you’re doing, help menus are automatically provided with brief descriptions of what each setting does.
Overall Precision is a pretty slick utility and we think a lot of EVGA card owners are going to enjoy it.
The e-GeForce GTX 280 FTW carries an MSRP of $699, while NVIDIA’s bone stock GTX 280 retails for $649. The card ships with dual DVI connectors, a component video cable, an EVGA case badge, and both 8-pin and 6-pin PCIe power adapters. This is actually the first graphics card we’ve seen ship with an 8-pin PCIe adapter – obviously useful if your power supply lacks 8-pin power connections.
XFX GeForce GTX 280
In recent months, XFX’s cards have stood out thanks to their excellent game bundles: XFX has been shipping its cards with Activision’s hit FPS, Call of Duty 4. CoD 4 has earned nothing but positive reviews; we gave it a 90% rating in our official review back in November.
Now with their GeForce GTX 280 board, XFX bundles a copy of Assassin’s Creed inside the box. Assassin’s Creed is a DX10 game set during the crusades. With its jaw-dropping graphics, the game is a perfect complement to the GeForce GTX 280.
The GeForce GTX 280 card we received from XFX ran at the stock GTX 280 clocks, but we have no doubt that XFX is also working on an OC’ed SKU for enthusiasts who want more performance than stock.
Intel Core 2 Extreme QX9770
EVGA nForce 790i Ultra SLI motherboard
2GB Crucial Ballistix 2.0GHz DDR3
GeForce 9800 GTX
GeForce 8800 GTX
GeForce 8800 GT 512MB
GeForce GTX 280
GeForce GTX 260
AMD Radeon HD 3870 X2
300GB Western Digital Caviar SE
Windows Vista Ultimate 64-bit w/Service Pack 1
Company of Heroes 1.71
Crysis High – Direct3D
Using NVIDIA’s latest system utility, we managed to hit some pretty impressive speeds when overclocking the GTX 280 and GTX 260. Our GTX 280 topped out at 754MHz core, 1.32GHz memory, and 1559MHz on the stream processors! Meanwhile, the GTX 260 hit 712MHz core, 1.151GHz memory, and 1.483GHz shaders. What kind of performance improvement does that equate to in games? Have a look:
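For context, here’s what our GTX 280 results work out to in percentage terms against the stock 602MHz core, 1,296MHz shader, and 1,107MHz memory clocks. A quick sketch of the arithmetic:

```python
def oc_gain_pct(stock_mhz: float, oc_mhz: float) -> float:
    """Percentage clock speed gain over stock."""
    return (oc_mhz / stock_mhz - 1) * 100

# GTX 280: stock -> achieved overclock
gains = {
    "core":   oc_gain_pct(602, 754),    # about +25%
    "shader": oc_gain_pct(1296, 1559),  # about +20%
    "memory": oc_gain_pct(1107, 1320),  # about +19%
}
for domain, gain in gains.items():
    print(f"GTX 280 {domain}: +{gain:.0f}%")
```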
Does that lessen the impact of the GeForce GTX 280? For some gamers, particularly those who already own GeForce 8800 GTX cards and an SLI motherboard, the answer is probably yes: after all, adding a second GeForce 8800 GTX card buys you performance that will exceed the GTX 280. The GTX 280 will consume less power than a pair of GeForce 8800 GTX cards and it boasts better video, but at $650 it isn’t exactly an inexpensive upgrade.
Then there’s the GeForce GTX 260. Its closest equivalent is a pair of GeForce 8800 GTs running in SLI. Here it’s a neck-and-neck race, with the 8800 GT SLI combo winning some benchmarks and the GTX 260 taking home the performance crown in others. Considering its slimmer $400 price tag, this card is the easier upgrade to stomach given its performance. In fact, it’s the board we’d buy if we were plunking down cash for a graphics upgrade today. It ran circles around the 9800 GTX in our testing and only costs about $100 more. Looked at from another perspective, the GTX 260 gets you 80-85% of the performance of the flagship GTX 280 for $250 less. NVIDIA has to stick to that $400 MSRP, though, for this equation to work.
Looking at the paper specs of both GPUs, though, we have a feeling some of you are probably unimpressed with the new GTX 200s. Keep in mind, however, that subsequent driver releases are bound to deliver more performance; the numbers we’re presenting today are only the baseline and should go up from here. Also keep in mind that this GPU has features designed to make it more appealing for computing applications: NVIDIA didn’t add double-precision floating point units for games, for instance.
It’s the computing applications that have us most excited about the potential for GeForce GTX 200, and the rest of NVIDIA’s GeForce 8/9 series GPUs as well for that matter. We’re still in awe of the Photoshop demonstration.
Another ray of hope that we see for the GeForce GTX 200s is overclocking. If you missed our OC’ing results, you may want to flip back to the previous page and check them out. We were able to hit some pretty impressive OCs with our cards. Clearly TSMC’s got their 65-nm process down, and we wouldn’t be surprised to see NVIDIA’s board partners come out with some highly exotic GTX 260 and 280 cards in the near future.
AMD will soon be responding with their own counter to NVIDIA in the form of their RV770 and R700 GPUs. RV770 will tackle the space occupied by the 9800 GTX and GTX 260, while AMD’s dual-GPU R700 is going after NVIDIA’s flagship GTX 280. It will be interesting to see how these chips compare to NVIDIA’s GeForce lineup, but one ace in the hole NVIDIA will certainly have over AMD for now at least will be CUDA and PhysX. Meanwhile, AMD of course has DirectX 10.1 going for them. Who comes out on top will likely come down to price, performance, and drivers, and we can’t wait to see how things shake out.
© Copyright 2003 FS Media, Inc.