Summary: ATI's RADEON X800 series is designed from the ground up to deliver blistering performance. The graphics core sports 16 pixel pipelines and supports the latest GDDR3 memories. ATI has also cranked up the clocks to levels never before seen on a high-end DX9 part. But the X800 doesn't solely rely on performance, as the X800 supports a new AA mode and ATI's 3Dc compression technology, which will be featured in Valve's Half-Life 2 when it debuts. In today's article we go over all the new features, what aspects have changed, and which have not as well as dabble in a little bit of overclocking. See how ATI's latest boards run in this article!
[image]
ATI’s RADEON 9700 PRO revolutionized the 3D graphics market when it was introduced two summers ago. Not only was ATI first to market with a next generation DirectX 9 graphics card, their RADEON 9700 PRO offered performance that was unrivaled by anything else on the market, with razor-sharp image quality. The only real negative with the board was its $400 price tag, but ask anyone who forked over the $400 a couple of years ago and chances are they’re pretty pleased with their purchase. The RADEON 9700 PRO has been holding up well in comparison to newer DX9 cards like the RADEON 9800 PRO and RADEON 9800 XT, not to mention NVIDIA’s GeForce FX 5800/5900 series. Now ATI’s task is to build something better. NVIDIA’s GeForce 6800 family, which was announced just two weeks ago, offers a significant performance improvement over previous high-end offerings thanks to its 16-pixel pipeline architecture, brand new shading engine, and high-speed GDDR3 memory with a 256-bit memory interface. If ATI were to answer with anything less, they’d lose valuable mindshare among enthusiasts. [image]
Remember, at the high end, mindshare can be just as important as volume, if not more so: the positive press that comes with producing the top high-end card can carry over to your other lines in the mainstream and value segments. Mindshare, after all, is largely what carried 3dfx in the post-Voodoo2 days. With this in mind, ATI’s east coast development team set out to create a worthy successor to the R300 core first introduced in the RADEON 9700 PRO. Rather than starting with a blank piece of paper, however, ATI’s X800 engineering team decided to take the best aspects of R300, improve on them, and spice up the formula a bit by adding a few new features. ATI’s stated goal was to deliver twice the performance of their previous high-end VPU, all without drawing significantly more power, generating an excessive amount of heat, or drastically increasing production costs. This was accomplished by going bigger and smaller at the same time. Bigger, because ATI’s high-end RADEON X800 XT family sports 16 pixel pipelines, twice as many as RADEON 9700/RADEON 9800. Smaller, because like RADEON 9600 XT, the X800 series is built on TSMC’s advanced 0.13-micron manufacturing process with low-k dielectric material. Previous high-end architectures utilized TSMC’s battle-tested but larger 0.15-micron process. [image]
ATI refers to its focus with this generation as delivering high-definition gaming. Just as high definition televisions have brought a new level of quality to the TV, ATI plans to deliver more performance and realism to the PC through the use of its 3Dc compression for normal maps, SMARTSHADER HD, SMOOTHVISION HD, and HYPERZ HD technologies. But what do all these new buzzwords mean?
The list
SMARTSHADER™ HD
- Vertex programs up to 65,280 instructions with flow control
- Single cycle trigonometric operations (SIN & COS)
- Up to 1,536 instructions and 16 textures per rendering pass
- 2nd generation F-buffer technology accelerates multi-pass pixel shader programs with unlimited instructions
- 32 temporary and constant registers
- Facing register for two-sided lighting
- 128-bit, 64-bit & 32-bit per pixel floating point color formats
- Multiple Render Target (MRT) support
- Complete feature set also supported in OpenGL® via extensions
SMOOTHVISION™ HD
- Sparse multi-sample algorithm with gamma correction, programmable sample patterns, and centroid sampling
- Lossless Color Compression (up to 6:1) at all resolutions, including widescreen HDTV resolutions
- Temporal Anti-Aliasing
- Up to 128-tap texture filtering
- Adaptive algorithm with bilinear (performance) and trilinear (quality) options
3Dc
HYPER Z HD
VIDEOSHADER HD
Notes
As we stated at the outset, with X800 ATI has decided to refine the formula that made the RADEON 9700 and RADEON 9800 series so successful. By moving to 0.13-micron, they’re able to integrate 16 “extreme pipelines” into the flagship RADEON X800 XT Platinum Edition without producing a huge die. The new graphics core weighs in at 160 million transistors, which pales in comparison to NVIDIA’s GeForce 6800 at 222 million. The million dollar question many are asking right now is: where do NVIDIA’s extra transistors come from?
SMARTSHADER HD
SMARTSHADER HD takes 2.0 pixel and vertex shading to the next level in RADEON X800 by shattering some of the limitations of ATI’s previous architectures, particularly on the pixel shader side. We’ll start with vertex shading, though.
If you recall, S3’s texture compression technology, S3TC, was used with great success in DirectX and OpenGL titles to achieve the look of very high resolution textures without taking up a lot of space in graphics memory. This helped performance as well, since the compressed textures didn’t consume as much memory bandwidth as traditional high resolution textures. S3TC was so successful that it was eventually rolled into DirectX. This was great back in the DX7 days, but with today’s programmable titles, a variety of material properties (such as a surface’s shininess or roughness) are now stored in textures alongside the traditional color information S3TC was so good at compressing. Unfortunately, DirectX texture compression is not as effective at compressing these material properties as it is at compressing color data, so ATI’s engineers sought an alternative: 3Dc, which is designed to handle normal maps as well as the situation we just described, packing multiple pieces of data into a single compressed texture.
3Dc in normal maps
Normal maps are an extension of bump maps. Whereas bump maps were used to add bumpiness to otherwise flat surfaces, normal maps can contain more detailed surface information, allowing them to go well beyond traditional bump maps.
Normal maps are becoming increasingly popular in today’s games. Crytek uses normal maps in Far Cry to create realistic characters and objects without resorting to using more polygons, which would negatively impact performance. If you recall ATI’s Ferrari demo from the RADEON 9700 PRO launch, normal maps were used to give the Ferrari F50 more detail. Unfortunately, DXTC and S3TC are ineffective at compressing normal maps. They introduce blocky artifacts on normal maps and have a hard time dealing with the small edges and detail normal maps are designed to show. As a result, developers have been forced to limit the number of normal maps they use, or rely on lower resolution maps instead. 3Dc solves this problem by providing up to 4:1 compression of normal maps without significantly reducing image quality. ATI’s technique is hardware accelerated, so the performance impact is minimal. The end result is that developers are able to use higher resolution, more detailed normal maps, or use them on multiple objects within a scene rather than just a handful without consuming all the memory on a graphics board or crippling its memory bandwidth (and thus negatively impacting performance). [image]
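To put the 4:1 figure in perspective, here is a rough back-of-the-envelope sketch in Python. The 1024x1024 map size, the 32-bit uncompressed format, and the 64MB memory budget are assumptions chosen purely for illustration (they are not figures from ATI), and the 4:1 ratio is the best case quoted above.

```python
MIB = 1024 * 1024


def texture_bytes(width, height, bytes_per_texel=4):
    """Uncompressed size of a single texture level, in bytes."""
    return width * height * bytes_per_texel


def compressed_bytes(raw_bytes, ratio=4):
    """Size after compression at the given ratio (4 means 4:1)."""
    return raw_bytes // ratio


# Hypothetical 1024x1024 normal map stored at 32 bits per texel.
raw = texture_bytes(1024, 1024)
# Best-case 4:1 compression, as quoted for 3Dc.
packed = compressed_bytes(raw)
# Hypothetical slice of a 256MB board set aside for normal maps.
budget = 64 * MIB

print(raw // MIB, "MiB per map uncompressed")
print(packed // MIB, "MiB per map at 4:1")
print(budget // raw, "maps fit uncompressed")
print(budget // packed, "maps fit at 4:1")
```

Under these assumptions, the same memory budget holds four times as many normal maps, or maps with twice the resolution in each dimension, which is the trade-off developers would be making.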
In addition, ATI claims its 3Dc technology can be easily implemented in existing titles that already support normal maps. The same compression tools used for DXTC and S3TC can be used for 3Dc with only a few minor modifications. ATI cites Croteam’s upcoming Serious Sam 2 as the most recent example: in a matter of days, Croteam had Serious Sam 2 up and running with 3Dc. Croteam (Serious Sam 2), Irrational Games (Tribes Vengeance), Firaxis (Pirates), Digital Extremes (DarkSector), Valve (Half-Life 2), Ritual, and Pseudo Interactive have already signed on to use 3Dc in their upcoming titles. And in case you were wondering, 3Dc can be used on older RADEON hardware, but with slightly lesser quality. We’re not sure if ATI plans to roll 3Dc support into existing RADEON 9500 and up cards in a future driver update. It’s possible, but ATI may want to leave this trump card for X800 and future mainstream and value products only. ATI is also working on getting 3Dc rolled into future versions of DirectX.
Anti-aliasing
The SMOOTHVISION HD implementation found in RADEON X800 builds largely on the anti-aliasing engine found in ATI’s existing DX9 products, which is currently regarded as the best-looking in the industry. Like previous designs, SMOOTHVISION HD’s anti-aliasing engine uses multi-sampling (specifically, rotated-grid with gamma correction), with settings of 2, 4, or 6 samples per pixel available.
Temporal anti-aliasing
Today’s anti-aliasing implementations rely on predefined sample patterns to improve image quality. This has worked out well, but rather than using static sampling that has already been defined, what if the board could choose from one of a few patterns to achieve even better-looking quality? This is where temporal AA comes in.
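As a minimal sketch of the idea, assume the driver simply alternates between two sample patterns on successive frames. The sample offsets, the diagonal edge, and the function names below are all made up for illustration; they are not ATI’s actual patterns or driver logic.

```python
# Hypothetical 2-sample patterns: (x, y) offsets inside one pixel, in [0, 1).
PATTERN_A = [(0.25, 0.25), (0.75, 0.75)]
PATTERN_B = [(0.75, 0.25), (0.25, 0.75)]
PATTERNS = [PATTERN_A, PATTERN_B]


def pattern_for_frame(frame_index):
    """Alternate patterns on successive frames (assumed behaviour)."""
    return PATTERNS[frame_index % len(PATTERNS)]


def coverage(pattern, inside):
    """Fraction of a pattern's samples that fall inside a primitive."""
    hits = sum(1 for x, y in pattern if inside(x, y))
    return hits / len(pattern)


def edge(x, y):
    """A made-up polygon edge cutting diagonally through the pixel."""
    return x + y < 0.8


# Coverage of the same pixel as seen in frame 0 and in frame 1.
per_frame = [coverage(pattern_for_frame(i), edge) for i in range(2)]
# What the eye averages when the two frames are shown in quick succession.
blended = sum(per_frame) / len(per_frame)
# Distinct sample positions visited across the two frames.
distinct = {s for i in range(2) for s in pattern_for_frame(i)}

print(per_frame)
print(blended)
print(len(distinct))
```

In this toy setup each frame only pays for two samples per pixel, yet four distinct positions are visited over two frames. The two frames disagree about the edge pixel’s shade, and it is only the rapid alternation that blends them into an intermediate value.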
The drawback to this new AA method is its dependence on frame rate. If the frame rate drops too low, the eye can perceive the slight fluctuations in the colors of anti-aliased edges, resulting in flickering artifacts that can be quite distracting. Another downside is that v-sync must be enabled when temporal AA is used. A lot of competitive gamers like to disable v-sync for maximum frame rate, since the higher the frame rate, the smoother the controls feel in shooters such as Quake 3 and UT (up to a point, which varies depending on the particular title). If the frame rate does drop below a certain level, the driver can disable temporal AA until frame rates improve to an acceptable level. In the meantime, the graphics card reverts to traditional multisampling AA. You can really see the difference temporal AA can make at refresh rates of 85Hz and 100Hz, which are out of the reach of most LCD monitors. Due to the nature of temporal AA, screenshots can’t accurately reproduce its effects, and even video clips could be inaccurate. Therefore, the best way to see temporal AA is to sit in front of a graphics card that’s using it.
Anisotropic filtering
ATI has fine-tuned its anisotropic filtering engine slightly in X800 and SMOOTHVISION HD. Anisotropic filtering levels of 2, 4, 8, or 16 texture samples per pixel are still supported (with bilinear samples taken in performance mode, and trilinear samples in the default quality mode), although SMOOTHVISION HD uses a revised algorithm that improves anisotropic filtering performance without negatively impacting image quality.
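These numbers line up with the “up to 128-tap texture filtering” entry in ATI’s feature list. As a quick sketch, assuming a bilinear sample reads 4 texels and a trilinear sample reads 8 (the standard definitions, not something ATI spells out here), the worst-case tap count per pixel works out as follows. The function and table names are hypothetical.

```python
# Texels read per filtered sample (standard definitions, assumed here).
TAPS_PER_SAMPLE = {
    "performance": 4,  # bilinear: 2x2 texels from one mip level
    "quality": 8,      # trilinear: 2x2 texels from each of two mip levels
}


def max_taps(af_level, mode):
    """Upper bound on texels fetched per pixel at a given AF level."""
    return af_level * TAPS_PER_SAMPLE[mode]


for level in (2, 4, 8, 16):
    print(level, max_taps(level, "performance"), max_taps(level, "quality"))
```

This is only an upper bound: because the algorithm is adaptive, surfaces that don’t need the full sample count should get by with fewer taps.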
As we’ve mentioned previously, ATI’s RADEON X800 XT Platinum Edition utilizes a 16-pixel pipeline architecture. This is twice the number of pixel pipelines found in RADEON 9800 XT and RADEON 9700 PRO. The X800 XT Platinum Edition runs at a core clock frequency of 520MHz, the highest of any 16-pipeline architecture in the industry, including NVIDIA’s overclocked GeForce 6800 Ultra part. With its 16 pixel pipes and 520MHz core clock, the X800 XT Platinum Edition tops out at a peak texel fill rate of 8.3 Gigatexels/second. This compares quite favorably to GeForce 6800 Ultra’s 6.4 Gigatexels/second and is over 2.5 times that of the previous high-end flagship, RADEON 9800 XT, at 3.3 Gigatexels/second. With 16 pipes, memory bandwidth is even more important. If the graphics core isn’t kept fed with data, all the extra pipes (and fill rate) are essentially wasted. [image]
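The fill rate figure is simple arithmetic: pipelines times core clock. Here is a small sketch, assuming each pipeline produces one texel per clock (our assumption; the function name is also ours). The 6.4 and 3.3 Gigatexels/second comparison figures are taken straight from the paragraph above rather than derived.

```python
def peak_fill_rate_gtexels(pipes, core_mhz, texels_per_pipe_per_clock=1):
    """Theoretical peak texel fill rate in Gigatexels/second."""
    return pipes * core_mhz * texels_per_pipe_per_clock / 1000.0


x800_xt_pe = peak_fill_rate_gtexels(pipes=16, core_mhz=520)

print(round(x800_xt_pe, 2), "Gigatexels/second")
print(round(x800_xt_pe / 6.4, 2), "x GeForce 6800 Ultra")
print(round(x800_xt_pe / 3.3, 2), "x RADEON 9800 XT")
```

Keep in mind that this is a theoretical peak. Real-world throughput depends on whether the memory subsystem can keep those pipelines busy.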
To prevent this from occurring, the X800 XT Platinum Edition graphics core sits on a 256-bit wide interface to its memory, which operates at 560MHz (1.12GHz effective). Like the core clock frequency, this figure is higher than that of any other architecture on the market, providing up to 35.8GB/sec of peak memory bandwidth to the graphics core. ATI has partnered with Samsung, Micron, and other memory manufacturers to equip their X800 boards with GDDR3 memory, which is designed to operate at higher clock frequencies with lower power and heat levels than previous memory types. Our board shipped with modules from Samsung. The graphics core itself is produced on the same 0.13-micron manufacturing process with low-k Black Diamond dielectric material as ATI’s RADEON 9600 XT. According to ATI, this process allows roughly 33% more transistors per unit area than the 0.15-micron process used for RADEON 9700/9800, which is in line with simple geometric scaling: (0.15/0.13)² is about 1.33. ATI also claims that the new process allows the transistors to run over 100MHz faster with no increase in power consumption or heat generation, allowing ATI to get by with the same cooling and power requirements as previous architectures. In fact, ATI claims that the X800 XT requires slightly less power than RADEON 9800 XT. Also remember that another key innovation in TSMC’s 0.13-micron process is the use of copper interconnects. The aluminum interconnects used at 0.15-micron have higher resistance than copper, so with copper, signals travel faster and generate less heat. With its 16 pixel pipes, X800 XT Platinum Edition is composed of 160 million transistors. In comparison, RADEON 9800 XT consists of 110 million while GeForce 6800 Ultra contains a whopping 222 million transistors! This means X800 XT Platinum Edition should be cheaper to produce than GeForce 6800 Ultra, assuming equal yields.
X800 variants
In addition to the X800 XT Platinum Edition, ATI also offers the X800 PRO. The X800 PRO differs from the XT Platinum Edition in the number of pipelines (reduced from 16 in the XT to 12 in the PRO) and in its clocks, which drop to 475MHz core and 900MHz effective memory. Both boards sport 256-bit memory interfaces with 256MB of GDDR3 memory, and the same 160 million transistor count.
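To see what the lower memory clock means for the PRO, peak bandwidth can be worked out as bus width in bytes times effective data rate. The sketch below treats the PRO’s 900MHz figure as an effective (double data rate) clock, like the XT Platinum Edition’s 1.12GHz. That reading is our assumption, as is the function name.

```python
def peak_bandwidth_gb_s(bus_bits, effective_mhz):
    """Theoretical peak memory bandwidth in GB/sec (1 GB = 10^9 bytes)."""
    bytes_per_transfer = bus_bits / 8
    return bytes_per_transfer * effective_mhz / 1000.0


xt_pe = peak_bandwidth_gb_s(bus_bits=256, effective_mhz=1120)
pro = peak_bandwidth_gb_s(bus_bits=256, effective_mhz=900)

print(round(xt_pe, 2), "GB/sec for X800 XT Platinum Edition")
print(round(pro, 2), "GB/sec for X800 PRO")
```

The XT Platinum Edition result matches the 35.8GB/sec figure quoted earlier. The PRO gives up roughly a fifth of that bandwidth, a smaller cut than the quarter of its pixel pipelines it loses.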
[image]
Looking at the X800 XT Platinum Edition and X800 PRO boards, you can see that they’re both physically indistinguishable from one another at first glance. The board layout of both cards is the same, with the only exception being the components used (the XT Platinum board ships with faster memory). Another aspect you’ll quickly notice is that both boards look awfully similar to the RADEON 9800 XT. You’ve got the same fire engine red PCB ATI has become well known for as well as the all-copper heatsink. Thankfully ATI has switched manufacturers for the X800 boards, resulting in a heatsink of much higher manufacturing quality. [image]
RADEON 9800 XT owners will also notice that the heatsink on the X800 boards is actually smaller than the cooler used on the RADEON 9800 XT. While the heatsink on the 9800 XT covers the graphics core and its accompanying memory modules, the GDDR3 modules found on the X800 boards aren’t covered by the copper heatsink. We found that the memory modules barely get warm to the touch, so they don’t need the additional cooling. The same ducted design used on the RADEON 9800 XT is still in place on both X800 cards. Air enters through the card’s fan, where it passes between the heatsink’s fins before exiting out the right side and upper portion of the duct. The fan itself is large yet quiet. With many more fins present, the fan doesn’t have to spin up to the high RPMs used by GeForce 6800 Ultra, resulting in a card that generates less noise. You’ll also notice that the fan is offset from the graphics core. [image]
As we mentioned in our RADEON 9800 XT article, this is to give the duct more room to work with, maximizing efficiency. An added side effect is that the fan’s motor isn’t located directly above one of the hottest components on any graphics board: its core. This heat can prematurely kill the fan’s motor. Both boards run cool, even during extended operation with overclocking. We wouldn’t hesitate to put either board in a small form factor system. In fact, since the X800 boards require less power than previous high-end DX9 offerings, they’re even better suited to this use than previous high-end cards! Looking at the board itself, the underside of both reference cards carries a Rage Theater chip, which provides VIVO (video-in/video-out) support. We’ve been told that one of the Built By ATI boards will offer VIVO, but as of press time we don’t know if this is the RADEON X800 XT Platinum Edition or the X800 PRO. We’d assume the X800 XT Platinum, as it’s the flagship board, but we weren’t able to get firm confirmation on this from ATI. [image]
Of course, ATI’s X800 board partners can implement this feature if they choose to. It’s obviously there on the reference board. The yellow connector you see on the end of the board is for analog video capture. This feature is seen on some graphics boards in Europe, where you’ll run a cable to a composite video connector on the front of your case, but it won’t be offered on ATI’s own “Built By ATI” retail boards. As with video input, however, it’s possible one of ATI’s board partners in Europe may implement this feature on one of their X800 cards.
Thermal monitoring
Like the RADEON 9800 XT, both X800 boards ship with a thermal diode integrated into the graphics core. The board’s fan starts at a higher RPM on boot but settles down to normal after a second or two, suggesting that eventually the fan will operate dynamically based on temperature, just as the RADEON 9800 XT’s does. This feature isn’t enabled in the current beta driver, however.
As usual, we like to get the AA analysis started with a look at 3DMark 03, using the built-in image quality tests. This tool is great for analyzing IQ, as you can specify the frame and game test you’d like to look at.
GeForce 6800 Ultra: [image]
RADEON X800 XT Platinum Edition: [image]
If you recall our GeForce 6800 Ultra preview article, NVIDIA has implemented a new rotated-grid AA sample pattern in NV40. The end result is much better AA than on previous GeForce FX and GeForce4 boards. As a result, picking an AA winner is much tougher than it has been in the past. In our 6800 Ultra preview we gave a slight nod to the ATI architecture, but looking at the results today we feel we may have been a little hasty. In the previous article we judged AA by looking at the trailing B-17 in the scene, flying just above the clouds. The jaggies on this aircraft are just a little more pronounced on the NVIDIA card than on ATI’s (especially if you look at the tail and the end of the left wing, just past the last engine). But if you focus instead on the lead B-17 in the formation at the very front of the scene, you’ll notice that the leading edge on its left wing is just a little sharper on the NVIDIA GeForce 6800 Ultra card. Focus on the area between its two engines.
GeForce 6800 Ultra: [image]
RADEON X800 XT Platinum: [image]
However, if you gaze a little to the right of this point, you’ll notice that the fuselage looks just a little bit smoother in the X800 XT screenshot than in the GeForce 6800 Ultra one.
GeForce 6800 Ultra: [image]
RADEON X800 XT Platinum: [image]
Lock On
RADEON X800 XT Platinum Edition: [image]
Picking out a winner here is again going to depend on the area you choose to highlight. For instance, the nose of the A-10 looks a little bit smoother on the GeForce 6800 Ultra card:
GeForce 6800 Ultra: [image]
RADEON X800 XT Platinum: [image]
But if you look at the trailing edge of the A-10’s right wing (where the wing meets the fuselage), or the shadow underneath the right vertical stabilizer, the IQ edge goes to ATI:
GeForce 6800 Ultra: [image]
RADEON X800 XT Platinum: [image]
Now let’s go back to the scene we used in the previous article:
GeForce 6800 Ultra: [image]
RADEON X800 XT Platinum Edition: [image]
Going back to this angle, the area that we obviously missed in our original GeForce 6800 Ultra preview article is the runway itself. Note how much smoother the lines are on the GeForce 6800 Ultra board. It’s literally a night and day difference in IQ:
GeForce 6800 Ultra: [image]
RADEON X800 XT Platinum: [image]
Before we get started on our AF quality testing with Lock On: Modern Air Combat, we wanted to reexamine the image quality issues we noticed with this title and GeForce FX/6800. Check out the tail on this F-15C: [image]
Notice the swirly marks on the tail when using GeForce cards? We don’t see this banding on the RADEON X800 board. Hopefully NVIDIA can address this bug with LOMAC in an upcoming driver release. Again, overall it’s much harder to make a final call on AA quality than it was in the past. We will however give ATI the overall AA edge thanks to their new temporal AA mode.
We’ll get our AF image quality testing started by booting up the same F-15C scramble mission from LOMAC, but this time taking a look at another F-15C parked on the tarmac (all AF screenshots were taken with 2xAA, just in case you’d like to compare both cards’ 2xAA modes):
GeForce 6800 Ultra: [image]
RADEON X800 XT Platinum Edition: [image]
It’s difficult to determine which card boasts superior AF quality based on this screenshot. While there are differences between the output of the two cards, picking the superior image is a highly subjective call. We’re going to go back to the scene we used in the GeForce 6800 article.
GeForce 6800 Ultra: [image]
RADEON X800 XT Platinum Edition: [image]
We’ve still got to give the richer grass textures to NVIDIA in this comparison:
GeForce 6800 Ultra: [image]
RADEON X800 XT Platinum: [image]
Far Cry
In our GeForce 6800 Ultra preview, we provided an AF screenshot that included a mountain in the background to test AF quality. In the image you could see an area of the mountain where AF wasn’t being properly applied on the ATI card due to the angle of the mountain’s slope. This screenshot highlighted one of the chief criticisms of ATI’s AF implementation: it has a hard time with some angles. While this hasn’t changed with X800, we decided to take our Far Cry AF screenshots on a new map, mp_surf.
RADEON X800 XT Platinum Edition: [image]
While it’s easy to spot the difference between 8xAF and AF being disabled, it’s much harder to differentiate between the ATI and NVIDIA boards in this scene. Textures are vibrant on both boards, and once again it comes down to a judgment call on which card looks better.
System Setup
Benchmarks
Lock On: Modern Air Combat (Mig-29 custom demo)
3DMark 03 – Direct3D
3DMark 03 – Game Test Results
Notes
NVIDIA’s previous driver, Detonator 60.72, doesn’t recognize the GeForce 6800 GT, but it was the last driver we’re aware of to receive FutureMark’s approval for the GeForce 6 series, so we relied on that driver again for our testing.
ShaderMark 2.0
Call of Duty – OpenGL
IL-2 Sturmovik: FB - OpenGL
IL-2 Sturmovik: FB - OpenGL
Quake III - OpenGL
Splinter Cell – Direct3D
Tomb Raider – Direct3D
Lock On: Modern Air Combat – Direct3D
Lock On: Modern Air Combat – Direct3D
Unreal Tournament 2004
Unreal Tournament 2004
Far Cry – Direct3D
Far Cry – Direct3D
Far Cry – Direct3D
Far Cry
Tomb Raider
Just when we thought NVIDIA’s GeForce 6800 Ultra was the new king of the hill, ATI comes back and reclaims the performance crown in many popular benchmarks, including next generation titles such as Far Cry. This news must have come as a stunning surprise to NVIDIA, who have quickly whipped up an overclocked GeForce 6800 Ultra part to try and keep up, but even NVIDIA’s overclocked 6800 Ultra Extreme part falls short of ATI’s impressive clock frequencies. If you recall previous architectures, it was NVIDIA that was breaking new ground in clock speeds. But thanks to X800’s 160 million transistor design and 0.13-micron low-k process, ATI is able to scale to higher clock speeds than NVIDIA can with GeForce 6800 without consuming large amounts of power or generating a lot of heat. Basically, we continue to be impressed by ATI’s engineering team. They were able to push TSMC’s 0.15-micron process to new levels with RADEON 9700 PRO and its follow-up parts, RADEON 9800 PRO and RADEON 9800 XT. Now they’re doing the same with 0.13-micron. ATI’s 3Dc compression technology could be a huge trump card for the X800 series, provided it’s implemented by more game developers. The performance and quality benefits are dramatic, and the technology is easy to implement. ATI has one killer app onboard in Half-Life 2, and now they’re working on confirming more developers. Eventually ATI would like to get 3Dc (or some variant of it) integrated into a future version of DirectX. Considering that two of the three engineers who originally developed S3TC (Rick Bergman and Raja Koduri) hold senior positions within ATI, they’ve certainly got the experience to make this happen. For added flexibility, ATI has added a new temporal AA mode to their repertoire. Temporal AA isn’t the perfect solution, as it requires sustained high frame rates for the best image quality, but it’s nice to see ATI add another AA option for its users to play with. After all, if you don’t like it, you can always turn it off.
We’re also impressed by ATI’s driver quality. While NVIDIA’s ForceWare 61 driver is riddled with bugs, we found our experience with ATI’s beta driver for the X800 series to be much more pleasurable. For instance, the ForceWare control panel wasn’t always up to snuff, and frames would occasionally pause for three seconds or more in IL-2 Sturmovik: Forgotten Battles regardless of the GeForce card used. Right now, we’d have to give the driver quality edge to ATI, despite the fact that their current CATALYST driver doesn’t provide all of the features currently found in ForceWare 50, much less NVIDIA’s upcoming 60 release (although we’ve been told that ATI’s CATALYST team has quite a few goodies in the works for ATI card owners). We’d take stability over features any day of the week in a display driver, and this is exactly what ATI is delivering at this moment. ATI will also beat NVIDIA to market with X800. The X800 PRO has begun shipping to retailers as of today, while the high-end X800 XT Platinum Edition will be shipping on May 21st. The X800 PRO occupies the same $400 price point as the GeForce 6800 GT, while the $500 X800 XT Platinum Edition will duke it out with NVIDIA’s GeForce 6800 Ultra. As of right now, we’d take the X800 XT Platinum Edition over NVIDIA’s GeForce 6800 Ultra if we were in the market for a graphics upgrade. It’s a close call at the $400 price point, though. The X800 PRO has its fair share of positives, while the GeForce 6800 GT is no slouch either. We’ll have to wait for retail GeForce 6800 GT boards before we can give a definitive answer. But based on our performance results and overclocking, ATI’s X800 PRO definitely won’t disappoint if you’re looking to upgrade.
© Copyright 2003 FS Media, Inc.