GeForce-FX Comdex Preview
November 18, 2002 Brandon Bell

Summary: Reporting in very much live from Comdex, more specifically NVIDIA's suite at the Bellagio, Brandon and Alan have details on the GeForce FX (formerly NV30). Covering everything from basic stats like memory bandwidth to what "FX Flow" really is, you can't miss this read!


Introduction

Going into the holiday shopping season, NVIDIA finds itself in an unusual position: playing catch-up to ATI. Not only has NVIDIA lost the high-end segment, but in a matter of weeks its GeForce4 Titanium line will be under intense pressure from ATI’s RADEON 9500 family. On paper, RADEON 9500 PRO looks like it could be a killer of the GeForce4 Ti 4200, one of NVIDIA’s most successful products this year. The vanilla RADEON 9700 cards that will be unleashed by ATI’s add-in board partners also outmatch NVIDIA’s flagship GeForce4 Ti 4600.


Clearly NVIDIA needs an answer to the R300 core that is powering all of ATI’s DirectX 9 products. That answer is GeForce FX (formerly known by its codename, NV30).

Like RADEON 9700, GeForce FX is a next-generation graphics part designed for DirectX 9. NVIDIA claims that GeForce FX will offer three times the frame rate of GeForce4 Ti 4600 and three times the vertex processing of GeForce4 Ti 4600. But that’s just for starters.

Advanced Features

GeForce FX will also be the first graphics accelerator to ship with DDR2 memory. DDR2 allows for unprecedented data rates; in the case of GeForce FX, NVIDIA is shooting for a minimum effective data rate of 1GHz (a 500MHz clock, with data transferred twice per cycle). If NVIDIA is able to deliver on this figure, GeForce FX would boast twice the memory bandwidth of anything currently available on the market.
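The relationship between the 500MHz clock and the 1GHz figure can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names are made up for this example, and the bus width passed in at the end is a placeholder, since NVIDIA's bus width is not given in this preview.

```python
def effective_data_rate_mhz(clock_mhz: float, transfers_per_clock: int = 2) -> float:
    """Double data rate memory moves data twice per clock cycle."""
    return clock_mhz * transfers_per_clock


def peak_bandwidth_gb_s(clock_mhz: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s (10^9 bytes) for a given clock and bus width."""
    transfers_per_second = effective_data_rate_mhz(clock_mhz) * 1e6
    return transfers_per_second * (bus_width_bits / 8) / 1e9


# 500MHz DDR2 clock from the article, giving the 1GHz effective rate.
print(effective_data_rate_mhz(500))

# The bus width is an ASSUMPTION used only to exercise the formula.
print(peak_bandwidth_gb_s(500, bus_width_bits=128))
```

Peak bandwidth scales with both the effective data rate and the bus width, so the final number depends on a figure we don't have yet.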

But it doesn’t stop there. GeForce FX boasts an enhanced LightSpeed Memory Architecture 3, with one of the key additions being NVIDIA’s new color compression technology. This compression is completely lossless, meaning there is no reduction in image quality or precision. As a result, memory bandwidth is used more efficiently, especially when anti-aliasing is enabled. Speaking of anti-aliasing, GeForce FX sports two new modes: a 6XS mode for Direct3D and a new 8X mode for both OpenGL and Direct3D.

Some of you may be surprised by the name of NVIDIA’s next-generation part, “GeForce FX”, as NVIDIA CEO Jen-Hsun Huang was previously quoted as saying that the GeForce moniker would not be used with NV30. Apparently, the marketing team has changed its mind. In fact, NVIDIA boasts that GeForce FX is the first product to result from the 3dfx/Gigapixel technology it purchased two years ago, and the name GeForce FX was chosen to reflect this.

But GeForce FX is about much more than its DDR2 memory or fill-rate performance. It also boasts new pixel and vertex shaders that go beyond DirectX 9, features a 128-bit floating-point pipeline, and is currently being demonstrated with an exotic cooling solution. Let’s start with the increased precision GeForce FX offers.


FP16 and FP32 Precision

Increased precision

One of the key features that next-generation graphics accelerators such as GeForce FX and RADEON 9700 bring is support for higher-precision 16-bit and 32-bit floating-point formats per color channel (64-bit and 128-bit color across the four channels). For the ultimate visual quality, developers can use the 32-bit floating-point format, which delivers the same level of precision used in the film industry today.

Some of you may be wondering why there is a need for so many colors when no monitor is capable of displaying that range. As images become more complex, the 256 levels per color channel that today’s 32-bit accelerators are capable of generating aren’t enough to produce lifelike images. Rounding errors build up across intermediate calculations, resulting in images with visible artifacts. This occurs frequently with bump maps, smoke, and fog.
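To get a feel for how rounding at 8 bits per channel produces artifacts, here is a small Python sketch. It is a toy model and not how any GPU is implemented: a smooth 256-step gradient is darkened by a factor of 16, stored, then brightened again, once with 8-bit storage and once with 16-bit floating-point storage. The function names and the darken/brighten "passes" are invented for this illustration.

```python
import struct


def quantize_8bit(value: float) -> float:
    """Round a 0..1 color value to one of 256 levels, as an 8-bit channel does."""
    level = min(255, max(0, round(value * 255)))
    return level / 255


def quantize_fp16(value: float) -> float:
    """Round a value to the nearest 16-bit float using the struct 'e' format."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def darken_then_brighten(value: float, quantize) -> float:
    """Two toy passes: scale down by 16, store, scale back up by 16, store."""
    stored = quantize(value / 16)
    return quantize(min(1.0, stored * 16))


inputs = [i / 255 for i in range(256)]  # a smooth 256-step gradient
levels_8bit = {darken_then_brighten(v, quantize_8bit) for v in inputs}
levels_fp16 = {darken_then_brighten(quantize_fp16(v), quantize_fp16) for v in inputs}

print(len(levels_8bit), "distinct levels survive at 8 bits per channel")
print(len(levels_fp16), "distinct levels survive at FP16")
```

With 8-bit storage, most of the gradient's steps collapse into the same few values after the round trip, which is exactly the kind of banding you see in smoke and fog. The floating-point version keeps every step distinct.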


Many of today’s 3D games use hacks to work around the limitations of lower precision. Another common alternative is to perform these calculations on the CPU. Both workarounds limit what the programmer can do and lower performance.

Obviously, rendering an entire scene in 128-bit color mode would require a considerable amount of horsepower (and thus kill frame rates if the hardware isn’t able to keep up). With this in mind, GeForce FX also supports the 16-bit floating-point format, which increases performance in situations where full 128-bit color isn’t required.

Developers are free to move back and forth between the two formats, using whichever is ideal for their application. For instance, high-resolution textures can use the 32-bit floating-point format to create objects with an unprecedented level of detail, while other operations can use the 16-bit floating-point format and the increased performance it offers.

Performance

During cinematic sequences, game developers could use this flexibility to create highly detailed protagonists, while ordinary characters that aren’t crucial to the plot could be rendered with less detail. The end result is that developers can produce cinematic graphics in real time while optimizing the performance for every situation.

NVIDIA feels its GeForce FX GPU goes one step beyond RADEON 9700 PRO in that it truly supports 128 bits of color. RADEON 9700 PRO is limited to just 96 bits of precision, which could prevent it from matching GeForce FX’s performance in 64-bit mode, as the core may not be intelligent enough to split the data into 64-bit values. It also won’t be able to match the visual quality of GeForce FX in 128-bit color mode.



2.0 Vertex/Pixel Shaders

The Vertex shaders

The CineFX engine that the GeForce FX is based on supports up to 65,536 instructions for vertex processing, up from 128. This allows developers to write longer vertex programs to create objects that are more complex. Another area where the CineFX architecture excels is in character animation.


With DirectX 8 accelerators, if an object used vertices that could be affected by multiple bones, a vertex shader would have to be written for each one. With DirectX 9, one shader can be written to perform the same task, easing development for the programmer.

With RADEON 9700, however, the object must still be broken up and drawn separately. GeForce FX is unique in that it can branch the shader on a per-vertex basis, so the object does not need to be broken up. This improves performance and makes things easier for the developer.
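The idea of one shader handling vertices with different bone counts can be sketched in plain Python. This is not shader code, and it is simplified on purpose: bones are modeled as simple 3D offsets rather than full transform matrices, and every name here is hypothetical. The point is only that a single routine loops over however many bones influence each vertex.

```python
def skin_vertex(position, influences, bone_offsets):
    """Blend one vertex across however many bones influence it.

    position:     (x, y, z) tuple for the vertex
    influences:   list of (bone_index, weight) pairs; length varies per vertex
    bone_offsets: list of (x, y, z) offsets, a stand-in for bone transforms
    """
    x = y = z = 0.0
    for bone_index, weight in influences:  # per-vertex loop count
        ox, oy, oz = bone_offsets[bone_index]
        x += weight * (position[0] + ox)
        y += weight * (position[1] + oy)
        z += weight * (position[2] + oz)
    return (x, y, z)


bones = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]

# One routine handles a one-bone vertex and a two-bone vertex alike,
# so the mesh does not have to be split by bone count.
print(skin_vertex((1.0, 0.0, 0.0), [(0, 1.0)], bones))
print(skin_vertex((1.0, 0.0, 0.0), [(0, 0.5), (1, 0.5)], bones))
```

Without per-vertex branching, the equivalent would be one fixed-length routine per bone count, with the mesh sorted into groups and each group drawn separately.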

Another new addition brought by the 2.0 vertex shaders in DirectX 9 is flow control. This gives the developer the ability to use conditional branching and subroutines in calculations, and to terminate the vertex program early if certain conditions are met. If additional calculations aren’t necessary to improve the visual quality, the program can be terminated and the hardware can start working on the next vertex (or load the next vertex program) instead of going through the remainder of the program, as a DirectX 8 vertex shading engine would.
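Early termination is easy to picture with a toy lighting loop, again in plain Python rather than shader code. The scenario is an assumption chosen for illustration: a vertex is lit by several lights, and once its brightness has already reached the maximum, evaluating the remaining lights cannot change the result, so the routine returns early.

```python
def light_vertex(n_dot_l_values, intensities):
    """Accumulate diffuse lighting, exiting once the result is saturated.

    Returns (brightness, lights_evaluated) so the saved work is visible.
    """
    brightness = 0.0
    evaluated = 0
    for n_dot_l, intensity in zip(n_dot_l_values, intensities):
        evaluated += 1
        brightness += max(0.0, n_dot_l) * intensity
        if brightness >= 1.0:
            # Further lights cannot raise a clamped value: stop here.
            return 1.0, evaluated
    return brightness, evaluated


# Three bright lights: the third is never evaluated.
print(light_vertex([1.0, 1.0, 1.0], [0.6, 0.6, 0.6]))

# One dim light: the loop simply runs to completion.
print(light_vertex([0.5], [0.5]))
```

A fixed-length program with no flow control would always pay for all three lights, whether or not they affect the final color.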

Pixel shaders

DirectX 8 pixel shaders were limited to very simple effects. This was because they only supported a limited set of instructions that were focused on various texture-related operations. The 2.0 pixel shaders in DirectX 9 have all of the commands available in 2.0 vertex shaders plus a few instructions necessary for pixel processing.

Another advantage of 2.0 pixel shaders is that programs can be larger, allowing developers to create a wide variety of effects. Multiple textures can also be combined in a single one-pass shader for greater efficiency. In comparison, DirectX 8 pixel shaders support just eight instructions.

The key point to remember with the new 2.0 pixel and vertex shaders is that they are far more powerful than the 1.4 shaders available in DirectX 8. They are easier to write and more flexible, making life easier for developers. DirectX 9 shaders will also provide increased performance: many effects that would have required multiple passes in DirectX 8 can be performed in a single pass by DirectX 9 shaders. If you recall the Wolfman demo used for the GeForce4 launch, Mr. Wolf’s fur required eight passes for every pixel. With DirectX 9, GeForce FX can render Wolfman’s fur in a single pass.



SIDEBAR: For more details on the GeForce FX’s new pixel and vertex shaders, please refer back to our NV3x preview from July


GeForce FX’s Cooling

FX Flow

In order to extract the maximum performance from the 0.13-micron, copper-interconnect, flip-chip design of the GeForce FX, NVIDIA has yet again developed a custom cooling system for use on its premium boards. The FX Flow system is standard on all GeForce FX “Ultra” products, and guarantees that the enthusiast or die-hard gamer who is interested in the flagship version of the GeForce FX will not need to worry about compromises to the stability and overclockability of the product.

The Big Picture

From a big-picture perspective, heat pipes carry heat from the GPU and memory to the copper radiator. A high-speed fan brings in cool air from outside the case and passes it over the copper radiator. Doing this requires an extra slot. At first glance, this may look like ABIT’s OTES design. While the FX Flow certainly looks like the OTES, the new NVIDIA solution is unique in that it uses cool air from outside the case to cool the radiator and is able to cool the memory in addition to the GPU.

An obvious question to ask is how effective this cooling approach is, as compared to standard heatsink and fan combinations. We won’t know the definitive answer until the boards come in, but fortunately I have a first edition SMAD on my bookshelf, otherwise known as the “Space Mission Analysis and Design” reference textbook. This will help us figure out how a heat pipe is supposed to work, and how effective it is.


Heat Pipes 101

Heat pipes are proven designs that offer a highly conductive path from your heat source to your radiator. The heat released by the GPU converts the fluid inside the pipe to a gas that travels toward the radiator. At the radiator, the gas is cooled and condenses. Then, an internal wicking material brings the fluid back to the heat source, where the cycle repeats. Since there are no moving parts involved in the transfer of heat, the systems are exceptionally reliable. According to the SMAD, a well-designed heat pipe transfers 200 to 300 times as much energy as a solid copper bar of the same diameter. The art in designing a heat pipe involves selecting the fluid inside the pipe, the pipe material itself, and the operating conditions.
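To put the "200 to 300 times" figure in perspective, here is a rough Python sketch using Fourier's law for steady-state conduction through a solid rod, Q = k * A * dT / L. The copper conductivity of 389 W/(m-K) is the value from this article's conductivity table. Everything else is an assumption picked purely for illustration: the 6 mm diameter, the 10 cm length, and the 40 K temperature difference are not FX Flow specifications.

```python
import math


def conduction_watts(conductivity_w_mk, diameter_m, length_m, delta_t_k):
    """Steady-state heat flow through a solid rod: Q = k * A * dT / L."""
    area_m2 = math.pi * (diameter_m / 2) ** 2
    return conductivity_w_mk * area_m2 * delta_t_k / length_m


COPPER_K = 389.0  # W/(m-K), from the article's conductivity table

# HYPOTHETICAL geometry: 6 mm rod, 10 cm long, 40 K hotter at the GPU end.
copper_bar = conduction_watts(COPPER_K, diameter_m=0.006, length_m=0.10, delta_t_k=40.0)
print(f"solid copper bar: {copper_bar:.1f} W")

# Apply the 200x to 300x multiplier quoted from the SMAD.
print(f"heat pipe, same diameter: {200 * copper_bar:.0f} to {300 * copper_bar:.0f} W")
```

Under these made-up dimensions, a thin copper rod on its own moves only a few watts, which is why a plain metal bar isn't a practical way to carry a GPU's heat to a remote radiator.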

For a heat pipe to work correctly, the heat given off at the GPU should be enough to boil the fluid, and the radiator should remove enough heat for the vapor to condense. In other words, you need to know your temperature range beforehand and choose appropriate materials and conditions to match. Similarly, the wicking material should bring condensed liquid back to the heat source as quickly as possible. Selection of the pipe material itself is also important. We all know about thermal conductivity when it comes to heatsinks and trust copper designs.

SIDEBAR: GeForce FX contains 125 million transistors


Heat Pipe Design

Material                            Conductivity, W/(m-K)
Copper                              389
Aluminum Alloy 1100                 218
Aluminum Alloy 6063                 209
Aluminum Alloy 3003 (soda cans)     156
Stainless Steel Type 304            17.3
Ethylene Glycol                     2.42
Water                               0.536
Air                                 0.026

With heat pipes, however, this is not how you want to choose your material. Some combinations of working fluid and pipe material will slowly generate an inert gas that degrades performance.


The last interesting thing is that NVIDIA’s cooling system seems to have multiple pipes. We didn’t have a chance to get in touch with NVIDIA before press time, but there are a number of possibilities. One is that NVIDIA is simply using multiple heat pipes to increase the rate at which heat can be transferred from the GPU and memory to the radiator. It is possible, however, that the FX Flow is not a standard heat pipe but a capillary pumped loop or loop heat pipe. In this design, the wicking system is bypassed in favor of a circular loop: the liquid and gas travel in a circular path, never coming in contact with each other, and the driving force for the circulation is the pressure of the gas itself. This design is more interesting because the returning liquid does not encounter the hot gas, so you get better thermal transfer.

Even if this isn’t what the FX Flow is using, this is where I think the future of high-end PC graphics cooling is heading, with companies trying to develop better geometries for the pipes and radiators, as well as selecting better materials. What about liquid cooling? The real difference between passive and active cooling is that an electrical pump is used to move the fluid between the heat source and the cold plate. It’s pretty much the same principle, except that active pumping allows you to carry the heat a greater distance, so you can have an even better or larger radiator located outside the computer. It’s clearly not as practical, and it is not necessarily better.

See? Why bother reading any other GeForce FX overview article, when it’s only FiringSquad who’s got the edutainment factor that gives you the knowledge that you can use to impress your overclocking friends! Alright, it’s back to Brandon…



Conclusion

On paper, GeForce FX certainly looks exciting. We were also able to spend some hands-on time with GeForce FX and were impressed by what it brings to the table. The screenshots NVIDIA is currently distributing among the journalist pool don’t do GeForce FX justice. NVIDIA’s “Dawn” demo in particular reminded us of the character detail in the Final Fantasy feature film a few years ago. Individual pores on Dawn’s skin could be seen up close, and they weren’t just limited to her face. We’re hoping that NVIDIA will be posting AVIs of these demos in action so you can experience them for yourself.

In the meantime, we’ll have to wait a bit longer for GeForce FX. NVIDIA is currently shooting for a January/February timeframe, nearly a year after GeForce4’s initial debut. NVIDIA is shooting for a minimum 500MHz core/500MHz memory clock rate, although we’re unsure if this figure applies to their flagship card or one of the less expensive variants.


We’ll also be reporting more details on NVIDIA’s performance later today during the official press launch, including benchmark results, so be sure to check back tomorrow for those scores. Then on Tuesday we’ll be sitting down and chatting with NVIDIA engineers for even more details.

One thing is for sure: GeForce FX certainly looks more impressive than RADEON 9700 PRO on paper. But the question remains, when will NVIDIA deliver? By the time GeForce FX hits store shelves, ATI will likely be putting the finishing touches on its follow-up to RADEON 9700, codenamed R350. If that’s the case, GeForce FX’s reign on the throne may be short-lived.


SIDEBAR: What do you think of GeForce FX? Will NVIDIA reclaim the performance throne from ATI, or is ATI here to stay? Voice your thoughts in the news comments!

© Copyright 2003 FS Media, Inc.