
PhysX Performance with GeForce
August 06, 2008 Brandon Sandman Bell

Summary: Later this month NVIDIA will open up GeForce-based PhysX processing to their entire range of GeForce 8/9 and GTX 200 GPUs. In this article we take a look at their performance (as well as ATI's Radeon HD 4000 series), in four different PhysX applications. What kind of performance can you expect? Find out inside!


Introduction


AGEIA was the first and only company to provide dedicated hardware for in-game physics. Their PhysX card launched with tons of buzz in 2006 with the company touting that PhysX would change the way games are played. Initially AGEIA was aiming for effects such as realistic smoke and explosions, and fluid and cloth animation, but eventually AGEIA envisioned a future where the physics effects produced by their PhysX processor could affect gameplay elements.

Taking PhysX from concept to the public

After taking a look at a card sporting AGEIA’s PhysX PPU, we concluded by saying that the card needs a killer app that really takes advantage of the chip. As the months of waiting wore on, the PPU concept slowly faded from the forefront of our minds to the complete backburner; it was an interesting idea, and AGEIA had certainly put together a few really nice technology demonstrations, but without that killer app the technology just wasn’t very compelling.

To their credit, AGEIA did not give up. While a blockbuster application that really took full advantage of PhysX didn’t arrive in 2006, the company was working hard behind the scenes on expanding the PhysX platform. The PhysX SDK was released to PlayStation 3 developers in early 2006, and AGEIA signed on dozens of additional game developers later that year, including BioWare. Things really heated up in 2007. Not only did we see the release of Unreal Tournament 3, but CellFactor: Revolution and Warmonger were also released for free to the public.

What really got the ball rolling for AGEIA, in our opinion, was ironically Intel’s purchase of Havok, rumored to be in the ballpark of $110 million. When a company like Intel forks over that kind of money, it’s a big deal. Intel’s purchase validated everything Havok and, to a lesser extent, AGEIA were doing, and in the process AGEIA became the only independent company left with physics technology. Up to that point both ATI and NVIDIA had publicly downplayed the importance of AGEIA’s PhysX, promoting GPU-based physics via Havok FX instead.

With the Havok FX initiative floundering and now in the hands of Intel, AGEIA’s status as the last man standing suddenly made them highly attractive to both AMD and NVIDIA, and ultimately NVIDIA purchased AGEIA in February of this year with the goal of bringing PhysX to the GeForce platform of GPUs. AGEIA still didn’t have a killer app, but with over 140 titles in development or shipping with PhysX support, and full support for all the major game consoles, NVIDIA saw a huge opportunity that they couldn’t pass up. With NVIDIA’s huge network of game developers participating in The Way It’s Meant To Be Played program and over 70 million programmable GPUs floating around out there, it seemed like the perfect fit.

Bringing PhysX to GeForce

Immediately after the AGEIA acquisition, work began on porting AGEIA’s PhysX SDK using CUDA, and now this effort is about to pay off. On August 12th NVIDIA will be opening up GeForce-based PhysX support to all GeForce 8, 9, and GTX 200 series users. Today we’re going to take a quick look at what’s in store for gamers later this month!



Installation and SLI PhysX

You’re going to need a few things before you can start enjoying PhysX on your CUDA-enabled GeForce 8, 9, or GeForce GTX 200 GPU. For starters you’ll need the right graphics driver. We conducted all of our tests with a slightly different build of ForceWare 177.79 than the driver that’s available on nvidia.com. NVIDIA will be providing this PhysX-enabled graphics driver for everyone on the 12th.

The second thing you’ll need is the latest PhysX driver; our tests were run with NVIDIA’s 8.07.18 PhysX driver. For existing games, you will also need the latest PhysX-enabled patch.

[image]


Once you’ve got all that in place, you’re ready to go. PhysX can be toggled on or off via the PhysX control panel. From the PhysX control panel you can also check out a few PhysX demos, and if you’ve got two cards installed in your system, you can also select your PhysX SLI mode.


PhysX SLI

For running PhysX across multiple GPUs, NVIDIA provides two modes: conventional SLI, and multi-GPU mode.

In SLI mode, each GPU in your system shares the physics processing and graphics workload evenly. This is the default mode when SLI is enabled in the ForceWare control panel.

[image]


In multi-GPU mode, one GPU handles all graphics duties, while the second card is solely responsible for tackling all physics commands. Under this mode end users can mix and match different cards. For instance, someone who purchased a GeForce 8800 GT last year can run this card for physics, and then pick up a brand new GeForce GTX 280 to run as the primary graphics card for gaming. An SLI motherboard isn’t required for this mode either; both SLI and non-SLI motherboards are supported.

One limitation of this mode under Vista is that a second monitor must be attached to run PhysX on the secondary GPU, and you must extend the Vista desktop onto this secondary monitor. According to NVIDIA this limitation is related to the Windows Vista display driver model (WDDM). However, the same limitation does not exist in Windows XP.

If you don’t have a second monitor available, you can also run a second display cable from your monitor to the second GeForce card running PhysX.

NVIDIA is working on a workaround for this issue, but we don’t have an ETA on when this will be available or how it will work.
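To make the difference between the two modes concrete, here is a minimal Python sketch of how the work is split in each one. It is a toy model and not NVIDIA's driver logic: the `Gpu` class, the `assign_workloads` function, the mode strings and the OS names are all hypothetical names made up for illustration. The only rules it encodes are the ones described above: SLI mode shares graphics and physics across both GPUs, multi-GPU mode dedicates the second card to physics, and under Vista the physics card needs a display attached.

```python
from dataclasses import dataclass


@dataclass
class Gpu:
    name: str
    display_attached: bool  # hypothetical flag: is a monitor or cable connected?


def assign_workloads(mode, gpus, os_name):
    """Return a list of (gpu name, workloads) pairs for a PhysX mode.

    Toy model only; the mode strings and OS names are made up for this sketch.
    """
    if mode == "sli":
        # SLI mode: every GPU takes a share of both graphics and physics.
        return [(gpu.name, ["graphics", "physics"]) for gpu in gpus]
    if mode == "multi-gpu":
        if len(gpus) != 2:
            raise ValueError("this sketch only models two GPUs")
        primary, secondary = gpus
        # Under Vista the physics GPU needs a display (or a second cable
        # from the same monitor) attached; XP has no such requirement.
        if os_name == "vista" and not secondary.display_attached:
            raise RuntimeError("attach a display to the physics GPU under Vista")
        return [(primary.name, ["graphics"]), (secondary.name, ["physics"])]
    raise ValueError(f"unknown mode: {mode}")


# Mix-and-match example from the text: a new GTX 280 renders,
# while an older 8800 GT is dedicated to physics.
cards = [Gpu("GeForce GTX 280", True), Gpu("GeForce 8800 GT", False)]
print(assign_workloads("multi-gpu", cards, "xp"))
```

Calling the same function with `"vista"` and no display on the second card raises an error in this sketch, which mirrors the second-monitor requirement described above.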

To answer some of the questions surrounding SLI PhysX, NVIDIA has come up with the following FAQ:

Question: Which graphics cards can accelerate NVIDIA PhysX?

A: All GeForce 8, 9, and GTX 200 series graphics cards with at least 256MB of local onboard graphics memory will be able to accelerate NVIDIA PhysX.

Question: Do you support PhysX on SLI now?

A: Running PhysX in SLI performance mode (SLI on), where each GPU renders a separate frame that includes both graphics and physics workloads, is supported in our August Release 177 drivers, but it’s still a work in progress. Our initial focus with these drivers is to either use a single GPU to run graphics and physics, or run graphics on one GPU and physics on a second GPU. The latter “multi-GPU” configuration currently has a limitation under Vista (not XP) that requires a monitor attached to each GPU (extending the desktop to the second monitor), or a monitor cable attached to both GPUs and connected to a single monitor. Future drivers will remove this limitation.

Question: How will PhysX scale from single card to 2-way and 3-way SLI in the future?

A: PhysX will support multi-GPU setups. Future drivers will deliver better performance and user experiences.

Question: Can I run PhysX on a different card than the one I use for gaming? When will you release support for asymmetrical multi-GPU configurations?

A: Yes. PhysX will support asymmetric multi-GPU configurations within the next few months. There is some limited Beta support in the first release, but the support will improve in future releases. This will enable users to still use their older 8 or 9-Series card when they upgrade to a newer card based on the GeForce GTX 280 for example.

Question: Can I run PhysX on my motherboard GPU?

A: PhysX uses NVIDIA CUDA technology, so if the motherboard GPU supports CUDA, then it can be used for PhysX. Generally running both graphics and PhysX on a motherboard GPU may not deliver the best experience; adding a discrete GeForce GPU significantly improves performance.

Question: Does PhysX scale across the GPU and CPU? If yes, does that mean having a faster CPU enhances PhysX performance or visual quality?

A: PhysX uses both the CPU and GPU, but generally the most computationally intensive operations are done on the GPU. A CPU upgrade could result in some performance improvement, as would a GPU upgrade, but the relative improvement is very dependent on the initial balance of the system. An optimized PC with the right mix of CPU to GPU horsepower will be the best balanced solution.

Question: How does PhysX support heterogeneous computing?
A: PhysX shows how heterogeneous computing delivers the best user experience. While the game is running, the PhysX system executes portions of the physical simulation on the CPU and other portions on the parallel processors of the GPU. This ensures all the components of a balanced PC are used efficiently to deliver the best experience.

Question: Why do some PhysX demos only run on one CPU core?
A: PhysX fully supports applications which are multi-threaded to leverage multiple CPU cores. Some single-feature PhysX demos consist of only one type of physics body which runs more efficiently on a single CPU core. If the same demo was part of a more complete game environment, there would be multiple PhysX objects and active game and rendering threads, so the game would leverage more than one CPU core more efficiently. Improving the way individual PhysX objects can be efficiently spread over multi-processors is being actively researched and developed.

Question: Will running PhysX on a GPU slow down gaming performance?
A: Running physics on the GPU is typically significantly faster than running physics on the CPU, so overall game performance is improved and frame rates can be much faster. However, adding physics can also impact performance in much the same way that anti-aliasing impacts performance. Gamers always enable AA modes if they can because AA makes the game look better. Gamers will similarly enable physics on their GPUs so long as frame rates remain playable. With AA enabled, running physics on a GPU will generally be much faster than running physics on a CPU when AA is enabled. PhysX running on a dedicated GPU allows offloading the PhysX processing from the GPU used for standard graphics rendering, resulting in an optimal usage of processing capabilities in a system.

Question: Intel and AMD say it’s better to run physics on the CPU. What is NVIDIA’s position?

A: PhysX runs faster and will deliver more realism by running on the GPU. Running PhysX on a mid-to-high-end GeForce GPU will enable 10-20 times more effects and visual fidelity than physics running on a high-end CPU. Portions of PhysX processing actually run on both the CPU and GPU, leveraging the best of both architectures to deliver the best experience to the user. More importantly, PhysX can scale with the GPU hardware inside your PC. Intel and AMD solutions, which utilize the Havok API, are fixed function only and cannot scale.



System Setup

Intel Core 2 Extreme QX9770

EVGA nForce 790i Ultra SLI motherboard (for GeForce cards)
ASUS P5E3 Premium WiFi AP Edition (for Radeon cards)
4GB OCZ DDR3 @ 1333MHz

GeForce GTX 280
GeForce GTX 260
GeForce 9800 GTX
GeForce 9600 GT
GeForce 8800 GT
ForceWare 177.79

AMD Radeon HD 4850
AMD Radeon HD 4870
Catalyst 7.7

300GB Western Digital Caviar SE

Windows Vista Ultimate 64-bit w/Service Pack 1


Notes

We’ve only had NVIDIA’s PhysX driver for a handful of days, so unfortunately we had to resort to testing one common resolution among all the cards. Normally we like to include a couple of resolutions in our benchmarks. We chose a mixture of 1280x1024 (the only resolution supported in Nurien) and 1600x1200. As you can imagine, it takes some time for some of the results with slower hardware to generate, so we also kept the AA in check, but we did test the games with their highest settings.



Unreal Tournament 3 with PhysX

[image]


Arguably the most prominent game to support PhysX is Epic’s Unreal Tournament 3. For most maps, the physics load is quite light and you’d hardly notice anything special, so you’ll need to download the PhysX mod pack to really showcase what PhysX can do. The mod pack contains three custom levels: Heat Ray, Tornado, and Lighthouse. As its name implies, the Tornado map consists of a large tornado that’s sweeping through the map. With its high winds, the tornado shreds buildings, and if you get too close, it will suck you right in. The map also contains several destructible areas. Lighthouse takes that to another level: practically the whole world can be destroyed, and it’s highly intensive.

[image]

Heat Ray also has a few destructible areas and features a gravity gun that lobs debris, hail, and other nearby objects at your opponent. This is the level we used for all our testing (we conducted our tests with a 30-player botmatch).

So how did the GeForce cards perform in UT3? Let’s see:




As you’d expect, the GeForce GTX 260 and 280 cards pulled away from the older GeForce boards. The 260 ran 35% faster than the GeForce 9800 GTX, while the GTX 280 was 5% faster than the GTX 260.

All of the PhysX-enabled GeForce boards ran significantly slower once PhysX effects were run on the CPU. GeForce GTX 280 performance is nearly three times slower when PhysX is running on the CPU, versus running those same effects on the GTX 280 GPU.

CPU PhysX performance is even worse with the older GPUs. The GeForce 9600 GT and 8800 GT both ran 3.5 times slower, while the 8800 GTS 640MB is four times slower with UT3 PhysX running on the CPU.
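For readers who want to check figures like "35% faster" or "3.5 times slower" against their own runs, the arithmetic can be sketched in a few lines of Python. The function names are our own, and the frame rates below are hypothetical placeholders chosen to reproduce those two ratios; they are not measurements from this article.

```python
def times_slower(gpu_fps, cpu_fps):
    """How many times slower the CPU PhysX run is than the GPU PhysX run."""
    return gpu_fps / cpu_fps


def percent_faster(fast_fps, slow_fps):
    """How much faster one card is than another, as a percentage."""
    return (fast_fps / slow_fps - 1.0) * 100.0


# Hypothetical frame rates, NOT measurements from this article.
gpu_physx_fps = 42.0
cpu_physx_fps = 12.0
print(round(times_slower(gpu_physx_fps, cpu_physx_fps), 2))  # 3.5
print(round(percent_faster(54.0, 40.0), 1))                  # 35.0
```

Note that the two measures are not interchangeable: a card that is 35% faster than another leaves the slower card about 26% behind, not 35%.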



Warmonger


[image]


Based on Unreal Engine 3, Warmonger: Operation Downtown Destruction is a totally free multiplayer shooter developed by NetDevil. The game uses PhysX extensively. The entire environment is designed to essentially be destroyed. Don’t want to go down a set path to reach your enemy? Fine, blow a hole in the wall to create a new way. The game features realistic smoke and fog effects that are affected by the wind and any explosions you may set off. You can even create your own cover, but be careful, because your cover can easily be destroyed with a few well-placed shots.

Basically if you enjoy blowing stuff up, you’ll enjoy watching the explosions in Warmonger.

[image]

Once again the GeForce cards saw a huge boost in performance once the game’s PhysX effects were run on the GPU rather than the CPU:




GeForce-based PhysX performance is way ahead of PhysX performance on the CPU. As in UT3, the GTX 260 and 280 ran nearly three times faster on the GPU than on the CPU. The older GeForce boards ran about three times slower on the CPU as well.

Once PhysX is enabled on the GPU, the GeForce 8/9 series GPUs trail the GTX 200 GPUs by over 30%. Considering the differences in their architectures, the 8/9 series GPUs also perform awfully close to one another. The 9600 GT runs just 8% slower than the GeForce 9800 GTX, despite the fact that it has half the shaders. In UT3 the 9600 GT was 20% slower than the 9800 GTX.



MKZ

Short for Metal Knight Zero, MKZ is another game that will be offered for free when it’s released, with the developer making money instead via microtransactions. The game is part MMO, part FPS (MMOFPS). We don’t know much about the game other than that it is set in the near future and is being developed by Chinese developer Object Software. The game is built around a custom game engine with PhysX support added on top.

[image]


Like the previous games we’ve tested, MKZ features destructible objects. Billboards, barrels, boxes, etc. all react realistically when you shoot them. It also supports realistic explosions. According to NVIDIA, up to 2,000 particles can be released at once from one explosion. The game also uses PhysX for cloth simulation. Flags will sway in the wind realistically according to wind speed and direction. They can also be torn. Each hanging curtain in our test demo has 2,000 vertices.

[image]

The game is still in the pre-alpha stage, so it’s still very early in development, but we were given a sneak peek into what the game will offer when it’s released, and you will get a chance to check it out for yourself on the 12th as part of NVIDIA’s first PhysX pack.




Notes

The GeForce 8/9 series GPUs perform much closer to GTX 200 in MKZ. We asked NVIDIA’s James Wang what could potentially be causing this and he theorized that MKZ isn’t taking advantage of the GTX 200’s additional stream processors:

You will find cases with early PhysX apps where a G92 part performs similarly, or sometimes faster than a GT200 when the PhysX workload is not very high.

When the workload is moderate, there are not enough threads to saturate all 240 stream processors. So if for MKZ, the PhysX workload saturates 128 stream processors, then the G92 will actually run faster than GT200, since its SPs are clocked higher. (PhysX and graphics threads do not execute in the same clock cycle, they are time sliced.)

Of course, for graphics GT200 will still run faster, so on balance it may be similar.

As apps mature they'll use larger workloads. Then these corner cases should disappear.
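The saturation argument in the quote above can be illustrated with a small Python sketch. This is a deliberately crude model and rests on several assumptions of ours: PhysX throughput is taken to scale with the number of busy stream processors times the shader clock, one thread is assumed to keep one SP busy, and the shader clocks used are the public reference clocks of the GeForce 9800 GTX (G92) and GeForce GTX 280 (GT200) rather than figures from this article. The function name and dictionaries are hypothetical.

```python
def relative_physx_throughput(active_threads, stream_processors, shader_clock_mhz):
    """Crude estimate: busy stream processors multiplied by shader clock."""
    # A workload cannot keep more SPs busy than it has threads to run.
    busy_sps = min(active_threads, stream_processors)
    return busy_sps * shader_clock_mhz


# SP counts are from the quote; clocks are assumed reference shader clocks.
G92 = {"stream_processors": 128, "shader_clock_mhz": 1688}    # GeForce 9800 GTX
GT200 = {"stream_processors": 240, "shader_clock_mhz": 1296}  # GeForce GTX 280

for threads in (128, 240):
    g92 = relative_physx_throughput(threads, **G92)
    gt200 = relative_physx_throughput(threads, **GT200)
    winner = "G92" if g92 > gt200 else "GT200"
    print(f"{threads} threads: G92={g92}, GT200={gt200}, faster: {winner}")
```

With only 128 threads the higher-clocked G92 comes out ahead in this model, while a workload large enough to fill all 240 stream processors favors GT200, which matches the behavior described in the quote.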




Nurien

[image]


Nurien is another game in development that we’re testing with today. It’s a social networking game powered by Epic’s Unreal Engine 3. You’ll start the game by creating your own custom 3D avatar. Based on the demonstration we were given at NVIDIA’s Editor’s Day, you’ll have a wide variety (3,000+) of things to tinker with -- Nurien’s character creation system reminds us a lot of Sony’s PlayStation Home.

[image]

Once your avatar is fully decked out, you’ll use your character in a variety of mini-games ranging from dancing (think DDR) and singing (SingStar) to trivia, all online. The co-founder of Nurien likens it to a 3D version of MySpace: “We wanted to give users a powerful new vehicle for self-expression that breaks down real world limitations. We at Nurien believe that if people created homepages in the ‘90s to express themselves, and if they have migrated to blogging and using social networking sites like Facebook and MySpace today, then the next natural step will be to inhabit fun, virtual worlds with user-created avatars.” Considering Nurien Software’s Korean roots, this game will probably be a huge hit over there.

Once again, the GeForce GTX 200 cards shine:




In comparison to the CPU, GPU-based PhysX shines in Nurien, although the performance gain isn’t quite as significant in this app as it was in others. In only one case was the GPU 2X faster than the CPU (GeForce 8800 GTS 640MB). The GeForce GTX 260 and 280 were separated by 6% in our testing, with the 9800 GTX finishing another 16% behind the GTX 260.




SLI PhysX Performance


To see how the various GeForce cards perform in this mode, we’ve gathered four different multi-GPU setups: one with two GeForce GTX 280 cards, a second with two GeForce 9800 GTXs, and a third with dual 9600 GTs. Finally we included a mix and match setup which paired one GeForce 8800 GT with a 9600 GT, with the 8800 GT handling graphics work and the 9600 GT tackling physics:






The dual GeForce GTX 280 rig is completely CPU bound in all our tests. As a result, the 2xGeForce 9800 GTX config is able to match the GTX 280s in performance. Due to time constraints (we’ve had the PhysX driver for only 5 days) we didn’t get a chance to try the GTX 280s in a more real-world scenario with higher resolutions and AA/AF. In this environment the GTX 280s would likely pull a little further away from the other testbeds.

With the exception of MKZ, the mix and match multi-GPU results looked very promising: performance is basically on par with dual GeForce 9800 GTXs at a lower cost.



Radeon vs GeForce Performance







With the Radeon 4850 and 4870 boards running PhysX via the CPU, it’s no surprise, based on the benchmarks on the previous pages, to see the GeForce cards well ahead of ATI’s latest offerings. Even the GeForce 9600 GT and GeForce 8800 GTS 640MB are well ahead of RV770.

However, if you scroll back to the other pages and look up the CPU PhysX numbers, you’ll see that both Radeon boards are faster than the GeForce 9800 GTX, and the RV770 boards put up numbers similar to the GeForce GTX 260 running PhysX on the CPU. Kinda makes you wonder how well the ATI cards would fare if they supported PhysX directly, huh?



Final thoughts


But there are some lingering questions. Now that the installed base is there, will game developers finally devote more effort towards integrating more compelling physics effects into their games? Obviously AGEIA managed to sign on a ton of game devs before they were purchased by NVIDIA, and Epic’s PhysX integration into Unreal Engine 3 – the most prevalent game engine on the market right now – certainly helps. But are developers ready to take in-game physics to that next level of immersion just yet? As we’ve been saying all along, it’s ultimately the quality of the games that will really kick physics into overdrive.

So which upcoming PhysX games can we look forward to? NVIDIA provides the following list:

  • Aliens: Colonial Marines
  • Backbreaker
  • Bionic Commando
  • Borderlands
  • Cryostasis
  • Empire: Total War
  • Mirror’s Edge
  • MKZ
  • Nurien

We saw a few of these titles at NVIDIA’s GeForce GTX 200 Editor’s Day back in May; the football game Backbreaker really showcased the potential for PhysX and we now wish EA would integrate the tech into their next Madden game. Unfortunately we haven’t seen the PhysX implementation for Mirror’s Edge. Like UT3 was last year, this is another highly anticipated game that we’re looking forward to.

And what about AMD? Right now they’re focusing their physics efforts on Havok. As far as they’re concerned, the GPU and the CPU should both play an important role in game physics. In cases where it’s more efficient to handle physics on the CPU, they feel the CPU should be used. In cases where it makes sense to offload physics onto the GPU, their RV770 GPU should be used to handle those effects (Havok and ATI are still working on which physics effects, if any, could be implemented on the GPU). Officially ATI has no plans to embrace PhysX at this time.

This is an awful shame in our opinion. Apparently ATI/AMD can work in harmony with Havok/Intel on this topic, but ATI can’t work with NVIDIA. Old habits are apparently hard to break in this case.

UPDATE 8/14/08: Looking for benchmarks with AGEIA's dedicated PhysX card? If so, you'll want to check out our PhysX Performance Update article, where we compare the performance of a dedicated PPU to the GPU and CPU.
© Copyright 2003 FS Media, Inc.