Summary: Futuremark has incorporated a number of changes into 3DMark Vantage. The benchmark includes a new suite of tests designed to test 3D performance, GPU-based physics, CPU performance, and CPU/PPU-based physics. Join us as we explore the performance of over two dozen cards ranging from the Radeon 3450 and GeForce 8400 GS all the way up to the GeForce 9800 GX2 and 3870 X2 in CrossFire and SLI configurations. Which card comes out on top in the $50 bracket? $200? $300+? Find out in this article!
Why use 3DMark?
Technology pundits like to debate the role of synthetic benchmarks. Some say that real-world "minimum fps" testing is the only way to go. After all, isn't the purpose of reviews and benchmarks to help you decide what to buy? This same camp argues that synthetic benchmarks are susceptible to manufacturer optimization and often do not reflect real programming practices: they may lack the hand-tuning that shipping games receive, or may arbitrarily use inefficient code to "stress" the hardware.
What’s new in 3DMark Vantage?
The short answer: DirectX 10. A $5 million budget. No "free" version.
But the $5 million budget really shows. Not only does 3DMark Vantage have considerably upgraded tests which stress even the most powerful of GPUs, but there is now a robust set of CPU benchmark tools as well.
Besides the new tests, Futuremark has also incorporated a new scoring system built around four presets: Entry, Performance, High, and Extreme. Each preset has its own group of graphics settings. The Entry preset, for instance, runs at a lower resolution (1024x768) than the Extreme preset, which runs at 1920x1200. The higher presets also turn on features such as AA/AF. These graphics settings apply only to the game tests; the CPU and feature tests run at the same settings regardless of the preset you select.
When your final overall Vantage score is generated, it includes the preset used. For instance, if you used the “Performance” preset to test your GPU, your score will start with the letter “P” followed by your Vantage score (P3000, for instance). The overall score applies to that preset only; in other words, you can’t compare 3DMark scores run with the Performance preset against scores run with the High preset. Futuremark says that higher presets emphasize graphics performance more heavily than lower presets when determining the overall 3DMark score.
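The preset prefix makes a simple comparison guard easy to sketch. The snippet below is illustrative only: `parse_score` and `comparable` are hypothetical helper names, and the prefix letters other than "P" (E, H, and X) are assumed here rather than taken from the description above.

```python
# Hypothetical helpers for handling preset-tagged scores such as "P3000".
# Only the "P" prefix is confirmed by the article; E, H and X are assumed.
PRESETS = {"E": "Entry", "P": "Performance", "H": "High", "X": "Extreme"}


def parse_score(label):
    """Split a label like 'P3000' into (preset name, numeric score)."""
    prefix, digits = label[:1].upper(), label[1:]
    if prefix not in PRESETS or not digits.isdigit():
        raise ValueError(f"not a preset-tagged score: {label!r}")
    return PRESETS[prefix], int(digits)


def comparable(label_a, label_b):
    """Scores are only comparable when both were run with the same preset."""
    return parse_score(label_a)[0] == parse_score(label_b)[0]


print(parse_score("P3000"))
print(comparable("P3000", "P4200"))  # same preset
print(comparable("P3000", "H3000"))  # different presets
```

The point of the guard is that a P3000 and an H3000 are not the same result, even though the numbers match.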
The following chart summarizes the settings for each preset in 3DMark Vantage:
Game Test 1: Jane Nash
All of the usual suspects are featured here: complex shadow maps, 16-bit floating point HDR lighting, and post-processing effects (motion blur, bloom, etc.). Beyond those, the attention to physics is increased. There are particle systems and fluid surface simulations, seamlessly blended into the scene.
"Jane Nash" is reminiscent of a '60s spy flick along the lines of James Bond crossed with Perfect Dark or No One Lives Forever. This indoor scene has complex geometry, simulations, and multiple dynamic lights. 3DMark Vantage's whitepaper notes:
The Jane Nash test scene represents a large indoor game scene with complex character rigs, physical GPU simulations, multiple dynamic lights, and complex surface lighting models. It uses several hierarchical rendering steps, including for water reflection and refraction, and physics simulation collision map rendering. The following features are specific to this scene:
When you first view the scene, you might not feel as if it's better than the "best" first-person shooters on the PC right now. Part of this is art direction, but a substantial part is the cost of physics. Hierarchical rendering and game physics are best appreciated in motion, and when the demo is only running in the single-digit fps range on most PCs, it's easy to discount the test as being inefficient or not complex enough. In truth, we believe it'll be a good synthetic test going forward.
Game Test 2: Calico
This space scene tests a completely different set of features. While the core architecture/engine is the same (HDR lighting, etc.), this scene is focused on a more traditional world in which you have a large number of rigid objects with superb lighting, as opposed to fancy cloth simulation. This means that, cycle for cycle, Calico's approach offers a richer visual experience. Futuremark's white paper states:
The Calico test scene represents a vast space scene with lots of moving but rigid objects and special content like the planet and asteroid belt. The following features are specific to this scene:
People have always said that Futuremark makes benchmarks that you wish you could play. Call it nostalgia, but I hope Futuremark Game Studios decides to make a game based upon Test Scene 2; I'd probably be the first in line. A thematic descendant of Proxycon, the Calico test scene features a huge space battle along the lines of the opening in Star Wars Episode III: Revenge of the Sith. Origin (Wing Commander, Strike Commander) used to be *the* developer who made you want to upgrade your PC. Perhaps Futuremark Game Studios will be able to do that with their games.
In favor of developing a space shooter is the fact that development costs will be cheaper than other games such as a first person shooter or real-time strategy game due to ease of level design and a reduced artwork burden.
While the two graphics tests will receive the most attention, the CPU and feature-specific tests in 3DMark Vantage are also an impressive upgrade. These tests minimize the graphics load to such a degree that an "average" GPU will be able to handle them without trouble. There aren't any complex shaders or post-processing except for the "full-time HDR" renderer.
CPU Test 1: AI
Along the lines of Intel's technology demo, Ice Storm Fighters, this test uses a 3D path-finding algorithm to stress today's multicore CPUs. This feature will likely be important in the next generation of real-time strategy games. In this test, each plane must fly through the gates in order. Based upon a physically based flight model (what can feasibly be done in terms of changing direction, etc.), a number of random candidate paths are generated, and the best of those paths is selected. This is repeated for every plane in the scene. One thread per CPU core is used.
Futuremark's white paper goes into extra detail to defend their algorithm as being a "reasonable" approach. On one hand, it seems inefficient to come up with a bunch of potential solutions and then to go for the best solution in that group. Why not solve for the perfect solution first? There are a few reasons. First, sometimes there is no ideal path, and with this system the responsiveness of the path-finding algorithm takes priority. Second, it ensures built-in randomization so that objects don't look like robots. Lastly, the algorithm is modular. By changing the "rules" of the flight model and the selection criteria defining "the best path," this algorithm can be used to determine a path for tanks, troops, and other types of airplanes.
I thought Calico had the most potential for a first-time developer's game. Once you develop a robust space engine, it's easier to create new levels/missions than would be the case with a first-person shooter. With this path-finding algorithm, add +1 point to the speculation that Futuremark's first game is going to rely on "a ton of bad guys" on-screen.
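The generate-and-select idea can be sketched in a few lines. This is a simplified illustration, not Futuremark's implementation: it works in 2D rather than 3D, runs on a single thread, and the turn limit, speed, candidate count, and "closest to the gate" selection rule are all assumptions made for the example.

```python
import math
import random

MAX_TURN = math.radians(15)  # assumed per-step turn limit of the flight model
SPEED = 1.0                  # assumed distance flown per step


def fly(pos, heading, turns):
    """Apply a sequence of turn angles and return the end position."""
    x, y = pos
    for turn in turns:
        heading += turn
        x += SPEED * math.cos(heading)
        y += SPEED * math.sin(heading)
    return (x, y)


def best_path(pos, heading, gate, rng, candidates=32, steps=10):
    """Generate random feasible paths and keep the one ending nearest the gate."""
    best_turns, best_dist = None, float("inf")
    for _ in range(candidates):
        # Every candidate respects the flight model's turn limit.
        turns = [rng.uniform(-MAX_TURN, MAX_TURN) for _ in range(steps)]
        dist = math.dist(fly(pos, heading, turns), gate)
        if dist < best_dist:
            best_turns, best_dist = turns, dist
    return best_turns, best_dist


rng = random.Random(42)
planes = [((0.0, 0.0), 0.0), ((0.0, 5.0), math.pi / 2)]
gate = (8.0, 4.0)
for pos, heading in planes:  # repeat for every plane in the scene
    turns, dist = best_path(pos, heading, gate, rng)
    print(f"plane at {pos}: best candidate ends {dist:.2f} from the gate")
```

Swapping `fly` for a different movement model, or the distance check for a different selection rule, is what would adapt the same loop to tanks or troops.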
CPU Test 2: Physics
This test uses the Ageia PhysX library to simulate planes breaking up into pieces. When an Ageia PPU is enabled, this test is hardware accelerated. In this test, planes crash into each other. The break-up on collision is physics-simulated, as is the colored smoke. Cloth and soft-body material properties are used for the gate. To parallelize the task, one gate is computed per CPU core.
A PPU is capable of computing four gate pairs, but requires one CPU core to run the rigid-body simulation. This was done to ensure that peak utilization of the CPU and PPU is maintained.
Besides the new game and CPU tests, Futuremark has also reworked its feature tests, which test specific aspects of the GPU such as fill rate and shader performance. The feature tests do not affect the overall 3DMark score.
The following descriptions of the feature tests come from Futuremark’s reviewer’s guide for 3DMark Vantage:
Feature Test 1: Texture Fill
This test draws frames by filling the screen rectangle with values read from a tiny texture using multiple texture coordinates. The texture coordinates are rotated and scaled between each frame.
Feature Test 2: Color Fill
This test draws frames by drawing a rectangle across the screen multiple times. The color and alpha channels of each corner of the rectangle is animated. The pixel shader is pass-through. The interpolated color is written directly to the render target using alpha blending. The render target is in 16-bit floating-point format, currently the most relevant format for HDR rendering output.
Feature Test 3: Parallax Occlusion Mapping (Complex Pixel Shader)
This test draws frames by rendering a single rectangle (two triangles) on screen, seen from an animated camera position. The pixel shader uses the Parallax Occlusion Mapping technique to simulate complex geometry under the surface of the rectangle. Heavy ray-tracing operations against a huge depth-map determine the actual intersection of the view ray with the geometry. Further ray-tracing determines visibility of that point from multiple animated light sources. Finally, the surface is shaded using the relatively complex Strauss shading model.
This test represents a very complex, heavy pixel shader, containing massive amounts of texture reads (ray-tracing) and dynamic flow-control (ray-tracing, looping over multiple lights), as well as traditional lighting calculations (Strauss). All the geometry on screen is rendered on just two triangles, and simulated entirely in the pixel shader.
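To make the "ray-tracing against a depth map" step concrete, here is a minimal CPU-side sketch of the linear search at the heart of Parallax Occlusion Mapping. It is a simplification under stated assumptions: the depth map is one-dimensional, sampling is nearest-neighbor, only the primary view-ray intersection is shown (no light visibility rays or Strauss shading), and the names `depth_at` and `march` are hypothetical.

```python
def depth_at(depth_map, u):
    """Nearest-sample lookup into a 1D depth map; u is clamped to the map."""
    index = min(int(u * len(depth_map)), len(depth_map) - 1)
    return depth_map[max(index, 0)]


def march(depth_map, u_start, du_per_depth, steps=64):
    """Step a view ray downward until it is at or below the simulated surface.

    u_start       texture coordinate where the ray enters the flat rectangle
    du_per_depth  how far the ray moves in u per unit of depth (view angle)
    Depth values are expected in [0, 1], with 0 at the rectangle's surface.
    Returns (u, depth) of the first sample at or below the surface.
    """
    u = u_start
    for i in range(steps + 1):
        ray_depth = i / steps
        u = u_start + du_per_depth * ray_depth
        if ray_depth >= depth_at(depth_map, u):
            return u, ray_depth
    return u, 1.0  # fallback: treat the deepest layer as the hit


# A shallow ledge, a deep pit, then a medium shelf.
depth_map = [0.2, 0.2, 0.8, 0.8, 0.8, 0.3, 0.3, 0.3]
print(march(depth_map, 0.1, 0.5))  # slanted ray entering over the ledge
print(march(depth_map, 0.3, 0.0))  # ray looking straight down into the pit
```

Each loop iteration is one texture read, which is why a heavy depth map and many steps per pixel add up to the load the description calls out.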
Feature Test 4: GPU Cloth
This test features physical simulation of cloth on the GPU. The simulation is performed as a vertex simulation using a combination of vertex shader and geometry shader stages, with several simulation passes needed for each simulation step. Stream out is used to cycle the cloth vertices from one simulation pass to the next. This test stresses the vertex shading, geometry shading and stream out features of the hardware.
Feature Test 5: GPU Particles
This test features physically simulated particle effects on the GPU. The simulation is performed as a vertex simulation, with each vertex representing a single particle. Stream out is used to cycle the particle vertices from one simulation pass to the next.
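The "cycle the vertices from one pass to the next" pattern can be illustrated on the CPU as a buffer that is read, updated, and written to a fresh buffer on every pass. This is only a sketch of the data flow: the gravity value, time step, ground bounce, and function names are assumptions for the example, and no actual stream out or shader code is involved.

```python
GRAVITY = -9.8  # assumed constant acceleration along y
DT = 1.0 / 60   # assumed fixed simulation step


def simulate_pass(particles):
    """One simulation pass: read every particle, emit its updated copy."""
    out = []
    for (x, y, vx, vy) in particles:
        vy += GRAVITY * DT
        x += vx * DT
        y += vy * DT
        if y < 0.0:  # bounce off the ground plane, losing some energy
            y, vy = -y, -vy * 0.5
        out.append((x, y, vx, vy))
    return out


def run(particles, passes):
    """Cycle the buffer: each pass's output becomes the next pass's input."""
    current = particles
    for _ in range(passes):
        current = simulate_pass(current)
    return current


start = [(0.0, 1.0, 1.0, 0.0), (0.5, 0.1, 0.0, -2.0)]
for x, y, vx, vy in run(start, 120):
    print(f"x={x:.2f} y={y:.2f}")
```

Each tuple stands in for one vertex, matching the description's one-vertex-per-particle layout.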
Feature Test 6: Perlin Noise (Math-heavy pixel shader)
This test features multiple octaves of Perlin noise evaluated in the pixel shader. Each color channel has its own noise function for added computational load. The Perlin noise function is a standard building block of procedural texturing approaches, and is very math-intensive to compute in a pixel-shader. This feature test emphasizes the arithmetic computing power of the graphics hardware.
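As a rough illustration of why this is math-heavy, the sketch below sums several octaves of gradient noise and evaluates a separate noise function per color channel. It is a simplified stand-in, not the benchmark's shader: it uses 1D noise instead of 2D or 3D, an arbitrary integer hash for the gradients, and an assumed octave count with frequency doubling and amplitude halving.

```python
import math


def fade(t):
    """Perlin's fade curve: 6t^5 - 15t^4 + 10t^3."""
    return t * t * t * (t * (t * 6 - 15) + 10)


def gradient(i, seed):
    """Pseudo-random gradient in [-1, 1] for lattice point i (arbitrary hash)."""
    h = (i * 374761393 + seed * 668265263) & 0xFFFFFFFF
    h = ((h ^ (h >> 13)) * 1274126177) & 0xFFFFFFFF
    return (h / 0xFFFFFFFF) * 2.0 - 1.0


def perlin1d(x, seed=0):
    """1D gradient noise: blend the two neighbouring lattice gradients."""
    i = math.floor(x)
    t = x - i
    left = gradient(i, seed) * t
    right = gradient(i + 1, seed) * (t - 1.0)
    return left + (right - left) * fade(t)


def octaves(x, count=4, seed=0):
    """Sum `count` octaves, doubling frequency and halving amplitude each time."""
    total, amplitude, frequency = 0.0, 1.0, 1.0
    for _ in range(count):
        total += amplitude * perlin1d(x * frequency, seed)
        amplitude *= 0.5
        frequency *= 2.0
    return total


def pixel(x):
    """One noise function per colour channel, mimicking the per-channel load."""
    return tuple(octaves(x, seed=channel) for channel in range(3))


print(pixel(0.37))
```

Even in this reduced form, one pixel costs three channels times four octaves of hashing, multiplication, and polynomial evaluation, with no texture reads at all.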
Intel Core 2 Extreme QX9770 (3.2GHz)
Gigabyte X48T-DQ6 Intel X48 Motherboard
EVGA nForce 790i Motherboard
4GB OCZ Technology DDR3-1800 CAS7 Memory (4x1GB)
NVIDIA GeForce 9800 GX2
NVIDIA GeForce 9800 GTX
NVIDIA GeForce 8800 Ultra
NVIDIA GeForce 8800 GTX
NVIDIA GeForce 8800 GTS 512MB
NVIDIA GeForce 8800 GT 512MB
NVIDIA GeForce 9600 GT
NVIDIA GeForce 8600 GTS 256MB
NVIDIA GeForce 8600 GT 256MB
NVIDIA GeForce 8400 GS
NVIDIA GeForce 8500 GT
AMD Radeon HD 3870 X2
AMD Radeon HD 3870
AMD Radeon HD 3850 512MB
AMD Radeon HD 3650 256MB
AMD Radeon HD 3450
Catalyst 8.471.1 Beta
Western Digital Raptor 150GB
Windows Vista Ultimate 64-bit w/Service Pack 1
Both NVIDIA and AMD supplied us with beta drivers for testing with 3DMark Vantage. For the most part, the drivers worked well, although we couldn’t get CrossFire to run with complete stability in Vantage for the 3850 and 3870 cards, and we also had problems loading Windows Vista with the 9800 GX2 Quad and ForceWare 175.12.
3DMark Vantage also confirms what we told you about the GeForce 9800 GTX earlier this month. While its “9800” designation suggests a significant improvement over the 8800s, it’s debatable whether it’s much of an improvement over the GeForce 8800 GTX in pure 3D performance. Its G92 GPU does bring other improvements, such as better video and HybridPower, that make it a compelling upgrade if you never made the jump to the GeForce 8800 line.
We’re not sure whether the new presets are a good thing. Part of the premise behind them is that they will make the benchmark more attainable for a wider variety of hardware. The Entry preset is designed for lower-end cards with 128MB of memory and integrated GPUs, while the Performance preset is for mainstream cards with 256MB-512MB of memory. The High and Extreme presets are designed for high-end cards with 512MB of memory or more, as well as multi-GPU setups like 3-Way and Quad SLI and CrossFire X. The downside of the new preset system is that all the different Vantage scores could lead to more confusion. A perfect example of this is the 9600 GT example we just showed you above. Do you go with the overall result for the “Performance” or the “High” preset? The answer is probably going to depend on how you game. If you’re an eye candy guy who wants all the visuals, the “High” preset should be given more emphasis, but if you crave frame rate above all else, the “Performance” preset is probably more important to you.
We also really miss having a built-in image quality test feature. The scenes in the Calico test aren’t ideal for testing aspects like AA, but the Jane Nash game test could probably work.
Overall, the introduction of 3DMark Vantage is a good thing for the industry in our opinion, although we don’t think Vantage will be as influential as previous iterations of 3DMark (3DMark 2001 definitely comes to mind). Had Futuremark released this benchmark a year ago, it certainly would have been more significant, but nowadays reviewers have a number of actual, shipping games to choose from when it comes to benchmarking, and real games always trump synthetic benchmarks in our book.
© Copyright 2003 FS Media, Inc.