
NVIDIA nForce4 SLI Intel Edition Performance Preview
April 05, 2005 Brandon Sandman Bell

Summary: After years of providing chipsets for AMD processors only, today NVIDIA officially unveils their first chipset for the Intel Pentium platform: the nForce4 SLI Intel Edition. As its name implies, this new chipset is based on the same fundamental technologies found in today's nForce4 SLI chipset for the Athlon 64, including GigE with firewall, Serial ATA II support, 7.1-channel audio, 10 USB ports, and of course, 20 lanes of PCI Express for SLI! But how does this new chipset perform? Find out in today's article!


Introduction






Every time the question was asked, NVIDIA executives always gave the politically correct answer of what a wonderful partner AMD was and how they were happy to be in the AMD ecosystem. Early on, NVIDIA was also quick to admit their inexperience at designing, and most importantly, producing chipsets in massive quantities; this was a problem that nagged them when the original nForce chipset and its follow-up part, nForce2, were first introduced. As far as NVIDIA was concerned, the AMD market was keeping them busy enough already, or at least that’s what they told us officially.

Today, after selling millions of nForce chipsets to the AMD market, it’s hard to argue against NVIDIA’s strategy.

[image]

Of course, another key reason why NVIDIA hasn’t produced an Intel chipset to date has nothing to do with chipset production or NVIDIA’s cozy relationship with AMD; the key hang-up between the two companies has been Intel’s asking price for a Pentium 4 bus license. Without a license for Intel’s front-side bus, NVIDIA couldn’t produce a Pentium 4 chipset without getting into legal trouble with Intel. In contrast to AMD, who licenses their bus for free (including HyperTransport), Intel enjoys royalties from the sale of any chipset that uses their bus to communicate with the P4 processor. And if you don’t own a license, Intel will vigorously defend their IP; just ask VIA. (NVIDIA gets around this issue on the Xbox because Microsoft holds a P3 bus license.)

Then, in November of last year, shortly after the first batch of nForce4 SLI systems and motherboards appeared, NVIDIA and Intel announced a broad cross-license agreement. According to the press release, the companies signed a “multi-year patent cross-license agreement spanning multiple product lines and product generations. Additionally, the companies signed a multi-year chipset agreement for NVIDIA to license Intel’s front-side bus technology. This will enable NVIDIA to deliver the NVIDIA nForce platform technology on Intel-based systems.” In other words, as a result of the agreement, NVIDIA finally had the green light to produce an Intel-based chipset.

The product we’re taking a look at today, NVIDIA’s nForce4 SLI Intel Edition, is the first product out of the gate from NVIDIA for the Intel market. With SLI support, it’s clearly targeted at one market: high-end desktop PCs.



New North Bridge

Traditional North Bridge/South Bridge architecture

When AMD integrated the memory controller on the Athlon 64 processor’s core itself, a huge amount of real estate was freed up within the chipset. NVIDIA used this space to integrate all the functions found in the nForce3/nForce4 chipsets into a single chip. By moving to a single chip design, latency is reduced. Single-chip designs also free up more space on the board for motherboard manufacturers.

One key downside for the single-chip architecture though is the rapidly changing pace of the hardware industry. If a chipset manufacturer wishes to integrate a new storage technology for instance, the manufacturer has to redesign the entire chipset. Under the more traditional North Bridge/South Bridge configuration, the manufacturer would only have to redesign the South Bridge. This helps the chipset manufacturer integrate new technologies into existing chipsets more easily. The multiple follow-up versions of the original nForce2 chipset are a perfect example of this.

Since the Pentium 4 CPU doesn’t feature an integrated memory controller, the onus is once again on NVIDIA to produce a dedicated high-performance memory controller for their nForce4 SLI Intel Edition chipset.

[image]


nForce4 SLI Intel Edition SPP

With dedicated chips for the North Bridge and South Bridge, NVIDIA has dusted off an old acronym from the nForce/nForce2 days for the North Bridge of the chipset, the SPP, otherwise known as the system platform processor.

The key component of the SPP is the memory controller. Like NVIDIA’s TwinBank memory controller for nForce/nForce2, the NF4 SLI Intel Edition SPP is a 128-bit dual-channel memory controller, consisting of two 64-bit memory controllers. The SPP’s memory controller only supports DDR2 memory. NVIDIA chose not to support DDR memory because “supporting legacy DDR memory required multiple design compromises that would affect system performance”.

DDR2 memory support is quite robust. Not only does the SPP support the same 400MHz and 533MHz memory modules as the 925XE chipset, it also one-ups 925XE by supporting 667MHz DDR2 memory speeds. At 667MHz, these modules are capable of supplying the processor with up to 10.6GB/sec of peak memory bandwidth, roughly 2.1GB/sec more than DDR2-533, making it more than capable of keeping the latest 3.46GHz and 3.73GHz Pentium 4 Extreme Edition processors fed with data.
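
The peak bandwidth figures come from simple arithmetic. Here is a rough sketch of it, assuming each 64-bit channel moves 8 bytes per transfer and both channels run in parallel. The function name and parameters are our own, purely for illustration.

```python
def peak_bandwidth_gb_s(transfers_mhz: float, channels: int = 2,
                        bus_bits: int = 64) -> float:
    """Theoretical peak bandwidth in GB/sec (1 GB = 1000 MB here)."""
    bytes_per_transfer = bus_bits / 8          # 64-bit channel -> 8 bytes
    return transfers_mhz * bytes_per_transfer * channels / 1000.0


# Compare the three DDR2 speeds the SPP supports.
for speed in (400, 533, 667):
    print(f"DDR2-{speed}: {peak_bandwidth_gb_s(speed):.1f} GB/sec")
```

For DDR2-667 the result lands within rounding of the 10.6GB/sec quoted above. These are theoretical ceilings, and real-world throughput will be lower.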

To achieve low latencies at high clock speeds, NVIDIA has added a number of enhancements to the SPP. For instance, in the NF4 SLI Intel Edition SPP, NVIDIA provides a dedicated address and command bus for each DIMM, rather than sharing busses across multiple DIMMs. NVIDIA claims that this allows them to hit high clock speeds with 1T address timing, reducing memory latency and thus improving performance.

[image]


Third generation DASP

First introduced in NVIDIA’s original nForce chipset, the Dynamic Adaptive Speculative Pre-Processor (DASP) acts as a data prefetch unit for the nForce4 SPP itself. If you’re not familiar with data prefetching, the concept is simple: the DASP intelligently looks for regular access patterns in memory access, predicts which data will be necessary next, and fetches and places that data inside its cache before it's actually needed. Once the CPU requests the data, it is available for the processor immediately, reducing system latency dramatically.
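
NVIDIA hasn’t published how DASP works internally, so we can’t show the real thing. To make the general idea concrete, here is a toy stride prefetcher of the textbook variety. The class and its fields are hypothetical and are not a model of DASP.

```python
class StridePrefetcher:
    """Toy stride prefetcher: a generic illustration, not NVIDIA's DASP."""

    def __init__(self):
        self.last_addr = None     # previous address seen
        self.last_stride = None   # distance between the two previous accesses
        self.cache = set()        # addresses fetched ahead of time
        self.hits = 0

    def access(self, addr: int) -> bool:
        """Record a memory access; return True if it had been prefetched."""
        hit = addr in self.cache
        if hit:
            self.hits += 1
        if self.last_addr is not None:
            stride = addr - self.last_addr
            # Same non-zero stride twice in a row: predict the next address.
            if stride != 0 and stride == self.last_stride:
                self.cache.add(addr + stride)
            self.last_stride = stride
        self.last_addr = addr
        return hit


pf = StridePrefetcher()
results = [pf.access(a) for a in range(0, 64 * 8, 64)]  # regular 64-byte walk
print(results, pf.hits)
```

With a regular access pattern the sketch needs a few accesses to spot the stride, after which every request is already waiting in the cache.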

For nForce4 SLI’s SPP, NVIDIA has overhauled the data prefetcher for use with Intel CPUs, specifically keeping improvements in CPU prefetcher design in mind.




North Bridge (cont’d)

QuickSync


Each of the SPP’s twin memory controllers can support up to two DIMMs, for a maximum of four DIMMs on the motherboard. If 4GB DIMMs are used, the chipset can support up to 16GB of memory. Another interesting aspect of the SPP’s memory controller is that it can run in “ganged” mode, allowing it to act as one 128-bit memory controller even when the DIMM sockets are populated with memory modules of different configurations.

[image]


Besides the memory controller, the other main feature of the nForce4 SLI SPP is its PCI Express subsystem. Like the nForce4 SLI chipset for the AMD platform, NVIDIA provides twenty lanes of PCI Express for the Intel Edition of the chipset. Sixteen of these lanes are dedicated to the graphics processor, with three lanes reserved for the x1 PCI-E slots.

Just like the original nForce4 SLI, when running in SLI mode on the Intel Edition chipset, the two x16 graphics slots operate as two x8 slots for optimal performance. The uppermost card is designated as the “Master” card, while the graphics card beneath it is slaved to it. In testing, NVIDIA found this configuration yielded the best performance.

Of course, if the second card is removed, or if you don’t have two SLI-ready NVIDIA graphics cards, the chipset will devote all sixteen PCI Express lanes to the primary PCI-E slot.

Processor support/FSB

nForce4 SLI Intel Edition supports all of the latest Intel processors, including Intel’s dual-core “Smithfield” chips, such as the Pentium Extreme Edition 840, which was first introduced on Monday. Front-side bus speeds up to 1,066MHz are also fully supported by the chipset.

It is in this regard that NVIDIA engineers appear to be particularly proud. Over the past few weeks we’ve been told on multiple occasions just how scalable the nForce4 SLI Intel Edition was designed to be, with slides cryptically referring to bus speed support of “1,066MHz FSB and beyond”.

However, while dual-core processors are supported, we were told that the chipset doesn’t support SMP. Instead NVIDIA is reserving dual processor support for a workstation-class part, akin to the nForce Professional family for servers and workstations on the AMD platform.

In terms of complexity, the SPP is fairly impressive, being built on a 0.13-micron manufacturing process and containing a whopping 61 million transistors! This is just two million transistors shy of NVIDIA’s GeForce4 GPU.



The MCP

[image]


Paired alongside the SPP is NVIDIA’s nForce4 SLI Intel Edition Media Communications Processor, otherwise known as the MCP. The MCP handles the traditional I/O and storage duties of a conventional South Bridge with aplomb, sporting such features as support for Serial ATA II (3Gb/sec), GigE with a built-in hardware-based Firewall, 7.1 audio, and more. Ironically enough, the pathway between the MCP and the SPP is none other than AMD’s own HyperTransport link.

Storage

As we just mentioned, the nForce4 SLI Intel Edition features robust storage support. The MCP utilizes dual controllers providing support for up to four Serial ATA hard drives (or four conventional IDE drives) and also supports native command queuing for improved disk performance.

All the cool features found in the storage subsystem of NVIDIA’s AMD platform are migrated over for the Intel chipset, including MediaShield Disk Alert, which can not only warn you when a drive fails, but also tells you exactly which Serial ATA connector the failed drive resides on. You’ve also got cross-controller RAID support, which allows you to build a RAID array consisting of both Serial ATA and parallel ATA hard drives, and RAID morphing, a feature which allows you to set up and build a new RAID array without having to format your hard drive and reinstall all your old applications.

Speaking of RAID, the chipset supports RAID Levels 0, 1, 0+1, and bootable RAID 5, as well as JBOD.
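
If you’re weighing these RAID levels against each other, usable capacity is one quick way to compare them. The sketch below uses the standard capacity rules for each level and assumes identical drives. The function name is ours, and nothing here comes from NVIDIA’s MediaShield software.

```python
def usable_capacity_gb(level: str, drives: int, size_gb: float) -> float:
    """Usable capacity of an array of identical drives, in GB."""
    if level in ("0", "JBOD"):
        # Striping and spanning both expose every drive's full capacity.
        return drives * size_gb
    if level == "1":
        if drives < 2:
            raise ValueError("RAID 1 needs at least two drives")
        # A mirror stores the same data on every drive.
        return size_gb
    if level == "0+1":
        if drives < 4 or drives % 2:
            raise ValueError("RAID 0+1 needs an even number of drives, at least four")
        # Half the drives hold the mirror copy of the stripe set.
        return drives // 2 * size_gb
    if level == "5":
        if drives < 3:
            raise ValueError("RAID 5 needs at least three drives")
        # One drive's worth of space is consumed by parity.
        return (drives - 1) * size_gb
    raise ValueError(f"unsupported level: {level}")


# Example: four 250GB drives (an illustrative size) at each supported level.
for level in ("0", "0+1", "5", "JBOD"):
    print(level, usable_capacity_gb(level, 4, 250))
```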

Networking and audio

Like Intel’s chipsets, NVIDIA’s MCP features Gigabit Ethernet support. A dedicated 2Gbps bi-directional link is used, maximizing the full potential of the controller. NVIDIA’s familiar hardware-based firewall (ActiveArmor) is also present, protecting your PC from spyware and hackers.

On the audio side of the equation, the nForce4 SLI Intel Edition supports 7.1 channel audio via an AC’97 CODEC. It’s important to note that this isn’t Soundstorm, nor is it Intel’s High Definition “Azalia” audio. According to NVIDIA, motherboard manufacturers didn’t want Azalia audio support integrated into the chipset in order to keep costs down. They also argued that most enthusiasts who will be purchasing an nForce4 SLI Intel Edition motherboard will likely already have a high-end dedicated sound card. Considering that Audigy2 cards can be easily found for well under $100, this argument is definitely true, but it’s likely to come as a disappointment to those of you who are still hoping for Soundstorm to make a return someday.

Besides the storage and networking duties, other I/O features supported by the MCP include up to 10 USB 2.0 ports, and 5 PCI slots. The chip is built on a 0.15-micron manufacturing process and contains 21 million transistors.


Infrastructure support


According to NVIDIA, the first nForce4 SLI Intel Edition motherboards should hit retail sometime around the end of this month, with retail prices hovering in the $200 range, just as you saw with AMD-based SLI motherboards when they first launched. NVIDIA expects that as additional partners release their boards, prices will slowly drop to the $160-$190 price range most AMD-based SLI motherboards sell for today.

Memory

Launching alongside the new chipset and motherboards is NVIDIA’s new validation program for DDR2 memory, covering modules that have been tested and validated on the platform at speeds of up to 667MHz. (Specifically, compliant modules must surpass commodity JEDEC memory requirements and comply with minimal targeted performance levels, including 667MHz clock speeds with memory timings of 4-4-4-12-2T.)

Since a JEDEC spec for DDR2-667 doesn’t exist yet, the validation program is the best way for enthusiasts who plan on building nForce4 SLI Intel Edition systems to ensure that their memory modules will work properly with their motherboard at high clock speeds. Modules that have been approved will carry a special “Recommended for NVIDIA nForce4 SLI Intel Edition” badge. We used these Corsair 667MHz modules for our testing. These modules should hit retail later this month alongside the nForce4 motherboards.

Power

In terms of power required, we were told that NVIDIA’s general recommendation for the nForce4 SLI Intel Edition, as well as the AMD-based SLI platform, was a 550-watt power supply. In particular, the new nForce4 SLI Intel SPP and MCP consume about twice as much power as the CK8-04 chipset used in AMD platforms: 30 watts in the Intel chipset versus 15 watts in CK8-04. You’ll also need to take into account the higher power consumption of Intel processors. For instance, the 3.73GHz Pentium 4 Extreme Edition consumes about 119 watts. In comparison, the Athlon 64 FX-55 dissipates 105 watts.

When you factor this in with the additional power requirements of the Intel chipset, you’ll need an additional 29 watts for the Intel system in the example above.
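
For the curious, here is how that 29-watt figure falls out of the numbers quoted above. The dictionary layout and function name are just our own bookkeeping.

```python
# Figures quoted in the text above, in watts.
INTEL = {"chipset": 30, "cpu": 119}   # nForce4 SLI Intel Edition + 3.73GHz P4 EE
AMD = {"chipset": 15, "cpu": 105}     # CK8-04 + Athlon 64 FX-55


def extra_watts(a: dict, b: dict) -> int:
    """Return how many more watts platform `a` draws than platform `b`."""
    return sum(a.values()) - sum(b.values())


print(extra_watts(INTEL, AMD))  # prints 29
```

Keep in mind this only covers the chipset and CPU. Graphics cards, drives, and fans add to the total on both platforms.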

Setup

If you’re familiar with the procedure for setting up SLI on NVIDIA’s AMD platform, the Intel SLI chipset will be pretty familiar. Simply orient the SLI selector card into the motherboard for dual GPU configuration, plug in two SLI-ready NVIDIA graphics cards (GPUs supported include the GeForce 6600 GT, 6800, 6800 GT, and 6800 Ultra), and connect them with the SLI connector that shipped with your motherboard.

Once the hardware is in place, boot up your system and install the latest ForceWare drivers from NVIDIA’s website. After a quick reboot, you can then check the “Enable SLI MultiGPU” box in the driver control panel, followed by another reboot, and you’re up and running with SLI!

If you recall, NVIDIA’s SLI implementation can run in one of two modes: alternate frame rendering, or split frame rendering.

Split frame rendering mode works just as it sounds – the card splits the workload horizontally across the screen. One card takes the upper portion while the second card takes the lower segment. The frame buffer data is then combined and sent to the monitor. It’s important to note that SFR doesn’t necessarily split the screen directly down the middle; in some scenes the lower portion of the screen may be more complex than the upper portion, or vice versa. NVIDIA has developed custom load-balancing algorithms that are designed to take this into account, and split the screen appropriately – if one GPU takes longer to render, no problem, the driver just gives that GPU less work to do.
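
NVIDIA hasn’t shared the details of its load-balancing algorithm, but the basic feedback idea is easy to sketch. The function below is our own simplified illustration, not the ForceWare driver’s logic: it nudges the split line toward whichever GPU finished its portion faster. The gain and clamp values are arbitrary assumptions.

```python
def rebalance_split(split: float, top_ms: float, bottom_ms: float,
                    gain: float = 0.5) -> float:
    """Return a new split position for the next frame.

    `split` is the fraction of screen height given to the top GPU.
    If the top GPU took longer than the bottom one, its share shrinks,
    and vice versa. `gain` damps the adjustment.
    """
    total = top_ms + bottom_ms
    if total == 0:
        return split
    # Positive imbalance means the top GPU was the slower one.
    imbalance = (top_ms - bottom_ms) / total
    new_split = split - gain * imbalance * split
    # Always leave each GPU at least a sliver of the screen.
    return min(max(new_split, 0.05), 0.95)


# Top half took 12ms, bottom half 8ms: the top GPU gets less of the next frame.
print(rebalance_split(0.5, top_ms=12.0, bottom_ms=8.0))
```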

In alternate frame rendering mode, each graphics card handles the entire frame in a scene, with the cards splitting up the workload on a frame-by-frame basis. In other words, graphics card one will handle frame number one, while graphics card two tackles frame two. The only downside to AFR is the perceived “lag” that may be felt in some fast-paced games, although we haven’t seen this, nor have we seen any reports of current SLI enthusiasts who have run into this problem.
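
The frame-by-frame hand-off in AFR boils down to a round-robin assignment. As a minimal sketch, using a hypothetical helper of our own that numbers cards and frames from one:

```python
def afr_card_for_frame(frame_number: int, card_count: int = 2) -> int:
    """Return the 1-based card that renders a given 1-based frame number."""
    return (frame_number - 1) % card_count + 1


# Frames 1 through 6 alternate between card one and card two.
schedule = [(frame, afr_card_for_frame(frame)) for frame in range(1, 7)]
print(schedule)
```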


nTune

With each new release, NVIDIA’s nTune software continues to get better and better. Originally known as the System Utility software, nTune provides the traditional hardware monitoring capability you’d expect from a third-party application, monitoring such critical aspects as CPU and system temperatures, fan speeds, and voltages, but goes many steps beyond that by also providing built-in tools for overclocking the front side bus and memory bus, tweaking memory timings, and even overclocking your graphics card!

[image]


That’s right, if you have a newer GeForce FX or better graphics card installed inside your system, nTune can be used to automatically overclock your graphics card without having to resort to using the Coolbits registry hack, or other third-party software applications.

With nTune’s profiles, you can automatically tune your system specifically for the best memory performance, or if you deal with large databases, the system can be optimized for best disk performance. Gamers will of course select the “Best Graphics Performance” setting, while those of you who like to watch movies on your PC will want to select “Silent Tuning”.

[image]

One new feature NVIDIA has added to nTune recently is dynamic application sensing. With this feature, nTune automatically recognizes what program you’re loading up, then loads the appropriate profile. Say, for instance, you load up your DVD playback software: nTune can detect PowerDVD loading up and automatically load the profile for silent settings. Then, once you launch a game, nTune can sense that, cranking up your system settings for maximum performance.
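
Conceptually, application sensing is a lookup from the program being launched to a stored profile. The sketch below is our own illustration of that idea, not nTune’s actual implementation. The executable names and the “Default” fallback are assumptions, while the two profile names are the ones nTune exposes.

```python
DEFAULT_PROFILE = "Default"  # assumed fallback name, for illustration only

# Hypothetical mapping from executable name to nTune profile name.
PROFILE_RULES = {
    "powerdvd.exe": "Silent Tuning",
    "doom3.exe": "Best Graphics Performance",
}


def profile_for(executable: str) -> str:
    """Pick the profile for a launching program, ignoring letter case."""
    return PROFILE_RULES.get(executable.lower(), DEFAULT_PROFILE)


print(profile_for("PowerDVD.exe"))
print(profile_for("notepad.exe"))
```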

[image]

For manual overclocking, nTune provides settings for adjusting your graphics card speed, the chipset itself, memory timings, CPU and memory voltages, and fan settings, saving the end user lots of time that would otherwise be spent fiddling in BIOS. Once it’s done optimizing your system, nTune can then benchmark your system based on graphics, disk, and memory performance, where you can then compare your performance both before and after the modifications, as well as compare your system to NVIDIA’s baseline configuration, which is based on a similar system configuration.

[image]

The real beauty of nTune though is its size. For an application that performs so many functions, you’d expect it to take up a huge amount of system resources. Fortunately, this isn’t the case at all, as we’ve found that nTune has only a small memory footprint.

Of course, the final nTune implementation depends on your particular motherboard manufacturer. If a manufacturer wishes, they can disable certain features to prevent end users from damaging their hardware (or some features just may not work properly, for instance fan speed control), or they may skip support for nTune altogether. Since this is a product targeted at enthusiasts though, hopefully all nForce4 SLI Intel Edition motherboard manufacturers will get onboard and provide full support for nTune in their retail nForce4 products.



Test systems

System Setup


Intel Pentium 4 Extreme Edition 3.73GHz

NVIDIA nForce4 SLI Intel Edition reference motherboard

1GB Corsair XMS2 DDR2-667

MSI NX6800 GeForce 6800
NVIDIA GeForce 6800 GT reference card PCI-E
EVGA GeForce 6600 GT PCI-E
Driver version 71.84 final

250GB Maxtor Maxline III SATA Hard Drive w/16MB Cache

Windows XP Professional SP1

DirectX 9.0c

Benchmarks

Lock On: Modern Air Combat (Mig-29 custom demo)
IL-2 Sturmovik: Forgotten Battles (The Black Death track)
Pacific Fighters (kamikaze demo)
Far Cry 1.3
DOOM 3 (gameplay custom demo)
Chronicles of Riddick



Lock On: Modern Air Combat

Lock On: Modern Air Combat – Direct3D





Lock On: Modern Air Combat Performance 1280x1024
Card                   Min FPS   Max FPS
GeForce 6800 GT SLI    26        54
GeForce 6800 GT        28        60
GeForce 6800 SLI       20        40
GeForce 6800           21        44
GeForce 6600 GT SLI    N/A       N/A
GeForce 6600 GT        N/A       N/A




IL-2 Sturmovik: Forgotten Battles

IL-2 Sturmovik: FB - OpenGL





IL-2 Sturmovik Performance 1280x1024
Card                   Min FPS   Max FPS
GeForce 6800 GT SLI    33        105
GeForce 6800 GT        30        108
GeForce 6800 SLI       33        101
GeForce 6800           24        62
GeForce 6600 GT SLI    32        96
GeForce 6600 GT        19        50




Pacific Fighters

Pacific Fighters - OpenGL





Pacific Fighters Performance 1280x1024
Card                   Min FPS   Max FPS
GeForce 6800 GT SLI    14        124
GeForce 6800 GT        15        124
GeForce 6800 SLI       10        117
GeForce 6800           10        118
GeForce 6600 GT SLI    8         100
GeForce 6600 GT        8         100





Far Cry Volcano

Far Cry – Direct3D





Far Cry Performance 1280x1024
Card                   Min FPS   Max FPS
GeForce 6800 GT SLI    56.2      189.5
GeForce 6800 GT        37.8      120
GeForce 6800 SLI       41.3      141.1
GeForce 6800           26.4      83.4
GeForce 6600 GT SLI    34.5      117.4
GeForce 6600 GT        23.4      67.5
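
To put the Far Cry Volcano numbers in perspective, the short sketch below works out how much SLI gains over a single card. It assumes each row above reads as a minimum followed by a maximum frame rate. The dictionary layout and function name are our own.

```python
# (min FPS, max FPS) per configuration, as read from the table above.
far_cry_volcano = {
    "GeForce 6800 GT": {"single": (37.8, 120.0), "sli": (56.2, 189.5)},
    "GeForce 6800":    {"single": (26.4, 83.4),  "sli": (41.3, 141.1)},
    "GeForce 6600 GT": {"single": (23.4, 67.5),  "sli": (34.5, 117.4)},
}


def sli_scaling(results: dict) -> dict:
    """Return SLI/single-card ratios for the min and max columns of each card."""
    return {
        card: tuple(round(sli / single, 2)
                    for sli, single in zip(r["sli"], r["single"]))
        for card, r in results.items()
    }


for card, (min_ratio, max_ratio) in sli_scaling(far_cry_volcano).items():
    print(f"{card}: {min_ratio}x min, {max_ratio}x max")
```

At this resolution, every configuration comes out somewhere between roughly 1.5X and 1.75X.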




Far Cry Training

Far Cry – Direct3D





Far Cry Performance 1280x1024
Card                   Min FPS   Max FPS
GeForce 6800 GT SLI    44.6      165.6
GeForce 6800 GT        35.2      94.2
GeForce 6800 SLI       35.2      112.7
GeForce 6800           25.9      65
GeForce 6600 GT SLI    27.7      96.3
GeForce 6600 GT        21.3      54.9




DOOM 3 High Quality

DOOM 3 – OpenGL







DOOM 3 Ultra Quality

DOOM 3 – OpenGL









Half-Life 2

Half-Life 2 – Direct3D







Riddick

Chronicles of Riddick







Splinter Cell: Chaos Theory

Splinter Cell: Chaos Theory – Direct3D





Chaos Theory Performance 1280x1024
Card                   Min FPS   Max FPS
GeForce 6800 GT SLI    41        145.2
GeForce 6800 GT        23.1      99.8
GeForce 6800 SLI       28.1      124.6
GeForce 6800           15.5      65.6
GeForce 6600 GT SLI    3.1       109.5
GeForce 6600 GT        13.5      57.2




2048x1536 Gaming-Shooters








2048x1536 Gaming-Flight sims






Conclusion


The Intel Edition of the nForce4 SLI chipset incorporates all the major features found on its AMD counterpart, including GigE with native Firewall, 10 USB ports, NVIDIA’s impressive storage subsystem, and even AMD HyperTransport, which links the North Bridge and the South Bridge together. The only difference is that NVIDIA has adapted the chipset for Intel’s Pentium processors; this includes adding a new memory controller with support for the latest DDR2-667 memory, and of course Intel’s 1066MHz FSB. From a features perspective, NVIDIA’s nForce4 SLI Intel Edition chipset is in a unique class of its own, even excluding SLI support.

But of course, no gamer willing to fork over $200 for a motherboard and another $400 or more for two graphics cards will forget about SLI. We witnessed performance gains that were comparable to the improvements seen on NVIDIA’s SLI platform for AMD users, sometimes on the order of just over 1.7X at 1600x1200 with 4xAA and 16xAF, but there were even multiple cases where we were pushing a 2X performance improvement under the same settings! Based on these kinds of results, clearly NVIDIA’s driver team has implemented quite a few performance enhancements inside their latest ForceWare release for SLI.

At the same time however, there’s still a lot of work to be done. NVIDIA currently boasts SLI support for over games, but this is a small selection of the overall gaming market. In addition, some of the titles on the existing list aren’t quite up to snuff. Chronicles of Riddick is a perfect example of this: with SLI enabled, stability is severely compromised, and when you do get numbers, they’re below those of a single-card configuration. We also ran into performance problems with the GeForce 6600 GT running in SLI mode in Splinter Cell: Chaos Theory. Performance would consistently begin to hitch in the same area of the timedemo, so we’re pretty sure the problem wasn’t overheating.

Speaking of overheating, this is one aspect you’ll definitely have to take into account when building an nForce4 SLI Intel Edition system, especially if you plan on outfitting your system with a fast processor. Under load with our Pentium 4 3.73GHz Extreme Edition CPU, the GeForce 6800 GT cards running in SLI mode would begin to overheat when running looped demos in Far Cry for more than 10 minutes, causing the system to crash and forcing a full reboot. We ultimately rectified the problem by removing the cover of the case our system shipped in (a CoolerMaster WaveMaster chassis with an NVIDIA case window on one side) so we could install an additional case fan to act as a blowhole, blowing cool air directly onto the graphics cards.

The problem is caused by inadequate airflow. NVIDIA’s cards feature ducted cooling designs. These coolers work great when they have a steady supply of fresh air, but with the secondary “slave” graphics card in the way, airflow to the primary “master” graphics card is constrained – it’s literally sucking up the hot air off the card below it! As a result, the primary card typically operates 5-10 degrees Celsius higher than the secondary card.

Teething problems aside though, NVIDIA’s SLI platform processors have brought quite a bit of excitement to the normally mundane chipset world. Obviously Intel recognizes that as well; why else would they sign with NVIDIA so quickly after years of dismissing them? If you’re a hardcore gamer looking to get as much performance out of the Intel platform as possible, a motherboard based on NVIDIA’s nForce4 SLI Intel Edition chipset should be at the top of your list of components to purchase.

© Copyright 2003 FS Media, Inc.