NVIDIA's SLI Technology
June 28, 2004
Summary: 3dfx's Voodoo2 SLI was the pinnacle of graphics performance when it was released. The premise was simple: combine two Voodoo2 boards for twice the pixel-pushing horsepower. Since then, competing technologies have attempted to follow up SLI but have fallen short in one area or another. Now NVIDIA is back with its PCI Express-based Scalable Link Interface, and the concept looks like a winner. Read all about it in this article!
| Introduction||Page:: ( 1 / 5 )|
The feature that really caught everyone's eye with Voodoo2, however, was its support for SLI, or scan-line interleaving. By combining two Voodoo2 graphics cards, performance nearly doubled, and a new resolution was opened up: 1024x768.
SLI worked by splitting the workload in half: one Voodoo2 card rendered the even lines on the screen, while the second card rendered the odd lines. The cards were linked by a pass-through cable, with the only requirement being that the cards share the same configuration and manufacturer.
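To make the division of labor concrete, here's a minimal sketch of the even/odd scanline assignment in Python. The assign_scanlines helper is our own illustration, not 3dfx's actual implementation:

```python
# A minimal sketch of 3dfx-style scan-line interleaving: each card is
# responsible only for the scanlines assigned to it. Card numbering and
# the helper below are hypothetical, for illustration only.

def assign_scanlines(height: int) -> dict[int, list[int]]:
    """Card 0 takes the even scanlines, card 1 the odd ones."""
    return {card: [y for y in range(height) if y % 2 == card]
            for card in (0, 1)}

lines = assign_scanlines(768)        # 1024x768, the resolution SLI opened up
print(len(lines[0]), len(lines[1]))  # 384 384 -- an even split by construction
```

Because adjacent scanlines tend to have similar complexity, an even/odd split is naturally load-balanced, which is one reason the scheme worked as well as it did.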
Critics dismissed SLI as impractical. Voodoo2 boards were initially tough to find and priced at $300, well out of the budget of most consumers. Also keep in mind that Voodoo2 was a 3D-only solution: after forking over $600 for an SLI config, you still needed a third graphics card to handle 2D duties. This created a new super high-end segment of the 3D graphics market, above the $300 price point that had been established with Voodoo Graphics and Voodoo2.
|<% print_image("02"); %>||<% print_image("03"); %>|
Despite all of this, the letters S-L-I really took off with gamers: even if you couldn't afford it, you still knew what it was and hoped to one day pair your Voodoo2 card with its twin. Hardcore gamers and hardware enthusiasts responded passionately to SLI; according to estimates, 30% of consumers picked up a second Voodoo2 card.
While 3dfx was cashing in on SLI, the rest of the industry was well into the transition from PCI to the faster AGP interface. Since none of the early AGP specifications provided for dual graphics configurations, graphics manufacturers had to come up with some pretty clever solutions to offer this capability.
Metabyte was first with their "Stepsister" technology, officially dubbed PGC (Parallel Graphics Configuration). In fact, we previewed this technology back in February 1999. PGC worked by splitting the workload into two pieces: the AGP card handled the top of the scene while the PCI card rendered the bottom. It wasn't a 50-50 split, though; since the AGP card offered a little more bandwidth, it could handle 60% or 70% of the work while the PCI card took care of the rest. Unfortunately for Metabyte, it took them a while to perfect the technology. By the time it was ready for the public, next-generation cards were being released for the AGP interface only, and no one was interested in a high-end PCI graphics card. PGC was also never entirely effective at load balancing: the top segment of the screen isn't always as complex as the lower half, and the static split never truly accounted for this. The technology was eventually acquired by Alienware and faded into obscurity.
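For contrast with 3dfx's approach, here's a rough sketch of PGC-style static partitioning. The 65% figure is purely illustrative, chosen from the 60-70% range mentioned above:

```python
# A sketch of Metabyte-style static screen partitioning: the AGP card
# renders a fixed share of scanlines from the top, the PCI card the rest.

def static_split(height: int, agp_share: float = 0.65):
    """Return scanline ranges for the AGP (top) and PCI (bottom) cards."""
    boundary = int(height * agp_share)
    return range(0, boundary), range(boundary, height)

top, bottom = static_split(768)
print(len(top), len(bottom))  # 499 269 -- the same split for every scene
```

Because the boundary never moves, a scene whose geometry is concentrated in the "wrong" region leaves one card coasting while the other struggles, which is exactly the load-balancing weakness that dogged PGC.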
| ATI/Alienware’s dual graphics products||Page:: ( 2 / 5 )|
The RAGE FURY MAXX
Metabyte wasn't the only company to dabble with multi-card technology. ATI came up with their own approach, only they combined it into one card: the RAGE FURY MAXX. This solution sidestepped the AGP/PCI bandwidth disparity by integrating two RAGE 128 PRO graphics cores onto a single card, but ATI still needed a way to divide the work between the two chips.
What ATI came up with was Alternate Frame Rendering (AFR). Rather than splitting up each frame as SLI did, AFR hands whole frames to the two RAGE 128 PRO cores in turn: each chip renders every other frame instead of a portion of the same frame.
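A minimal sketch of the frame-dispatch logic (our own illustration, not ATI's code):

```python
# Alternate Frame Rendering: whole frames, not portions of frames,
# are handed to the two chips in turn.

def afr_dispatch(frame_number: int) -> int:
    """Pick which of the two chips renders this frame."""
    return frame_number % 2

for frame in range(4):
    print(f"frame {frame} -> chip {afr_dispatch(frame)}")
# frame 0 -> chip 0, frame 1 -> chip 1, frame 2 -> chip 0, frame 3 -> chip 1
```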
ATI's approach was well implemented. Initially there were concerns about lag in fast-paced shooters like Quake 3, but in practice the technology worked as advertised for the most part (there were visible artifacts in a few applications). The RAGE FURY MAXX's biggest problem wasn't related to AFR; rather, it was the anemic performance of the underlying RAGE 128 PRO cores. Complicating matters, the RAGE FURY MAXX was behind the curve technologically as well, lacking hardware transform and lighting (T&L) support, which contributed to the card's disappointing performance. Fortunately, the one positive ATI took from the RAGE FURY MAXX was that its AFR technology could be applied to future products; in fact, many expected ATI to follow up the initial RAGE FURY MAXX with an AFR-based product that supported hardware T&L.
Ultimately, this never happened; the additional costs of adding a second graphics core and its corresponding memory were just too great to make a follow-up product feasible. (While the RAGE FURY MAXX was technically a 64MB board, the memory was split evenly between the two chips for an effective 32MB.) Others have attempted dual-chip boards since, most recently Sapphire with a dual R350 board they showed us at Computex last Fall; a lack of driver support ultimately killed that project.
In short, no product ever came close to matching 3dfx’s achievement with Voodoo2 SLI.
PCI Express transition/Alienware
For PCI Express, Intel sought to overcome the chief limitations of both PCI and AGP. PCI Express’ improved bandwidth and hot-swap capability are well documented features, but the new bus is also capable of supporting more than one graphics card.
Alienware was the first company to take advantage of this feature with their video array technology, first announced at E3 a few months ago. Like Metabyte's PGC, video array splits the scene into two pieces: one graphics card takes the upper portion of the environment, while the second card renders the lower section. Alienware's video merger hub (a third card) is responsible for managing the workload between the two cards, and no custom driver is required for operation.
Alienware has been somewhat vague on pricing and configuration details, saying only that X2 will be offered solely in their new line of ALX gaming systems and won't ship until Q3 or Q4 of this year. Alienware gaming rigs tend to be priced higher than many consumers can afford, with top-of-the-line systems often selling for over $3,000, so don't expect X2 to come cheap.
Fortunately, Alienware isn't the only firm that plans to take advantage of PCI Express' dual graphics capability. NVIDIA is also poised to bring a dual PCI Express solution of its own to market, and it has a familiar name: SLI!
| What is SLI?||Page:: ( 3 / 5 )|
NVIDIA's Scalable Link Interface
SLI works similarly to the rendering techniques employed by Alienware's X2 video array system and Metabyte's PGC technology in the sense that the graphics cards split the workload horizontally across the screen: one card takes the upper portion, while the second card takes the lower segment.
It’s important to note that the screen isn’t necessarily split exactly in half. As we discussed earlier, in some scenes the lower portion of the screen may be more complex than the upper portion, or vice versa. SLI is designed to take this into account, splitting the work properly between both cards.
|<% print_image("04"); %>||<% print_image("05"); %>||<% print_image("06"); %>|
Sounds simple, but exactly how does it work?
Once two PCI Express cards are installed in the system, NVIDIA SLI kicks in. Since Intel's Tumwater Xeon chipset (the only known Intel chipset with plans for dual PCI Express motherboards) provides for a total of only 24 lanes, one card operates in the full x16 PCI Express configuration and is designated the "master," while the second card runs in x8 mode and becomes the "slave."
|<% print_image("07"); %>||<% print_image("08"); %>|
The graphics driver then determines the workload for both cards depending on the scene. NVIDIA has developed its own patent-pending dynamic load balancing algorithms, which are crucial to ensuring that the workload is split appropriately between the two cards, with the master card getting a little more work than the slave board. The master card is also responsible for outputting the final image to your screen.
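NVIDIA hasn't published these algorithms, so the following is only a sketch of the general idea under one simple assumption: give more scanlines to whichever card finished its region faster last frame. The function, parameter names, and step size are all our own:

```python
# A sketch of dynamic load balancing for a horizontal screen split:
# if the master (top) card finished faster last frame, give it more
# scanlines next frame, and vice versa.

def rebalance(split_y: int, height: int,
              master_ms: float, slave_ms: float, step: int = 8) -> int:
    """Nudge the split line so the faster card takes on more work."""
    if master_ms < slave_ms:    # master finished first: enlarge its region
        split_y = min(height - step, split_y + step)
    elif slave_ms < master_ms:  # slave finished first: shrink the master's region
        split_y = max(step, split_y - step)
    return split_y

split = 384                     # start a 768-line frame at a 50/50 split
split = rebalance(split, 768, master_ms=9.5, slave_ms=12.1)
print(split)                    # 392 -- the master picks up a few more lines
```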
|<% print_image("09"); %>||<% print_image("10"); %>|
Both cards are linked via the multi-purpose I/O (MIO) port and physically connected to each other with an SLI connector. Unlike 3dfx's pass-through cable, which was analog, NVIDIA's SLI connection is completely digital. This ensures that the data sent between the two cards is consistent; with Voodoo2 SLI, color data sometimes didn't match correctly between the cards, and tearing would occasionally occur. We weren't given a number for the peak bandwidth of the link between the two cards, but the figure must be massive to ensure optimal performance.
|<% print_image("11"); %>||<% print_image("12"); %>|
NVIDIA's SLI isn't as simple as developing a special connector and throwing a driver around it all, either. NVIDIA also had to come up with its own inter-GPU communication protocol and dedicated scalability logic, which can be seen in die shots of the NV40 (GeForce 6800) GPU.
| SLI support, segmentation, and performance||Page:: ( 4 / 5 )|
Unfortunately, not every PCI Express NVIDIA graphics card will take advantage of SLI. NVIDIA's GeForce PCX line doesn't support the new technology; GeForce 6800 is the first to provide this capability.
|<% print_image("14"); %>||<% print_image("15"); %>|
NVIDIA is focusing on the $299-and-up segment of the 3D market; there are no current plans to bring SLI to the mainstream and value segments. This ensures that you can't purchase two $200 mainstream GeForce 6 boards to equal the capability of one $400 GeForce 6800 GT or a $500 GeForce 6800 Ultra. SLI also requires that both cards come from the same manufacturer and be based on the same configuration. For example, you can't combine a 128MB GeForce 6800 with a 256MB GeForce 6800 board, or a GeForce 6800 GT with a GeForce 6800 Ultra.
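As an illustration of the pairing rules, here's a trivial compatibility check; the Card type and its fields are our own invention, not an NVIDIA API:

```python
# Illustrative SLI pairing check: same manufacturer, same model,
# same memory configuration, per the requirements described above.

from dataclasses import dataclass

@dataclass
class Card:
    manufacturer: str
    model: str
    memory_mb: int

def sli_compatible(a: Card, b: Card) -> bool:
    return (a.manufacturer == b.manufacturer
            and a.model == b.model
            and a.memory_mb == b.memory_mb)

print(sli_compatible(Card("BrandX", "GeForce 6800 GT", 256),
                     Card("BrandX", "GeForce 6800 GT", 256)))  # True
print(sli_compatible(Card("BrandX", "GeForce 6800", 128),
                     Card("BrandX", "GeForce 6800", 256)))     # False: mixed memory
```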
Besides the GeForce 6800 line, NVIDIA also plans to bring SLI to the workstation market with its line of PCI Express-based Quadro cards. The concept is the same, only you’re dealing with Quadro FX 4000 instead of the GeForce 6800 series.
|<% print_image("16"); %>||<% print_image("17"); %>|
|<% print_image("18"); %>||<% print_image("19"); %>|
Other important details
SLI is completely invisible to the end user; no special driver is required for operation. In fact, SLI users will download the same driver as every other NVIDIA user. Likewise, games don't have to be programmed to take advantage of SLI, but some games will see larger gains than others. NVIDIA has witnessed improvements of up to 1.87 times over a single graphics card in 3DMark 03 (tested at 1600x1200 with 4xAA and 8xAF in game tests 2, 3, and 4) and in Epic's Unreal Engine 3 at 1024x768. We wouldn't be surprised if flight simulation titles, which tend to be more platform-bound than other genres, see smaller performance improvements.
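A back-of-the-envelope way to see why gains vary by title (our own model, not anything NVIDIA provided): only the GPU-bound share of frame time is split across the two cards, so the less GPU-bound a game is, the less a second card helps.

```python
# Amdahl's-law estimate of SLI scaling: the GPU-bound fraction of
# frame time is halved by the second card; the rest is untouched.

def sli_speedup(gpu_bound_fraction: float) -> float:
    return 1.0 / ((1.0 - gpu_bound_fraction) + gpu_bound_fraction / 2.0)

print(round(sli_speedup(0.93), 2))  # 1.87 -- heavily GPU-bound workload
print(round(sli_speedup(0.40), 2))  # 1.25 -- a platform-bound title gains far less
```

Under this model, NVIDIA's observed 1.87x corresponds to a workload that is roughly 93% GPU-bound, which squares with testing at 1600x1200 with 4xAA and 8xAF.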
To further improve performance, NVIDIA will also be seeding developers with hardware and providing its SLI algorithms to programmers so they can code for it. NVIDIA made it clear, however, that you won't see an exact 2x performance improvement, at least not with first-generation hardware.
It's also important to keep in mind that NVIDIA SLI isn't Alienware's previously announced X2 video array technology. While the underlying technique is somewhat similar, the execution is different: Alienware's X2 requires a third external card, the video merger hub, to communicate between the two cards, and it works with both ATI and NVIDIA hardware, including the GeForce PCX 5900.
NVIDIA is offering its SLI technology to all of its customers, including Alienware, so there’s nothing stopping Alienware from adopting the technology in their own systems.
| Final Thoughts||Page:: ( 5 / 5 )|
One question mark we have revolves around the SLI connector. As of press time, NVIDIA hasn't announced its plans for bundling this device. Hopefully it will ship with PCI Express GeForce 6800 boards once they hit retail, but it's also possible that you'll have to purchase it as a separate accessory. Considering the success of 3dfx's "The Power of Two" campaign, we're hoping NVIDIA bundles the connector with the card, just as 3dfx did with the pass-through cable on Voodoo2, even if that means the board ships with an extra pamphlet or two full of SLI marketing material. The final decision, however, is still up in the air.
NVIDIA's SLI technology could dramatically spice up the 3D market. With the GeForce 6800 already CPU-limited in many situations, gamers could instead crank up the AA to 8x and the AF to 16x without a dramatic performance hit, while next-generation titles could be played at maximum settings and high resolutions. The possibilities are limitless.
The real question mark will be infrastructure support. How many motherboard manufacturers will provide dual PCI Express motherboards, and in what quantities and at what prices? What will the power requirements look like? These are the types of questions that haven't been adequately answered yet. NVIDIA initially plans to control the situation by focusing on system builders first, but in order for SLI technology to really take off, it also needs to hit retail.
Until that occurs, NVIDIA will be demonstrating SLI at the Electronic Sports World Cup, Fragapalooza, and the CPL Championships next month, with SIGGRAPH and QuakeCon demonstrations in August. If you’re in the area of any of these events, you may want to check it out.
In any case, we're anxiously awaiting the arrival of SLI. If NVIDIA is able to deliver as promised, NVIDIA SLI should give gamers a compelling reason to upgrade to PCI Express, and they'll make sure that PCI Express card (or cards) is based on an NVIDIA GeForce 6800 series GPU!