SSE Enhanced Games
The Pentium III doesn't even need SSE to succeed. Think about MMX. Even without a four-fold improvement in "multimedia applications," the Pentium II maintained Intel's dominance in the CPU market. The chip scaled beautifully to higher clock speeds, and that more than made up for the lack of signs of life on the MMX front. Still, it would be nice to see some 3DNow!-like improvements come to the Intel camp.
We had the opportunity to speak to some popular developers about the effects SSE will have on their respective games/software.
John Carmack, Lead Programmer, id Software on Quake 3: Arena
Most of the Katmai optimizations [for Quake 3] are in the OpenGL drivers. We may have some loops in the main code Katmai optimized, but it is a low priority. Because up to 75% of the execution time of the game is in the graphics driver, most of the burden of optimization is theirs. I know that Intel is working with ATI and Katmai on their drivers.
In theory, Katmai provides 4x the single precision floating point performance, but you would never see that on a real algorithm, let alone a full system level benchmark.
I believe that the driver guys are getting about a 25% total speedup with Katmai optimizations. Combined with the clock rate boost, that is a significant win.
A 25% performance increase is indeed a very good thing. Consider a game of Quake II running on an average gaming system at 40fps. A 25% boost could bring that up to 50fps, no trivial increase. What John Carmack says also fits in very well with the 3DNow! model - most of the optimizations from 3DNow! are actually in the OpenGL drivers. AMD engineers had some 3DNow! routines patched into Quake2.exe as well, but most of those optimizations were in the internal software renderer.
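To put those numbers side by side, here is a rough back-of-the-envelope sketch using Amdahl's law. It is our own illustration, not anything Carmack or Intel supplied. It assumes the full 75% driver share applies, that the 25% figure is a whole-game speedup, and that the rest of the game does not speed up at all. The function names are made up for this example.

```python
def overall_speedup(fraction, local_speedup):
    """Amdahl's law: whole-program speedup when `fraction` of the run time
    becomes `local_speedup` times faster and the rest is unchanged."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


def required_local_speedup(fraction, target_overall):
    """Invert Amdahl's law: how much faster must the optimized part run
    to reach `target_overall` for the whole program?"""
    remaining = 1.0 / target_overall - (1.0 - fraction)
    if remaining <= 0:
        raise ValueError("target is unreachable by speeding up this part alone")
    return fraction / remaining


DRIVER_FRACTION = 0.75  # assumption: the "up to 75%" driver share, taken at face value

# Frame rate after a 25% whole-game speedup, starting from 40fps.
boosted_fps = 40 * 1.25

# Best case if the driver really ran 4x faster (the theoretical SSE peak).
best_case = overall_speedup(DRIVER_FRACTION, 4.0)

# Driver speedup implied by a 25% whole-game gain.
implied_driver = required_local_speedup(DRIVER_FRACTION, 1.25)

print(f"boosted frame rate: {boosted_fps:.0f}fps")
print(f"whole-game speedup with a 4x driver: {best_case:.2f}x")
print(f"driver speedup implied by a 25% gain: {implied_driver:.2f}x")
```

Under those assumptions, even a perfect 4x driver would speed the whole game up by only about 2.29x, and a 25% overall gain corresponds to the driver itself running roughly 1.36x faster, well short of the theoretical peak.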
Quake III may be a unique case, however. As of now, very few other games show such drastic improvements from 3DNow! optimization. (You can read more feedback from developers on the next page.) The Quake games are highly FPU-intensive, so there was great benefit in optimizing them for the Pentium architecture's pipelined FPU. Unfortunately, this meant that the non-pipelined FPU of the K6 and K6-2 performed horribly in Quake II. When AMD added 3DNow! support to Quake II, the "pipeline" bottleneck was removed, and the K6-2 processor was able to slightly surpass the Pentium II in performance.
Now, what would happen if a game isn't FPU-intensive, or if 3DNow! technology were used in the Pentium II, which already has strong floating point performance? Not a whole lot. 3DNow! wouldn't be much of a factor, since it can only crunch numbers slightly faster than a standard Pentium II. In fact, a theoretical P2 with 3DNow! would likely perform 5-10% faster, similar to how the K6-2 currently runs 3DNow!-enhanced Quake II. The same should hold for SSE: what is a huge win for a CPU with a "weak" floating point unit is only a marginal gain for a processor whose floating point unit is already strong.
But as we all know, Quake isn't the only game in the world (it might just seem that way if you frequent our site). What do other game developers have to say about SSE?