Inside The Phenom
Quad-Core Dissected
AMD knows what victory tastes like. The original Athlon launch saw Intel scrambling for a foothold, and the Athlon 64 unveiling reaffirmed the green team’s dominance. However, Intel proved that it wasn’t the type of company to be held down when it showed off its Core micro-architecture, a design rooted in the mobile world and then polished for the desktop and enterprise spaces.
AMD has been on the defensive ever since. With Phenom, the company looked like it was going for Intel’s jugular. Built on a 65nm manufacturing process, Phenom puts four processing cores onto a single die, yielding what AMD likes to call native quad-core. In contrast, Intel’s quad-core chips use a pair of dual-core dies sharing real estate on the same package. According to AMD, its native approach gives you the benefit of communication between cores at full die speeds. The lingering question, of course, is whether a native quad-core design can save the day when clock-speed scaling issues are limiting headroom.
Each of the four cores comprising a Phenom processor has its own 128KB L1 cache and its own 512KB L2 cache. The four L2s empty into a shared 2MB L3 cache, which holds data that gets flushed from L2 and might need to be used later. All of that cache, together with the processor’s quadrupled execution resources, gives the chip a transistor count in excess of 460 million.
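For a sense of scale, the cache figures above can be totaled with a few lines of Python. This is only back-of-the-envelope arithmetic using the sizes quoted in this article; the constant and function names are ours, not AMD's.

```python
# Cache sizes as quoted in the article, in KB.
CORES = 4
L1_PER_CORE_KB = 128      # private to each core
L2_PER_CORE_KB = 512      # private to each core
L3_SHARED_KB = 2 * 1024   # one pool shared by all cores


def total_cache_kb(cores=CORES):
    """Sum the private L1 and L2 of every core, plus the single shared L3."""
    private = cores * (L1_PER_CORE_KB + L2_PER_CORE_KB)
    return private + L3_SHARED_KB


total = total_cache_kb()
print(f"Total on-die cache: {total} KB ({total / 1024} MB)")
```

That works out to 4.5MB of cache on the die, with 2MB of it shared between the cores.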
While much of that complexity is attributable to cache memory, AMD has also made significant improvements to the processing cores themselves to help bolster performance. For instance, the engine responsible for handling SSE operations is now 128 bits wide instead of 64 bits, so a 128-bit SSE operation can be executed in a single pass instead of being split in two. Similarly, instruction fetching increases from 16 bytes per cycle to 32 bytes. And data moves faster in and out of the L2 cache thanks to more bandwidth between cache and the northbridge.
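Why the wider SSE engine matters is easy to see with a toy model. The sketch below simply counts how many passes an operation needs through an execution unit of a given width. It is a deliberate simplification that ignores pipelining and scheduling, and the function name is hypothetical.

```python
import math


def passes_needed(operand_bits, engine_bits):
    """Number of passes to push one operand through an engine of a given width.

    Toy model only: real cores pipeline and schedule work in ways this ignores.
    """
    return math.ceil(operand_bits / engine_bits)


# A 128-bit SSE operation on the older 64-bit engine versus Phenom's 128-bit engine.
old_passes = passes_needed(128, 64)
new_passes = passes_needed(128, 128)
print(f"64-bit engine: {old_passes} passes, 128-bit engine: {new_passes} pass")
```

In this simplified view, the older design has to split each 128-bit operation into two halves, while Phenom handles it in one go.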
All of AMD’s chips already enjoy a significant advantage when it comes to communicating with RAM. An integrated memory controller goes a long way toward minimizing latency. Phenom’s memory controller is revamped to further speed up data transfers. Instead of the dual-channel, 128-bit controller the K8 architecture employed, Phenom splits the logic into a pair of 64-bit controllers that operate more efficiently. When you look at the boot screen of your Phenom-based platform and wonder why it’s reporting 2GB of memory at 64 bits when you clearly dropped your 1GB modules into separate channels that should total 128 bits, remember that the modules are running in a dual, unganged 64-bit arrangement. Don’t worry. That configuration is the one you’ll want to use for the best possible performance. On top of offering more granularity, Phenom’s memory controller also incorporates support for memory frequencies up to 1066 MHz.
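It is worth noting that splitting the controller does not change the theoretical peak. The quick calculation below compares one 128-bit interface with two 64-bit ones, assuming the 1066 MHz figure refers to 1066 million transfers per second (as with DDR2-1066). That reading, and the function name, are our assumptions.

```python
def peak_bandwidth_mb_s(bus_width_bits, mega_transfers_per_s):
    """Theoretical peak in MB/s: bytes per transfer times transfers per second."""
    return (bus_width_bits // 8) * mega_transfers_per_s


RATE_MT_S = 1066  # assumed: 1066 million transfers per second

ganged = peak_bandwidth_mb_s(128, RATE_MT_S)        # one 128-bit interface
unganged = 2 * peak_bandwidth_mb_s(64, RATE_MT_S)   # two independent 64-bit interfaces

print(f"Ganged:   {ganged} MB/s")
print(f"Unganged: {unganged} MB/s")
```

Both come out to the same number on paper. The benefit of the unganged arrangement is that the two 64-bit controllers can service separate requests independently, which is where the added granularity comes from.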
The Phenom drops into a brand new socket interface that isn’t all that new after all. Socket AM2+ is pin-compatible with the AM2 interface already in use. It adds support for a HyperTransport 3.0 interface between the CPU and northbridge, pumping up frequency from 1 GHz DDR to 1.8 GHz DDR. The resulting boost to bandwidth helps enable the PCI Express 2.0 links you’ll find on most Phenom-based platforms. You can drop a Phenom chip into an older AM2 board (with a new BIOS, of course), but you won’t get those HyperTransport 3.0 link speeds. You can also drop an AM2 processor into an AM2+ motherboard, with the same result. Ideally, though, you’d pair an AM2+ chip with an AM2+ motherboard. For our purposes today, we’re using AMD’s 790FX chipset, the flagship of the company’s core logic lineup.
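To put the HyperTransport jump described above in numbers, here is a rough estimate of per-direction link bandwidth at both clock speeds. The 16-bit link width is our assumption, since the width isn't specified here, and the function name is hypothetical.

```python
def ht_bandwidth_mb_s(clock_mhz, link_width_bits=16):
    """Rough per-direction bandwidth of a HyperTransport link in MB/s.

    DDR signaling moves data twice per clock. The 16-bit default width
    is an assumption, not a figure taken from the article.
    """
    mega_transfers = clock_mhz * 2
    return mega_transfers * link_width_bits // 8


old_link = ht_bandwidth_mb_s(1000)  # 1 GHz DDR
new_link = ht_bandwidth_mb_s(1800)  # 1.8 GHz DDR

print(f"1 GHz link:   {old_link} MB/s per direction")
print(f"1.8 GHz link: {new_link} MB/s per direction")
print(f"Increase:     {new_link / old_link}x")
```

Whatever the actual link width, the ratio holds: moving from 1 GHz to 1.8 GHz gives 1.8 times the bandwidth in each direction.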
Notice that AMD is swinging at Intel with a 65nm process when Intel’s already enjoying the fruits of a 45nm process. How is it possible for the Phenom to compete given an inherent disadvantage like that? Interestingly enough, each of a Phenom processor’s four cores is able to operate using independent clock speeds and voltages, continually optimizing for the load you’re putting on the chip. AMD tags the Phenom 9600 Black Edition with a 95W TDP, less than that of Intel’s Core 2 Quad Q6600, the Intel chip closest to the Phenom in terms of price.
Of course, we all know that tech specs and architecture are great for coming up with theoretical guesstimates of how a given chip should perform or compare against its competition. The rubber meets the road when you get down and dirty with the hardware in real-world benchmarks.