Intel Core 2 Extreme QX9650 Penryn Performance Preview
October 28, 2007 Brandon Sandman Bell

Summary: Thanks to its new 45-nm manufacturing process, you won’t believe how far we overclocked this processor! Oh, and there are also four processing cores with a 1333MHz FSB, 12MB of L2 cache, and SSE4. Read on to see how Intel's latest CPU performs in upcoming games like Crysis and Call of Duty 4!


While NVIDIA’s GeForce 8800 GPU definitely comes close, you can make an argument that Intel’s Core 2 CPU was the most significant new hardware release in 2006.

Core 2 was largely designed around Intel’s mobile “Yonah” CPU core, itself derived from the Pentium M, with several new performance enhancements. The chip featured a wider execution core, allowing the processor to complete up to four full instructions simultaneously (previous Pentium D CPUs were limited to just three), and a shorter 14-stage pipeline, which let the CPU perform more work per clock cycle.

If you recall, this was one of the chief weaknesses in Core 2’s predecessor, Pentium 4/D. Previous Pentium processors sacrificed the amount of work performed per clock for more pipeline stages, 31 in the case of later Pentium D processors. As we all know by now, this design decision ultimately came back to haunt Intel when the Pentium 4 had trouble scaling to higher clock speeds…

For increased efficiency, Core 2 utilized a single, unified L2 cache, while more advanced prefetchers in the L1 and L2 caches were added along with new cache prefetch algorithms to help hide memory latency and thus improve the effectiveness of the L2 cache. To further spice up the package, Core 2 also boasted improved performance when dealing with SSE, SSE2, and SSE3 instructions.

As a result of all these changes, Core 2 was not only considerably faster than Intel’s previous Pentium processor, it also significantly outperformed AMD’s fastest Athlon 64 X2 and FX processors, all while drawing very little power. It truly was a breakthrough product that shook up the entire PC industry.

And now it’s time for Intel’s engineers to give Core 2 its midlife upgrade – just in time for the company to put a damper on AMD’s upcoming quad-core Phenom launch…


Introducing Penryn: the next generation of Core 2 processors

As you probably know by now, Penryn comprises Intel’s family of processors based on their new 45-nm manufacturing process. The smaller process allows Intel to cram more transistors into the processor’s die without significantly increasing its size. According to Intel, the new 45-nm high-k process gives them twice the transistor budget, which allows them to add performance-enhancing features such as larger L2 caches while still delivering a cost-effective die size. For example, a dual-core Penryn chip boasts a die size of 107mm² with 410 million transistors; in comparison, today’s Core 2 chips cram 291 million transistors into a 143mm² die. Of course, the other appeal of the smaller process to enthusiasts who overclock is lower power: Intel notes a 30% reduction in transistor switching power between 65-nm and 45-nm. With lower power requirements also comes less heat generated by the CPU, resulting in a cooler-running PC.
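
To put the "twice the transistor budget" claim in perspective, the die figures quoted above can be turned into transistor densities. The short Python sketch below is just our own back-of-the-envelope arithmetic, not anything supplied by Intel; the function name is made up for illustration, and the only inputs are the transistor counts and die sizes quoted in this article.

```python
def density_per_mm2(transistors_millions: float, die_area_mm2: float) -> float:
    """Return transistor density in millions of transistors per square millimeter."""
    return transistors_millions / die_area_mm2


# Dual-core figures quoted in the article.
core2_65nm = density_per_mm2(291, 143)   # today's 65-nm Core 2
penryn_45nm = density_per_mm2(410, 107)  # dual-core 45-nm Penryn

# How many times denser the 45-nm die is than the 65-nm die.
ratio = penryn_45nm / core2_65nm

print(f"65-nm Core 2: {core2_65nm:.2f}M transistors per mm^2")
print(f"45-nm Penryn: {penryn_45nm:.2f}M transistors per mm^2")
print(f"Density ratio: {ratio:.2f}x")
```

By this rough measure the 45-nm part packs a little under twice as many transistors into each square millimeter, which lines up reasonably well with Intel's claim.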

Penryn is more than just a die shrink though. Intel has incorporated a number of architectural enhancements into Penryn that are designed to deliver clock-for-clock performance gains over today’s Core 2 CPUs.

Fast Radix-16 divider: One key new technology Intel has incorporated into Penryn is a faster divider. The Radix-16 design doubles divider speed over previous processors when handling math computations (both floating-point and integer operations), processing 4 bits per cycle versus 2 bits per cycle in today’s processors.
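
For readers curious why retiring more bits per cycle matters, here is a minimal Python sketch of digit-by-digit division. It is a toy model under our own assumptions, not Intel's circuit: the function name and the 32-bit operand width are made up for illustration, and where real hardware picks each quotient digit with dedicated selection logic, this sketch simply uses Python's integer division on the partial remainder.

```python
def digit_recurrence_divide(dividend: int, divisor: int,
                            bits_per_step: int, width: int = 32):
    """Divide two unsigned integers, retiring `bits_per_step` quotient bits per step.

    Returns (quotient, remainder, steps). Toy model only: the quotient digit
    is chosen with Python's // operator rather than hardware lookup logic.
    """
    if divisor <= 0:
        raise ValueError("divisor must be a positive integer")
    if not 0 <= dividend < (1 << width):
        raise ValueError("dividend must fit in `width` bits")

    digit_mask = (1 << bits_per_step) - 1
    steps = -(-width // bits_per_step)  # ceiling division
    quotient = 0
    remainder = 0
    for step in range(steps):
        # Bring down the next group of dividend bits, most significant first.
        shift = (steps - 1 - step) * bits_per_step
        remainder = (remainder << bits_per_step) | ((dividend >> shift) & digit_mask)
        # Select one quotient digit in the range 0 .. 2**bits_per_step - 1.
        digit = remainder // divisor
        remainder -= digit * divisor
        quotient = (quotient << bits_per_step) | digit
    return quotient, remainder, steps


# Radix-4 retires 2 bits per step; Radix-16 retires 4 bits per step.
for name, bits in (("Radix-4", 2), ("Radix-16", 4)):
    q, r, steps = digit_recurrence_divide(1_000_003, 97, bits)
    print(f"{name}: quotient={q}, remainder={r}, steps={steps}")
```

Both runs produce the same quotient and remainder, but the Radix-16 version needs half as many steps for the same operand width. Actual instruction latencies include other overhead, so treat the step count as an illustration of the idea rather than a cycle-accurate figure.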

SSE4: Penryn will also support Intel’s new SSE4 instruction set. The majority of the new instructions are focused on compiler optimizations, but Intel has also added a number of “application targeted accelerators” which are hard-coded onto the processor’s die to improve performance in gaming, video encoding, 3D rendering, and photo imaging apps (provided that the software has been coded to use the new instructions of course).

Super Shuffle Engine: Penryn incorporates a 128-bit wide, single-pass shuffle unit. This allows it to perform full-width shuffles in a single cycle. The new shuffle unit will also improve Penryn’s performance with SSE2, SSE3, and SSE4 instructions that have shuffle-like operations.

Improved Virtualization: Penryn also features Intel’s enhanced virtualization technology. Intel claims virtual machine transition times have been improved by 25-75% with Penryn.

Larger L2 cache: Penryn processors will feature a considerably larger, more associative L2 cache. Dual-core Penryn CPUs will ship with up to 6MB of L2 cache while quad-core processors will contain up to 12MB of L2 cache. In comparison, today’s dual-core Core 2 CPUs ship with 4MB of L2 cache, while quad-core chips contain 8MB.

These larger caches help improve performance by increasing the probability that each execution core can access data from the processor’s L2 cache rather than having to get it from slower system memory.

Faster Clock Speeds/FSB: Intel has already bumped the front-side bus (FSB) speed up to 1333MHz; Penryn will crank this up another notch, ultimately scaling all the way up to 1600MHz. Penryn CPUs will also boast higher clock speeds, with speeds of 3.0GHz and up expected.
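
To give a rough sense of what the FSB bump means, the sketch below converts the 1333MHz and 1.6GHz (1600MHz) FSB speeds mentioned above into peak theoretical bandwidth. It assumes the FSB's 64-bit (8-byte) data path, which is not stated in this article, and treats the quoted FSB speed as the effective transfer rate; the function name is ours, and real-world throughput will be lower than these peak numbers.

```python
FSB_WIDTH_BYTES = 8  # assumption: 64-bit FSB data path (not stated in the article)


def fsb_peak_bandwidth_gbs(effective_mhz: float) -> float:
    """Peak theoretical FSB bandwidth in GB/s (1 GB = 10**9 bytes)."""
    return effective_mhz * 1e6 * FSB_WIDTH_BYTES / 1e9


for speed_mhz in (1333, 1600):
    print(f"{speed_mhz}MHz FSB: about {fsb_peak_bandwidth_gbs(speed_mhz):.1f} GB/s peak")
```

Under those assumptions, moving from a 1333MHz to a 1600MHz FSB raises the peak from roughly 10.7 GB/s to 12.8 GB/s, a gain of about 20%.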

In the coming months, Intel will be introducing several Penryn derivatives for the mobile, server, and desktop segments of the PC market. Today we’re going to be focusing on Intel’s latest enthusiast CPU for the desktop segment: the Core 2 Extreme QX9650.



Core 2 Extreme QX9650 Up Close (Page 2 of 14)

Introducing Yorkfield: the first 45-nm Penryn Core 2 CPU

Intel’s first Penryn variant on the desktop will be a quad-core chip previously referred to by its codename “Yorkfield”, which will be officially branded as the Core 2 Extreme QX9650 when it goes on sale in a matter of weeks on November 12th. With its large 12MB L2 cache, the chip contains 820 million transistors and runs at 3.0GHz. Early next year Intel will roll out the dual-core equivalent to Yorkfield, codenamed “Wolfdale”.

Like Intel’s current quad-core processors, Yorkfield essentially consists of two dual-core Wolfdale processors which have been grafted together onto the same package and linked together by the FSB. Unfortunately we don’t have a picture of Penryn without its heat spreader attached, but this photo of Intel’s quad-core Kentsfield core from last year nicely illustrates how the two CPU dies are situated:

[Image: Intel’s quad-core Kentsfield with the heat spreader removed, showing the two CPU dies]

With the heat spreader in place, the Core 2 Extreme QX9650 looks just like any other Core 2 CPU:

[Image: Core 2 Extreme QX9650 with the heat spreader in place]

Here are the specs on the Core 2 Extreme QX9650:
  • 1333MHz FSB
  • 3.0GHz clock speed with four processing cores (quad-core)
  • 12MB L2 cache (2x6MB)
  • 45-nm high-k metal gate transistor technology
  • 214mm² die size, 820M transistors
  • SSE4 instructions
  • 130W Thermal Design Power, C-Stepping
  • Overspeed protection (clock multiplier) removed
  • Enhanced Intel SpeedStep Technology (EIST)
  • Intel 64 Technology
  • Intel Virtualization Technology (VT)
  • Supports Execute Disable Bit (XD)
  • LGA-775 socket interface


    Motherboard Compatibility

    In order to run a Penryn processor you’ll need a motherboard based on the P35 or X38 chipsets from Intel; NVIDIA’s nForce 680i SLI chipsets are also 100% compatible with Penryn.

    If you’re running a P35 or nForce 680i SLI motherboard you will need to update the BIOS on your motherboard to the latest version. In the past few weeks most motherboard manufacturers with nForce 680i and P35 motherboards have quietly released new BIOS revisions with proper support for Penryn CPUs. It also wouldn’t hurt for X38 owners to ensure that their motherboard’s BIOS supports Penryn as well.

    Once your BIOS is flashed and set properly, you should be good to go.

    CPU Cooling

    In terms of cooling, the Core 2 Extreme QX9650 runs much cooler than its predecessor, the Core 2 Extreme QX6850. In our testing the QX9650 hits anywhere from 25 to 29 degrees Celsius at load, depending on the application you’re running. Running those same apps with our Scythe Ninja cooler and Arctic Silver 5, the QX6850’s temps ranged from 40 to 45 degrees Celsius.

    Even when overclocked, the QX9650 ran cooler than the 65-nm QX6850 CPU running at stock speeds!

    Overclocking

    Overclocking is one “feature” previous Core 2 CPUs have excelled at; our CPU Overclocking Database is filled with entries from FiringSquad readers who have managed to overclock their CPUs by 50% or more with no problems!

    With its 45-nm manufacturing process, we were eager to see how far we could push our Core 2 Extreme QX9650. Fortunately it didn’t disappoint us. Check out the following screenshot:

    [Image: screenshot of our overclocked Core 2 Extreme QX9650]

    Keep in mind we accomplished this speed with the aforementioned Scythe Ninja CPU cooler and we’re running in an open air environment outside the case. We’ve included full benchmarks at 4.1GHz later in this review…



    System Setup (Page 3 of 14)

    System Setup


    Processors:
    AMD Athlon 64 X2 6000+ (3.0GHz)
    Intel Core 2 Extreme QX9650 (3.0GHz)
    Intel Core 2 Extreme QX6850 (3.0GHz)
    Intel Core 2 Extreme QX6700 (2.67GHz)
    Intel Core 2 Duo E6750 (2.66GHz)
    Intel Core 2 Duo E6700 (2.67GHz)

    Motherboards:
    ASUS M2N32-SLI Premium
    ASUS P5E3 Deluxe Wi-Fi AP

    Memory:
    2GB Corsair TWIN2X2048-6400C3
    2GB OCZ DDR3-1800 Platinum Series

    Graphics card:
    EVGA GeForce 8800 Ultra w/ForceWare 167.37

    Operating system:
    Windows XP Professional with Service Pack 2
    DirectX 9.0c


    Benchmarks

    LAME MT MP3 Encoding (MS Compiler)
    DivX Converter
    Windows Media Encoder 9
    F.E.A.R. 1.08
    Supreme Commander 1.1.3269
    Enemy Territory: Quake Wars
    Oblivion
    Company of Heroes 1.71
    Crysis Demo
    Call of Duty 4 Demo
    Lost Planet
    Half-Life 2 Episode Two
    Cinebench 10




    Media Encoding and Rendering (Page 4 of 14)

    Microsoft Windows Media Encoder 9



    LAME MT MP3 Encoding



    Cinebench 10



    Valve Particle Simulation benchmark



    Notes

    In our conventional media encoding and rendering tests, the added cache present in the Core 2 Extreme QX9650 allows it to shave some time off common tasks such as encoding a 200MB 1080p WMV-HD file with Windows Media Encoder, or converting a 200MB WAV file into a 128kbps MP3. The most substantial gain we saw, though, was in Cinebench 10, where the Core 2 Extreme QX9650 ran 8% faster than the QX6850. However, there is one SSE4 app

    SSE4 Testing



    To see the impact SSE4 can have on performance, we loaded up the latest build of VirtualDub and DivX. In this test we use VirtualDub to convert a 300MB file into DivX. The beauty of this test is that we can toggle back and forth between SSE2 and SSE4 with the Core 2 Extreme QX9650, allowing us to see what these new instructions can bring when the app is coded properly. As you can see, the results are pretty substantial, shaving more than a full minute off our time when SSE4 is enabled: 3 minutes, 11 seconds versus 4 minutes, 20 seconds! You can also compare the Yorkfield SSE2 results with the Kentsfield QX6850 results to see how much of an impact the larger cache has on performance.
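
For anyone who prefers percentages to stopwatch times, the two encode times quoted above are easy to convert. The Python sketch below is simple arithmetic on those two figures only; the helper name is ours, and nothing here is measured beyond what the article already reports.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a minutes/seconds stopwatch reading into total seconds."""
    return minutes * 60 + seconds


sse2_time = to_seconds(4, 20)  # QX9650 with SSE2 code path
sse4_time = to_seconds(3, 11)  # QX9650 with SSE4 code path

seconds_saved = sse2_time - sse4_time
reduction_pct = seconds_saved / sse2_time * 100  # less wall-clock time
speedup = sse2_time / sse4_time                  # relative encoding throughput

print(f"Time saved: {seconds_saved} seconds")
print(f"Encode time reduced by {reduction_pct:.1f}%")
print(f"Throughput speedup: {speedup:.2f}x")
```

Note that the two percentages answer different questions: the encode finishes in about a quarter less time, which works out to roughly a third more work done per second.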



    Supreme Commander (Page 5 of 14)

    Supreme Commander







    Notes

    We’ve never been big fans of benchmarking with Supreme Commander. The built-in tests just don’t do a good job of reflecting the performance impact of quad-core CPUs, something this game definitely can take advantage of, though you’d never know it from the built-in results presented above. We’ve passed along some suggestions to Gas Powered Games on how the built-in test can be improved; hopefully one of these days they can get it tweaked properly.



    Call of Duty 4 (Page 6 of 14)

    Call of Duty 4





    Notes

    We honestly weren’t expecting the quad-core CPUs to show any improvements over their dual-core counterparts in Call of Duty 4. Then we remembered that Infinity Ward worked closely with Intel a few years back to bring dual-threading support to Call of Duty 2. Based on our results here today, it looks like Call of Duty 4 is definitely multithreaded: at 800x600 the Core 2 QX6700 ran 14% faster than the E6700! Keep in mind that we’re testing this game with FRAPS, and the scene that we use for testing changes from run to run, so in one run an RPG may land in front of you, whereas in the next it will fly harmlessly over your head. Friendly and enemy AI performs differently each time as well.

    To help mitigate this problem we perform five runs per resolution, but the margin of error is still higher than we’d normally like, around 5% generally, so subtracting 5% from the 14% figure above gives a more realistic estimate of roughly 9%.
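
The averaging and discounting described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the exact procedure or spreadsheet we use: the function names are made up, and the frame rates in the example are hypothetical placeholders chosen to reproduce a 14% gain, not our measured FRAPS results.

```python
from statistics import mean

MARGIN_OF_ERROR_PCT = 5.0  # the rough run-to-run margin quoted above


def percent_gain(baseline_runs, contender_runs):
    """Percentage gain of the contender's average fps over the baseline's."""
    return (mean(contender_runs) / mean(baseline_runs) - 1.0) * 100.0


def conservative_gain(gain_pct, margin_pct=MARGIN_OF_ERROR_PCT):
    """Discount a measured gain by the margin of error, never going below zero."""
    return max(gain_pct - margin_pct, 0.0)


# Hypothetical average fps from five runs each; NOT our measured data.
dual_core_runs = [100.0, 98.0, 103.0, 99.0, 100.0]
quad_core_runs = [114.0, 112.0, 117.0, 113.0, 114.0]

gain = percent_gain(dual_core_runs, quad_core_runs)
print(f"Measured gain: {gain:.1f}%")
print(f"Conservative gain: {conservative_gain(gain):.1f}%")
```

Subtracting the full margin is a deliberately pessimistic shortcut; it simply keeps us from reading too much into differences that could be run-to-run noise.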

    The Core 2 Extreme QX9650 delivered a strong performance here, running 12% faster than the QX6850 in our testing. This can all be attributed to the larger cache, as none of the games on the market today take advantage of SSE4.

    As always, once you crank up the screen resolution and turn on AA/AF the GPU becomes a bottleneck and as a result, all the systems tested here perform the same.



    Company of Heroes (Page 7 of 14)

    Company of Heroes





    Notes

    The Core 2 Extreme QX9650 put up a strong showing in our tests with Company of Heroes, running 16% faster than the QX6850 at 800x600.



    Lost Planet (Page 8 of 14)

    Lost Planet





    Notes

    Lost Planet is one of the few shooters that have been designed with multi-core in mind. By changing the “concurrent operations” setting from 2 to 4 for quad-core CPUs, you can see how the quad-core CPUs scale in the Lost Planet cave demo, which consists of hundreds of creatures flying randomly across the scene. The CPU is responsible for handling the AI routines for each one of them.

    The QX6700 runs about 7% faster than the E6700, while the Core 2 Extreme QX9650 ran about 9% faster than the Kentsfield QX6850 in this benchmark.



    Half-Life 2 Episode Two (Page 9 of 14)





    Notes

    Episode Two is another FPS that has been designed for multi-core. In our demo sequence we use the gravity gun to fling objects at explosive barrels while Alyx is busy engaging them as well. All these physics calculations obviously pushed our quad-core CPUs. Unfortunately, on Thursday night Valve patched the game to version 12 and made our demo obsolete, so we were unable to test the Athlon 64 X2 6000+ with the game.

    In any case, our results give a 4% edge to the quad-core QX6700 over the E6700 CPU, while the Core 2 Extreme QX9650 outperformed the QX6850 by 13% in our tests.



    F.E.A.R. (Page 10 of 14)





    Notes

    The Core 2 Extreme QX9650 doesn’t seem to scale as well in F.E.A.R., at least in comparison to games like Company of Heroes and Episode Two. Still, at 800x600 we saw a 9% performance improvement over the QX6850.



    Crysis (Page 11 of 14)







    Notes

    For our Crysis testing, we’re using the built-in CPU test that ships with the game’s demo. In this test rockets are lobbed at buildings and other objects in the island level. As a result, debris is flying around everywhere. All these physics calculations really push the CPU, which is why we were eager to see how the QX9650 performed.

    The Core 2 Extreme QX9650 ran 11% faster than the QX6850 in our testing with Crysis. That’s right in the middle of the range of gains we saw in the other apps we tested. We also saw a slight gain of 2% when going from a dual-core to a quad-core CPU.



    World in Conflict (Page 12 of 14)







    Overclocked Performance and Power Consumption (Page 13 of 14)











    Power Consumption




    Notes

    To really see how far we could push the performance envelope, we not only overclocked the Core 2 Extreme CPU to 4.1GHz, but also cranked up the DDR3 speed to 1800MHz (with higher latency of course). We were so giddy about hitting these speeds that we’re debating doing a full article with more benchmarks, but we’ve got quite a few new items on the testbeds nowadays…



    Conclusion (Page 14 of 14)


    Performance improves even more dramatically in apps that are SSE4 enabled. If you haven’t checked out our performance tests with VirtualDub and DivX on page 4, we highly suggest you take a look at the scores at the bottom of that page.

    Meanwhile, enthusiasts who overclock will love Penryn’s scaling potential. In our testing, the Core 2 Extreme QX9650 chip ran 15-20 degrees cooler than its predecessor, the Core 2 Extreme QX6850. We have a feeling this really helped us when it came to overclocking the processor.

    With the Core 2 Extreme QX9650 Intel has basically established new benchmarks in performance and power consumption: the performance per watt of this chip is simply through the roof! This is easily the best processor in the world right now, bar none.

    All this performance certainly won’t come cheap though. Since it’s an Extreme CPU, we would be shocked if the Core 2 Extreme QX9650 sold for anything less than $999, which is the price Intel has charged for Extreme processors for several years now. Spending $1,000 on a product that’s going to be obsolete in a matter of months is never a wise investment. We think the majority of our readers would be best served by waiting for the prices to come down a little. Remember, earlier this year the Core 2 Q6600 sold for $851. Today you can find the processor selling for under $300!

    Now we eagerly await AMD’s counter to Penryn – Phenom. Intel has set the bar high with the Core 2 Extreme QX9650, so AMD will have their work cut out for them, but we have a strong feeling that they’re up for the challenge. Of course, based on how easily we were able to OC our QX9650 chip we wouldn’t be surprised if Intel countered with another Yorkfield part, but we’ll just have to wait and see how everything plays out…


© Copyright 2003 FS Media, Inc.