FiringSquad: Home of the Hardcore Gamer - Games, Hardware, Reviews and News

  
Exclusive Athlon 600 Preview
July 22, 1999   Kenn Hwang
Instruction Prefix

Why is the Athlon a speed champ?

So now let's look at what any self-respecting next-generation technology needs to make a fast CPU. First and foremost, we need an efficient (and fast) way to decode slow x86 instructions into a fast, efficient RISC-like set. The Athlon has the ability to decode up to 6 CISC instructions per clock cycle.

This is nothing new. CISC-to-RISC conversion has been common since late 1995, with Intel's introduction of the Pentium Pro (and with NexGen's Nx586!). Beyond the more esoteric talk of "superpipelining," "dual independent bus," and "dynamic execution," a major distinction between the Pro and the original Pentium CPU was that it converted long, complex CISC operations into easier-to-manage RISC micro-operations, or micro-ops.

CISC has its benefits

The main benefit of converting to RISC is uniform instruction length. As the name implies, CISC instructions tend to be long, variable-length affairs. Moving to RISC allows a CPU to sort and schedule operations more efficiently, especially when dispatching ops out of order to individual execution units. There are drawbacks, however. First and foremost, decoding to a reduced instruction set incurs a performance hit, and is probably the single biggest bottleneck in current processors. The faster your CPU can decode, the faster it will run.
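To see why uniform length matters, here is a minimal Python sketch. It assumes a made-up toy encoding in which the first byte of each instruction holds its total length; this is not real x86 encoding, and the function names are hypothetical. With variable lengths, the start of each instruction is only known after the previous one has been examined. With a fixed width, every boundary can be computed up front.

```python
def split_variable(stream: bytes) -> list[bytes]:
    """Split a toy variable-length stream into instructions.

    Assumption (toy encoding, not x86): byte 0 of each instruction
    is its total length in bytes. Boundaries must be found one after
    another, because each start depends on the previous length.
    """
    ops, pos = [], 0
    while pos < len(stream):
        length = stream[pos]
        if length < 1 or pos + length > len(stream):
            raise ValueError(f"bad length byte at offset {pos}")
        ops.append(stream[pos:pos + length])
        pos += length
    return ops


def split_fixed(stream: bytes, width: int) -> list[bytes]:
    """Split a fixed-width stream; every boundary is simply i * width."""
    if width < 1 or len(stream) % width:
        raise ValueError("stream is not a whole number of instructions")
    return [stream[i:i + width] for i in range(0, len(stream), width)]
```

The fixed-width version has no dependency between one instruction and the next, which is the property that makes sorting and scheduling easier once the ops are in RISC form.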

The important thing to note is that the exact number of instructions decoded per clock depends on what kind of code is pumped through the processor. Just as some code is particularly suited (or optimized) for 3DNow!, certain types of code require fewer or more cycles to decode. Realistically, it looks like the Athlon will be able to decode between 2.5 and 3 real-world instructions per clock cycle.
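A rough way to picture how the code mix drags the real-world rate below the peak is the toy model below. The rule it uses is an assumption made purely for illustration and is not AMD's documented decoder behavior: "simple" instructions decode side by side up to the decoder width, while a "complex" instruction takes a cycle to itself. The function name and the `width` parameter are hypothetical; the default of 6 is just the peak figure quoted earlier.

```python
def decode_rate(is_complex: list[bool], width: int = 6) -> float:
    """Average instructions decoded per clock under a toy rule.

    Assumed rule (illustrative only): up to `width` consecutive simple
    instructions decode in one cycle; a complex instruction decodes
    alone in its own cycle.
    """
    if width < 1:
        raise ValueError("width must be at least 1")
    if not is_complex:
        return 0.0
    cycles, i, n = 0, 0, len(is_complex)
    while i < n:
        cycles += 1
        if is_complex[i]:
            i += 1          # complex instruction occupies the whole cycle
            continue
        taken = 0
        while i < n and taken < width and not is_complex[i]:
            i += 1          # pack simple instructions into this cycle
            taken += 1
    return n / cycles
```

For example, twelve simple instructions in a row reach the full width, while a single complex instruction in the middle of a short run of simple ones splits it into three cycles and cuts the average sharply.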

Instruction Execution

Once x86 instructions have been "decoded" to RISC, they have to be buffered before they can be executed by the CPU. Instruction operands are handled by "execution units." This is where you hear a lot of low-level microprocessor buzzwords. High-performance CPUs such as the K6-3, Athlon, and Pentium III (all the way back to the Pentium Pro in fact) make use of multiple execution units for parallel processing of certain operations.

This is what is known as a superscalar instruction pipeline. The Athlon has a 9-issue superscalar architecture for its general functions. Its execution units break down into 3 pipes for integer operations, 3 floating point units, and 3 addressing units. Compare this to the 2-issue pipelining on the Pentium II processor, which executes in 12 stages.
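The 9-issue figure is a peak that depends on the mix of work. As a back-of-the-envelope sketch in Python, assume every op is independent and each unit accepts one op per cycle (both simplifying assumptions; real scheduling also has to respect dependencies and latencies). The unit counts come from the breakdown above, while the names `UNITS` and `cycles_to_issue` are hypothetical.

```python
from collections import Counter

# Unit counts from the article: 3 integer, 3 floating point, 3 addressing.
UNITS = {"int": 3, "fp": 3, "addr": 3}


def cycles_to_issue(ops: list[str], units: dict[str, int] = UNITS) -> int:
    """Fewest cycles needed to issue `ops`, assuming all ops are
    independent and each unit takes one op per cycle (a simplification)."""
    counts = Counter(ops)
    unknown = set(counts) - set(units)
    if unknown:
        raise ValueError(f"unknown op kinds: {sorted(unknown)}")
    # The busiest unit type sets the pace; -(-a // b) is ceiling division.
    return max((-(-counts[kind] // n) for kind, n in units.items()), default=0)
```

Nine ops split evenly across the three unit types can go out in one cycle, but nine integer ops need three cycles, because only the 3 integer pipes can take them.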

Buffering Data

With so many execution units, it would seem difficult for the processor to keep each one filled and fully utilized. While this usually isn't the limiting factor in determining CPU performance, your processor won't be achieving its full potential if its execution units aren't being kept full. Think of it like a 6-lane freeway. It can carry a single continuous lane of cars, but it can handle 6 lanes of continuous traffic just as well, and in the latter case you're getting a lot more processing accomplished.

This is where L1 cache and internal buffering come into play. The Athlon contains 72 buffers to store RISC86 micro-ops waiting to be fed to its 9 execution units. By pushing instructions into the buffer, the Athlon should be able to keep its pipelines full and churning at full speed.
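A small simulation can illustrate why a deep buffer helps when the decoder delivers ops in bursts. This is only a sketch under stated assumptions: the bursty decode pattern, the one-op-per-unit-per-cycle rule, and the function name `cycles_to_drain` are all made up for illustration. Only the 72 buffer entries and the 9 execution units come from the text above.

```python
from itertools import cycle


def cycles_to_drain(total_ops: int, decode_pattern: list[int],
                    capacity: int, units: int = 9) -> int:
    """Cycles needed to execute `total_ops` micro-ops.

    Each cycle the units first issue up to `units` buffered ops, then the
    decoder refills the buffer with as many ops as it can supply this
    cycle (taken from the repeating `decode_pattern`) and as will fit.
    """
    if capacity < 1 or units < 1 or not decode_pattern or max(decode_pattern) < 1:
        raise ValueError("need positive capacity, units, and decode supply")
    supply = cycle(decode_pattern)
    remaining, buffered, done, cycles = total_ops, 0, 0, 0
    while done < total_ops:
        cycles += 1
        issued = min(units, buffered)       # execution units drain the buffer
        buffered -= issued
        done += issued
        decoded = min(next(supply), capacity - buffered, remaining)
        buffered += decoded                 # decoder refills what fits
        remaining -= decoded
    return cycles


# Decoder alternates between a burst of 18 ops and nothing (hypothetical).
shallow = cycles_to_drain(72, [18, 0], capacity=9)
deep = cycles_to_drain(72, [18, 0], capacity=72)
```

In this toy setup the 9-entry buffer can only absorb half of each burst, so the units sit idle every other cycle and the run takes 16 cycles. With 72 entries the whole burst is kept, the units stay busy through the gaps, and the same work finishes in 9 cycles.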

 Quick Fact
VLIW: Very Long Instruction Word

EPIC: Explicitly parallel instruction computing


© 1998-2012 FS Media, Inc. All Rights Reserved