Improving Performance
The Equation
The average performance of a processor can be summarized with a simple equation:
Cycles/Second * AverageInstructions/Cycle = AverageInstructions/Second
If an application is composed of roughly the same number of instructions on different architectures (which isn't always the case, but we'll assume so for now), then this equation is a good yardstick of performance. Unlike cycles/second (the clock rate, measured in MHz), which is fixed for each processor, the average number of instructions completed each cycle varies according to the code being processed and the ability of the system to feed information to the CPU at a fast enough rate. That's why benchmarks are necessary.
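As a rough illustration of the equation, the sketch below compares two made-up chips. The function name and the clock and instructions-per-cycle figures are assumptions chosen for the example, not measurements of any real processor.

```python
def instructions_per_second(clock_hz: float, avg_instructions_per_cycle: float) -> float:
    """Cycles/second * average instructions/cycle = average instructions/second."""
    return clock_hz * avg_instructions_per_cycle


# Hypothetical chips: one with a faster clock, one that completes more work per cycle.
fast_clock = instructions_per_second(600e6, 1.0)  # 600 MHz, 1.0 instructions/cycle
wide_issue = instructions_per_second(400e6, 2.0)  # 400 MHz, 2.0 instructions/cycle

print(f"{fast_clock / 1e6:.0f} vs {wide_issue / 1e6:.0f} million instructions/second")
# prints "600 vs 800 million instructions/second"
```

In this made-up case the chip with the slower clock comes out ahead, which is why clock speed alone says little without the second term of the equation.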
![Spec Graph](images/spec_graph-s.jpg) Spec Graph
The most important benchmark for CPU speed is the Standard Performance Evaluation Corporation (SPEC) battery of tests. The two most widely used tests are SPECint95 and SPECfp95, which are, respectively, the integer and floating-point tests released in 1995.
SPEC is an independent association formed by many hardware vendors to supply meaningful yardsticks of performance, composed of code that resembles real-life applications. The tests are relatively small and are designed not to be affected by any factors other than the speed of a CPU and its memory system. All official SPEC results are available from www.spec.org for your perusal.
Improving Performance
The single most important aspect of a CPU in a high-end workstation is its floating-point performance. The vast majority of calculations performed by a CAD, engineering, 3D modeling, video editing, or scientific application are floating point. Integer speed is important for many generic PC tasks, like business software, but floating point is much more important to high-performance users.
As the equation above shows, there are two ways to improve performance. The most obvious way is to increase the clock speed by engineering a chip in a way that minimizes internal latencies and by using smaller wires and transistors on the silicon. Putting on-chip devices closer together reduces the problem posed by the finite speed of electricity (which doesn't really travel at the speed of light). Increasing clock speed also increases the power input and heat output of a chip, two things that are never desired in a computer.
The second method of improving performance is to take advantage of Instruction Level Parallelism (ILP). The original microprocessors would take four or five cycles to complete an instruction, since this is the number of steps an instruction must go through to reach completion. Pipelining reduced the average cost to one or two cycles per instruction. It still takes five cycles to complete any one instruction, but with a pipeline, the processor does not have to wait for one instruction to finish before starting work on the next one.
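The effect of pipelining on cycle counts can be sketched with a little arithmetic. The code below is a simplified model, assuming an ideal five-stage pipeline that starts one new instruction every cycle and never stalls; the function names are made up for this illustration. Real code stalls on branches and memory, which is why the practical average lands nearer one or two cycles per instruction.

```python
def cycles_unpipelined(n_instructions: int, stages: int = 5) -> int:
    """Each instruction runs through every stage before the next one starts."""
    return n_instructions * stages


def cycles_pipelined(n_instructions: int, stages: int = 5) -> int:
    """Ideal pipeline (assumed: no stalls): the first instruction takes
    `stages` cycles, then one more instruction completes every cycle."""
    if n_instructions == 0:
        return 0
    return stages + (n_instructions - 1)


for n in (1, 10, 1000):
    before = cycles_unpipelined(n)
    after = cycles_pipelined(n)
    print(f"{n} instructions: {before} cycles unpipelined, "
          f"{after} pipelined ({after / n:.2f} cycles/instruction)")
```

A single instruction still needs all five cycles either way, but as the instruction count grows, the pipelined cost per instruction in this model approaches one cycle.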