Branch Prediction
The brains in a computer
Without the ability to make decisions, computers would be no more than very fast calculators. The instructions that ask a processor to make a choice are called branch instructions. The reason for that is simple, each branch instruction represents a fork in the stream of execution. A branch instruction reads something like "If register 1 is less than register 2, go to line 0x00120000". Obviously, branch instructions occur frequently in many programs, and so their efficient handling can make a big difference in execution time.
Since the result returned by a branch instruction changes what instructions come after it, the branch poses the threat of keeping the entire processor idle while the information for the branch is fetched, the worst kind of pipeline stall. The simple way to handle the situation is, upon the reception of a branch command, to halt all other execution after the branch command, and to wait as many clock cycles as necessary for the processor to decide which direction to go. Unfortunately, this completely invalidates the gains of a pipelined processor when executing code with many branch decisions.
The solution that the G4 uses (as well as most other processors) is branch prediction. Every time the G4 hits a branch instruction, the processor looks at a table that records the results of however many times that particular branch was executed. From this table, the processor makes a guess as to which direction the branch will lead. It then starts to execute the instructions on that path, being very careful to not commit any changes to registers or memory. Meanwhile, the Branch Processing Unit decides which path the branch really takes. If the processor guessed correctly, then the instructions in the pipeline are completed and their results permanently committed. If not, the processor flushes all of the work in progress, backs up to the branch instruction, and begins executing down the correct path of instructions.
![The G4 Processor: Under the Hood [ Component Overlay of the G4 Processor @ 389 x 504 ] > View Full-Size in another window.](images/7400_overlay-s.jpg) Component Overlay of the G4 Processor
|
Driving with your mother
If this sounds complicated, it is. The best analogy to think of is driving a car with… say… your mother. Pretend you're driving to a party while your mother is reading a map and navigating. Every time you hit an intersection, you ask your mother which way to turn. Being in a hurry, you don't wait for your mother to look up from the map and answer you, instead you make an educated guess and drive in that direction. Eventually, sometime past the intersection, your mother gives you the correct answer. If she says turn right, and you had turned right, then you have saved whatever time you would have otherwise spent idling at the intersection.
However, if you were wrong, you then have to back up your car and go the other direction. The situation is better for a processor, however, since no matter how many instructions it has executed on an incorrect prediction, it can back up and start execution on the right path with only one extra clock cycle.
In real life programming situations, such as a loop, a branch instruction will repeatedly answer one way until the very last time. In those situations, branch prediction can greatly speed computation.