Apple jumps on the SIMD Bandwagon
No one can be told what the Matrix is.
I mentioned before how the AltiVec execution unit gave the G4 the bragging rights of 1 gigaFLOP of performance. How does this work? Like 3DNow! or SSE, AltiVec is an instruction set extension that allows for short vector processing. While the integer units can only handle 32 bits of information at a time, and the floating point unit 64, the AltiVec unit can perform operations on 128 bits of data at a time. This data can be broken up in three ways: as 16 8-bit numbers, 8 16-bit numbers, or 4 32-bit numbers (integer or single-precision floating point). The G4 also contains a 128-bit internal memory path. There are 162 AltiVec instructions, many more than any x86 SIMD architecture offers. The instructions range from basic vector arithmetic (adding/subtracting, dot products, cross products, scalar multiplication) to complicated linear algebra functions like permutations.
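To make the three-way split concrete, here is a minimal sketch in plain Python (not AltiVec code) that treats 16 bytes as one 128-bit register and reads it back as 16, 8, or 4 lanes. The names `register` and `add_u8_lanes` are made up for illustration, and the big-endian byte order is an assumption chosen to match the PowerPC.

```python
import struct

# 128 bits = 16 bytes. The same register contents, viewed three ways.
register = bytes(range(16))

as_bytes = struct.unpack(">16B", register)  # 16 lanes of 8 bits each
as_halves = struct.unpack(">8H", register)  # 8 lanes of 16 bits each
as_words = struct.unpack(">4I", register)   # 4 lanes of 32 bits each


def add_u8_lanes(a, b):
    """Model a lane-wise add on 16 unsigned 8-bit lanes.

    Each lane wraps around modulo 256 on its own; no carry spills
    into the neighbouring lane. (Hypothetical helper, for illustration.)
    """
    return bytes((x + y) & 0xFF for x, y in zip(a, b))


print(len(as_bytes), len(as_halves), len(as_words))
print(list(add_u8_lanes(bytes([250] * 16), bytes([10] * 16))))
```

The Python loop only imitates the result. The point of a SIMD unit is that all 16 of those byte additions happen in a single instruction instead of 16 separate ones.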
On a side note: The irony of Apple adding SIMD instructions should be noted by PC and Mac fans alike. Several years ago, when Intel added MMX to its Pentiums, Mac zealots worldwide started screaming about the complexity MMX represented and how it outlined the differing design philosophies of CISC and RISC. Now, several years later, Apple is touting the much more complex "Velocity Engine" as the biggest thing since the transistor, and Mac fans on the net are gobbling it up with little sign of ironic acknowledgement. Also, Intel is now making chips that internally run RISC instructions, and the Merced is a huge jump in that direction. It goes to show you: some things never change, and sometimes compromise is the best engineering decision.
While a physicist looking to accelerate his large matrix multiplies (something common in quantum mechanics) may be jumping with joy at this point, most FiringSquad readers are probably asking what short vector algebra has to do with Quake 3. The answer is: everything. If you ever have the opportunity to take a university class in computer graphics, the professor will let you in on a little secret. That secret is that just about everything in 3D graphics can be done with a matrix. Not only is it possible to do it with a matrix, that's usually the easiest way to do it. Rotating an object in the world co-ordinate system? Use a matrix! Clipping polygons to the view frustum (that's the 3D volume that is visible to the camera)? Try a matrix. Simulating a pinhole camera to turn 3D into 2D? You guessed it: it's matrix time! Filtering an image to blend texture maps? Hmm, take a wild guess.
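As a rough illustration of the "use a matrix" idea, the sketch below rotates a point a quarter turn around the z axis with a 4x4 matrix. It is plain Python for readability, not how a game engine would write it, and the function names `rotation_z` and `transform` are hypothetical.

```python
import math


def rotation_z(angle):
    """Build a 4x4 matrix (list of rows) that rotates around the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        [c, -s, 0.0, 0.0],
        [s, c, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform(matrix, point):
    """Multiply a 4x4 matrix by a 4-component point (x, y, z, w)."""
    return [sum(m * p for m, p in zip(row, point)) for row in matrix]


# The point (1, 0, 0) turned 90 degrees around z lands on (0, 1, 0),
# give or take floating-point rounding in the x component.
rotated = transform(rotation_z(math.pi / 2), [1.0, 0.0, 0.0, 1.0])
print(rotated)
```

Notice that every row of the result is a four-element multiply followed by a sum. That shape is exactly what a 4-wide vector unit is built to chew through.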
MMX? No thank you!
Since matrices are made of vectors, almost every matrix function can be converted into a vector function. It is these operations that AltiVec, as well as other SIMD solutions, aim to accelerate. If properly utilized, AltiVec can quadruple the speed of certain applications. Some single AltiVec instructions can do in three cycles what would take 20 floating-point cycles. Of course, software must be specifically written to take advantage of this speedup. This is one area where Motorola has learned from the mistakes of others. The adoption of MMX was severely hampered by several factors. First, the MMX engine was tied into the floating-point unit, making it impossible to do floating point and MMX calculations at once.
Intel fixed this problem with the Pentium III's SSE unit. Second, Intel did not immediately release programming tools that could take advantage of MMX. Any developer who wanted to take advantage of MMX would have to write the code in assembly, a task that makes any programmer cringe (especially on the complicated x86 architecture). Motorola has been supplying C/C++ libraries of vector functions on its website for months. These functions have already been tuned to use AltiVec, and some modifications don't require anything more than renaming some functions and pushing the recompile button. As a result, AltiVec has many more early adopters than MMX did.
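For the curious, here is a minimal sketch of how a matrix function turns into a handful of vector operations. It models a 4-lane multiply-add in plain Python and uses it to multiply a 4x4 matrix by a vector. The helpers `lanewise_madd`, `splat`, and `mat_vec_simd_style` are invented for this example and are not part of Motorola's libraries; the column-by-column layout is one common way to arrange the work, assumed here for illustration.

```python
def lanewise_madd(a, b, c):
    """Model one 4-lane multiply-add: a*b + c computed in every lane."""
    return [x * y + z for x, y, z in zip(a, b, c)]


def splat(value):
    """Copy one scalar into all four lanes."""
    return [value] * 4


def mat_vec_simd_style(columns, v):
    """Multiply a 4x4 matrix, stored as four columns, by the vector v.

    The result is column0*v[0] + column1*v[1] + column2*v[2] + column3*v[3],
    built up with four lane-wise multiply-adds.
    """
    acc = [0.0] * 4
    for column, component in zip(columns, v):
        acc = lanewise_madd(column, splat(component), acc)
    return acc


# A matrix that translates by (5, 6, 7), written column by column.
translate = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [5.0, 6.0, 7.0, 1.0],
]
print(mat_vec_simd_style(translate, [1.0, 2.0, 3.0, 1.0]))
```

Done one number at a time, the same product takes 16 multiplies and 12 adds. Written this way it is four multiply-add steps, each covering four lanes at once, which is where the kind of speedup described above comes from.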