The graphics chip inside Microsoft's Xbox 360 breaks new ground in several ways. The console sports a unified shader architecture, 10MB of embedded memory, and what ATI calls "48 perfectly efficient shaders" among its long list of features. But we still had tons of questions.
To learn more about the graphics inside Xbox 360 and its architecture, we recently had the chance to speak with Bob Feldstein, ATI's VP of Engineering on Xbox 360:
We have 48 shaders. And each shader, every cycle can do 4 floating-point operations, so that gives you 196. There's a 192 number in there too, so I'm just going to digress a little bit. The 192 is actually in our intelligent memory: every cycle we have 192 processors in our embedded intelligent memory that do things like z, alpha, stencil. So there are two different numbers and they're kind of close to each other, which leads to some confusion.
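As a quick check on the figures quoted above, here is a minimal arithmetic sketch. The constant and function names are ours, chosen for illustration; the only inputs are the numbers given in the interview. Note that 48 shaders at 4 operations each works out to 192 per cycle rather than the 196 quoted, which happens to match the separately quoted count of 192 processors in the embedded memory.

```python
# Figures quoted in the interview; the names are illustrative only.
SHADER_UNITS = 48                 # unified shaders
FLOPS_PER_SHADER_PER_CYCLE = 4    # floating-point operations per shader per cycle
EMBEDDED_MEMORY_PROCESSORS = 192  # z/alpha/stencil processors in embedded memory


def shader_flops_per_cycle(units=SHADER_UNITS, flops=FLOPS_PER_SHADER_PER_CYCLE):
    """Return the total floating-point operations the shader array can issue per cycle."""
    return units * flops


if __name__ == "__main__":
    print("shader ops per cycle:", shader_flops_per_cycle())
    print("embedded memory processors:", EMBEDDED_MEMORY_PROCESSORS)
```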
Xbox 360 VPU block diagram
So we have a traditional shader, but it's not traditional at all though because it's a unified shader. So you have the shader instruction set. [pauses] In the past you had a vertex shader and a pixel shader, and the instruction set was different and you couldn't, you know, one couldn't operate on the other's data. Now we have one set of resources, these 48 shaders, and they naturally dynamically balance between whatever the problem at hand is.
So if it's dominated by vertices you get more resources for vertices, but if it's dominated by pixels you get more resources towards pixels, or any other kind of problem. It's a general purpose, well not a general purpose processor, but it is a processor with a good general instruction set and it can operate on a variety of different kinds of data. So unified shader means we have one set of shader hardware and it can operate on any problem.
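The balancing idea described above can be pictured with a toy model. This is a sketch under our own assumptions, not a description of how the hardware arbitrates: the function `balance_units` is hypothetical, and it simply splits a pool of 48 units across workload types in proportion to the amount of pending work.

```python
def balance_units(pending, total_units=48):
    """Split total_units across workload kinds in proportion to pending work.

    pending maps a workload name (e.g. "vertex", "pixel") to a count of
    outstanding work items. Leftover units from rounding down go to the
    kinds with the largest fractional share (largest-remainder method).
    """
    total = sum(pending.values())
    if total == 0:
        return {kind: 0 for kind in pending}

    exact = {kind: total_units * n / total for kind, n in pending.items()}
    alloc = {kind: int(share) for kind, share in exact.items()}

    leftover = total_units - sum(alloc.values())
    by_remainder = sorted(pending, key=lambda k: exact[k] - alloc[k], reverse=True)
    for kind in by_remainder[:leftover]:
        alloc[kind] += 1
    return alloc


if __name__ == "__main__":
    # A vertex-heavy frame followed by a pixel-heavy frame (made-up workloads).
    print(balance_units({"vertex": 300, "pixel": 100}))
    print(balance_units({"vertex": 1, "pixel": 6}))
```

With a fixed split between vertex and pixel hardware, one side would sit idle whenever the workload leaned the other way; in this model the same 48 units follow the workload.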
So you have 64 threads, and it's all controlled by hardware so it's not like the programmer knows one way or another about threading at all, and the threads here are things like vertex buffers or pixel programs and the hardware just keeps the same [inaudible] in a thread buffer and we can just switch back and forth between the different threads. That way if we're waiting for data from a vertex program or vertex array we can go ahead and work on a pixel program or we can work on a second vertex or whatever, a different instruction.
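The latency-hiding effect of switching between threads can be illustrated with a small simulation. This is only a sketch: the scheduling policy (pick the first ready thread), the stall lengths, and the name `simulate` are our assumptions, not details from the interview. Each thread is a list of instructions, and each entry is the number of cycles the thread must wait after issuing that instruction (0 for plain math, larger for a data fetch).

```python
def simulate(threads):
    """Run threads on one execution unit, switching whenever a thread stalls.

    threads maps a thread name to a list of stall lengths, one per instruction.
    Returns (total_cycles, idle_cycles).
    """
    next_instr = {name: 0 for name in threads}  # per-thread program counter
    ready_at = {name: 0 for name in threads}    # cycle at which a thread may issue again
    cycle = 0
    idle = 0

    while any(next_instr[n] < len(threads[n]) for n in threads):
        runnable = [
            n for n in threads
            if next_instr[n] < len(threads[n]) and ready_at[n] <= cycle
        ]
        if runnable:
            name = runnable[0]
            stall = threads[name][next_instr[name]]
            next_instr[name] += 1
            ready_at[name] = cycle + 1 + stall
        else:
            idle += 1  # every remaining thread is waiting on data
        cycle += 1

    return cycle, idle


if __name__ == "__main__":
    # One vertex thread alone: the unit idles while the fetch is outstanding.
    print(simulate({"vertex": [3, 0]}))
    # Add a pixel thread: its math fills the cycles the vertex fetch would waste.
    print(simulate({"vertex": [3, 0], "pixel": [0, 0, 0]}))
```

In this toy run the second case finishes both threads in the same number of cycles the vertex thread needed on its own, because the pixel work fills the gap left by the fetch.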