Introduction
The graphics chip inside Microsoft’s Xbox 360 breaks new ground in several ways. The console sports a unified shader architecture, 10MB of embedded memory, and what ATI calls “48 perfectly efficient shaders” among its long list of features. But we still had tons of questions.
To learn more about the Xbox 360’s graphics architecture, we recently spoke with Bob Feldstein, ATI’s VP of Engineering on Xbox 360:
ATI: We have 48 shaders. And each shader, every cycle, can do 4 floating-point operations, so that gives you 196 [sic; 48 × 4 is 192]. There’s a 192 number in there too, so I’m just going to digress a little bit. The 192 is actually in our intelligent memory: every cycle we have 192 processors in our embedded intelligent memory that do things like z, alpha, stencil. So there are two different numbers and they’re kind of close to each other, which leads to some confusion.
![Xbox 360 VPU block diagram](images/01-s.png) Xbox 360 VPU block diagram
So we have a traditional shader, but it’s not traditional at all though because it’s a unified shader. So you have the shader instruction set. [pauses] In the past you had a vertex shader and a pixel shader, and the instruction set was different and you couldn’t, you know, one couldn’t operate on the other’s data. Now we have one set of resources, these 48 shaders, and they naturally dynamically balance between whatever the problem at hand is.
So if it’s dominated by vertices you get more resources for vertices, but if it’s dominated by pixels you get more resources towards pixels, or any other kind of problem. It’s a general purpose, well not a general purpose processor, but it is a processor with a good general instruction set and it can operate on a variety of different kinds of data. So unified shader means we have one set of shader hardware and it can operate on any problem.
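The balancing argument can be illustrated with a toy model. This is only a sketch under stated assumptions, not a description of how the hardware actually allocates work: the 16/32 split for the "traditional" design, the workload sizes, and the function names `cycles_fixed_split` and `cycles_unified` are all hypothetical. Only the total of 48 shaders comes from the interview.

```python
import math

TOTAL_UNITS = 48  # shader count quoted in the interview

def cycles_fixed_split(vertex_work, pixel_work, vertex_units, pixel_units):
    """Cycles needed when each pool can only process its own kind of work."""
    vertex_cycles = math.ceil(vertex_work / vertex_units)
    pixel_cycles = math.ceil(pixel_work / pixel_units)
    # Both pools run in parallel; the job finishes when the slower pool does.
    return max(vertex_cycles, pixel_cycles)

def cycles_unified(vertex_work, pixel_work, total_units=TOTAL_UNITS):
    """Cycles needed when any unit can take either kind of work."""
    return math.ceil((vertex_work + pixel_work) / total_units)

if __name__ == "__main__":
    # Hypothetical workloads, measured in shader-cycles of work.
    workloads = {
        "pixel-heavy": (480, 9120),
        "vertex-heavy": (7200, 2400),
        "matches the split": (3200, 6400),
    }
    for label, (vertex_work, pixel_work) in workloads.items():
        # Hypothetical traditional design: 16 vertex units, 32 pixel units.
        fixed = cycles_fixed_split(vertex_work, pixel_work, 16, 32)
        unified = cycles_unified(vertex_work, pixel_work)
        print(f"{label}: fixed split {fixed} cycles, unified {unified} cycles")
```

In this model the fixed split only keeps up with the unified pool when the workload happens to match its 16/32 ratio. For the vertex-heavy case the pixel units sit idle while the vertex units grind through 450 cycles, where the unified pool finishes the same total work in 200.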
So you have 64 threads, and it’s all controlled by hardware so it’s not like the programmer knows one way or another about threading at all, and the threads here are things like vertex buffers or pixel programs and the hardware just keeps the same [inaudible] in a thread buffer and we can just switch back and forth between the different threads. That way if we’re waiting for data from a vertex program or vertex array we can go ahead and work on a pixel program or we can work on a second vertex or whatever, a different instruction.
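The idea of switching threads while one waits for data can be sketched with a small simulation. It is a simplified illustration, not the chip's actual scheduler: the `Thread` class, the `run` function, the 4-cycle `FETCH_LATENCY`, the three-instruction programs, and the "first ready thread wins" policy are all assumptions made for the example.

```python
from dataclasses import dataclass

FETCH_LATENCY = 4  # hypothetical cycles a thread waits after issuing a fetch

@dataclass
class Thread:
    name: str
    program: list      # sequence of "alu" or "fetch" instructions
    pc: int = 0        # index of the next instruction to issue
    ready_at: int = 0  # first cycle at which this thread may issue again

    def done(self):
        return self.pc >= len(self.program)

def run(threads, fetch_latency=FETCH_LATENCY):
    """Issue at most one instruction per cycle from any ready thread.

    Returns (total_cycles, idle_cycles).
    """
    cycle = 0
    idle = 0
    while not all(t.done() for t in threads):
        ready = [t for t in threads if not t.done() and t.ready_at <= cycle]
        if ready:
            thread = ready[0]  # simplest possible policy: first ready thread
            op = thread.program[thread.pc]
            thread.pc += 1
            # A fetch stalls only the issuing thread; other threads may run.
            stall = fetch_latency if op == "fetch" else 0
            thread.ready_at = cycle + 1 + stall
        else:
            idle += 1  # every unfinished thread is waiting on data
        cycle += 1
    return cycle, idle

def make_threads(count):
    """Build identical hypothetical threads: one fetch, then two ALU ops."""
    return [Thread(f"t{i}", ["fetch", "alu", "alu"]) for i in range(count)]

if __name__ == "__main__":
    for count in (1, 2, 4, 5):
        total, idle = run(make_threads(count))
        print(f"{count} thread(s): {total} cycles, {idle} idle")
```

With a single thread the unit idles for the whole fetch latency. As more threads are added, their instructions fill those gaps, and with five threads in this model there are no idle cycles at all. A larger pool of threads, such as the 64 mentioned above, gives the hardware more candidates to switch to when latencies are longer.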