Deferred Rendering
The black sheep of the Z family
Deferred rendering is truly different from any traditional architecture. Of the technologies currently out there, it is likely the most radical. It is what PowerVR uses, along with others such as Gigapixel. ATI is believed to be working on a deferred rendering architecture as well, and has recently received several patents on it. The basic concept is to delay rendering each scene by one frame, which makes it possible to do additional work on the scene first, namely removing hidden surfaces.
The first stage of deferred rendering is known as “sorting and binning.” In this stage the hardware determines where geometry is located within the scene (think of it as a rough rendering of the scene) and writes that geometry to a scene buffer. The location of the geometry within the scene is stored in bins as pointers. The scene is broken into tiles of 32x16 or 32x32 pixels, and each tile's pointer list stores the location of the geometry for that tile.
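To make the binning idea concrete, here is a minimal sketch in Python. It assumes triangles are already in screen space and bins them by bounding box, which is only one simple way to do it; real hardware may use tighter tests. The function name `bin_triangles` and the data layout are hypothetical, chosen for illustration, and the triangle indices stand in for the pointers described above.

```python
import math
from collections import defaultdict

# One of the tile sizes mentioned above; 32x32 would work the same way.
TILE_W, TILE_H = 32, 16

def bin_triangles(triangles, screen_w, screen_h, tile_w=TILE_W, tile_h=TILE_H):
    """Map each tile (tx, ty) to the indices of triangles that may touch it.

    `triangles` is a list of three (x, y) screen-space vertices each.
    The stored indices play the role of the per-tile pointer list.
    """
    bins = defaultdict(list)
    max_tx = (screen_w - 1) // tile_w
    max_ty = (screen_h - 1) // tile_h
    for index, tri in enumerate(triangles):
        xs = [v[0] for v in tri]
        ys = [v[1] for v in tri]
        # Triangles entirely off screen never reach any bin.
        if max(xs) < 0 or max(ys) < 0 or min(xs) >= screen_w or min(ys) >= screen_h:
            continue
        # Conservative test: every tile overlapped by the bounding box.
        tx0 = max(0, math.floor(min(xs)) // tile_w)
        tx1 = min(max_tx, math.floor(max(xs)) // tile_w)
        ty0 = max(0, math.floor(min(ys)) // tile_h)
        ty1 = min(max_ty, math.floor(max(ys)) // tile_h)
        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                bins[(tx, ty)].append(index)
    return bins
```

A triangle that straddles a tile boundary simply appears in more than one list, which is why the lists hold references rather than copies of the geometry.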
After the binning stage, rendering takes place one tile at a time. The renderer pulls up the first pointer list, sees what geometry is needed, and places that geometry into the tile buffer. Pixel visibility is then determined. There are different ways to do this, depending on the architecture. Whatever the method, it makes it possible to determine which pixels are occluded and cull them from the scene. The visible pixels are then textured.
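The two-step nature of tile rendering (resolve visibility first, texture afterwards) can be sketched as follows. This is an illustration under simplifying assumptions, not any vendor's implementation: primitives are axis-aligned rectangles with a constant depth, smaller depth means closer, and `render_tile` and `shade` are hypothetical names.

```python
def render_tile(tile_origin, tile_size, primitives, shade):
    """Render one tile: resolve visibility first, then shade visible pixels only.

    `primitives` is a list of (x0, y0, x1, y1, z) rectangles in screen space,
    covering x0 <= x < x1 and y0 <= y < y1 at constant depth z.
    `shade(prim_id, x, y)` stands in for the expensive texturing step.
    Returns (color, shaded), where `shaded` counts calls to `shade`.
    """
    ox, oy = tile_origin
    tw, th = tile_size
    # Small per-tile buffers, standing in for on-chip storage.
    depth = [[float("inf")] * tw for _ in range(th)]
    owner = [[None] * tw for _ in range(th)]

    # Pass 1: hidden surface removal, no texturing yet.
    for prim_id, (x0, y0, x1, y1, z) in enumerate(primitives):
        for y in range(max(y0, oy), min(y1, oy + th)):
            for x in range(max(x0, ox), min(x1, ox + tw)):
                if z < depth[y - oy][x - ox]:
                    depth[y - oy][x - ox] = z
                    owner[y - oy][x - ox] = prim_id

    # Pass 2: texture each covered pixel exactly once.
    color = [[None] * tw for _ in range(th)]
    shaded = 0
    for y in range(th):
        for x in range(tw):
            if owner[y][x] is not None:
                color[y][x] = shade(owner[y][x], ox + x, oy + y)
                shaded += 1
    return color, shaded
```

With a far rectangle covering a whole 32x16 tile and a near one covering its left half, `shade` runs 512 times here, whereas shading every fragment as it arrives would run it 768 times.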
Of course, everyone’s got their own ideas
It is worth noting that PowerVR handles scene data somewhat differently. With PowerVR, a scene is converted into “Infinite Planes,” a different mathematical representation of the scene. To determine pixel visibility, PowerVR uses a process known as ray casting, which works by shooting a ray through each pixel. The ray determines which is the first visible opaque pixel, and any pixels beyond that point are culled. To keep performance up, the ray casting is done by a massively parallel system that shoots separate rays through many pixels at once. The end result is the same as doing it any other way; PowerVR just takes a slightly different approach.
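The details of PowerVR's plane representation are not given here, so the following is only a rough sketch of the general idea. It assumes each surface is described by a depth plane z = a*x + b*y + c plus a set of edge planes that bound it, and that the ray through a pixel simply picks the nearest surface it lands inside. All names (`cast_ray`, `plane_value`) are hypothetical.

```python
def plane_value(plane, x, y):
    """Evaluate a*x + b*y + c for a plane given as (a, b, c)."""
    a, b, c = plane
    return a * x + b * y + c

def cast_ray(surfaces, x, y):
    """Return the index of the nearest opaque surface at pixel (x, y), or None.

    Each surface is (depth_plane, edge_planes). The pixel is inside the
    surface when every edge plane evaluates to >= 0 there (an assumption
    of this sketch), and smaller depth means closer to the viewer.
    """
    nearest, nearest_z = None, float("inf")
    for index, (depth_plane, edge_planes) in enumerate(surfaces):
        if any(plane_value(edge, x, y) < 0 for edge in edge_planes):
            continue  # the ray misses this surface
        z = plane_value(depth_plane, x, y)
        if z < nearest_z:
            nearest, nearest_z = index, z
    return nearest
```

Each pixel's result depends only on its own coordinates, which is what makes it natural to evaluate many pixels at once in parallel hardware.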
The advantage is that only the visible pixels are rendered: all hidden pixels are removed before texturing. The next advantage is that there is no longer a need for multiple frame-buffer accesses. Instead, there is only a single write per tile. Of course, in most cases it is possible to do this on a traditional architecture as well. The result can be a considerable reduction in color buffer reads and writes. Since the Z-buffer is on-chip, using a 32-bit Z-buffer does not use additional bandwidth either. Another advantage is that multi-sampling anti-aliasing can be done nearly for free; the only difference is that it requires more, smaller tiles and thus more bins. There are other advantages as well, but these are some of the key ones.
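A back-of-the-envelope calculation gives a feel for the frame-buffer savings. The model below is deliberately crude and its numbers are illustrative assumptions, not measurements: it takes the worst case for an immediate mode renderer (every fragment passes the depth test, costing one Z read, one Z write and one color write to external memory), against one color write per pixel for the tiled case with Z kept on-chip. It ignores texture traffic and the cost of writing and reading the binned geometry.

```python
def immediate_mode_bytes(width, height, overdraw, color_bytes=4, z_bytes=4):
    """Worst-case external traffic: Z read + Z write + color write per fragment."""
    fragments = width * height * overdraw
    return fragments * (2 * z_bytes + color_bytes)

def tiled_deferred_bytes(width, height, color_bytes=4):
    """One color write per pixel; the Z-buffer never leaves the chip."""
    return width * height * color_bytes

# Assumed example: 640x480, 32-bit color and Z, average overdraw of 3.
imm = immediate_mode_bytes(640, 480, overdraw=3)
tiled = tiled_deferred_bytes(640, 480)
print(f"immediate: {imm} bytes/frame, tiled: {tiled} bytes/frame, ratio {imm / tiled:.1f}x")
```

Under these assumptions the ratio grows linearly with overdraw, which is why the benefit is largest in scenes with many overlapping surfaces.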
Deferred rendering is not perfect, though. PowerVR, for example, has proven to have a lot of hardware-related issues with certain applications. Their driver team has been pretty successful at addressing many of these issues, but they still crop up. Many are addressed by simply disabling certain functions of the deferred architecture, bringing it closer to an immediate mode renderer. This, however, reduces the benefits of the architecture too. There is also discussion that deferred rendering may in the future have issues with highly complex scenes, where large amounts of geometry must be binned. There are ways to address issues such as this, but the eventual outcome remains to be seen.