Solving the small batch problem
Earlier this year, ATI gave a very interesting presentation on the state of gaming today and where DirectX 10 could take it. The presentation covered the issues and bottlenecks in DirectX 9 and explained how changes in DirectX 10 would help reduce them. This is a good time to elaborate on some of ATI’s points, as several of them were discussed earlier in this article and can help explain the future direction of gaming in DX10.
The first topic in the presentation is what DirectX 9 has to offer. Detailed characters, complex materials, and lighting effects such as HDR lighting are among the things ATI mentions. DirectX 9 games make great use of these, which is why games today look as good as they do. ATI goes on to say that game developers are doing a great job of bringing a false reality closer to life than ever before. But then the presentation gets to the issue at hand: DirectX 9’s overhead.
As it stands now, game developers are getting close to using DX9 to its fullest potential. Eventually, though, they will reach a point where they can do no more with DirectX 9’s feature set because of the bottlenecks and constraints they encounter in the API. The problem lies in the DX9 pipeline and how it functions.
In the DirectX 9 pipeline, the application feeds objects to the API. Here, an object can be anything in the scene; a character model is one example. (In fact, complex characters may be composed of many objects.)
In the current DX9 pipeline, each object passes from the application to the API, which in turn feeds it to the driver and ultimately to the graphics hardware. Each time an object is passed from the API to the driver, it introduces a bit of overhead. With a single scene requiring dozens of objects for the driver to handle, this overhead can drastically increase the time needed to process them, and longer execution time translates directly into lower performance. This per-object overhead is known as the small batch problem.
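To make the effect concrete, here is a minimal sketch of a toy cost model in which every object handed from the API to the driver costs a fixed amount of CPU time. The function names, the per-object overhead, and the object counts are all hypothetical values chosen for illustration; none of them come from ATI's presentation or from measurements.

```python
# Toy model of the small batch problem. All numbers are hypothetical.

def submit_time_ms(num_objects: int, per_object_overhead_ms: float) -> float:
    """CPU time spent passing objects from the API to the driver."""
    return num_objects * per_object_overhead_ms


def frame_cpu_time_ms(num_objects: int,
                      per_object_overhead_ms: float,
                      app_work_ms: float) -> float:
    """Total CPU time per frame: application work plus API/driver overhead."""
    return app_work_ms + submit_time_ms(num_objects, per_object_overhead_ms)


# Assumed values, for illustration only.
PER_OBJECT_OVERHEAD_MS = 0.02   # fixed cost each time an object is submitted
APP_WORK_MS = 5.0               # game logic, physics, and so on

# The same scene drawn as a few large objects or as many small ones.
few_large = frame_cpu_time_ms(100, PER_OBJECT_OVERHEAD_MS, APP_WORK_MS)
many_small = frame_cpu_time_ms(1000, PER_OBJECT_OVERHEAD_MS, APP_WORK_MS)

print(f"100 objects:  {few_large:.1f} ms of CPU time per frame")
print(f"1000 objects: {many_small:.1f} ms of CPU time per frame")
```

In this model the graphics hardware does the same amount of work in both cases. Only the number of submissions changes, yet the CPU cost per frame grows with every additional object.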
While DirectX 10 can’t remove this overhead entirely, it reduces it significantly thanks to new state objects. ATI’s slides indicate that significantly less execution time will be spent in the API and driver under DX10 (40% in DX9 vs 20% in DX10), which will allow developers to put more objects, materials, and other eye candy in their DX10 games.
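As a rough back-of-the-envelope reading of those figures, the sketch below assumes the 40% and 20% values are shares of the same fixed CPU time budget per frame. The slides as described here do not state that, so treat the result as an illustration rather than a measured gain. The function name is hypothetical.

```python
# Back-of-the-envelope arithmetic on the API+driver shares quoted above.
# Assumption: both percentages are shares of the same fixed CPU frame budget.

def app_share(api_driver_share: float) -> float:
    """Fraction of CPU time left for the application itself."""
    return 1.0 - api_driver_share


dx9_app = app_share(0.40)    # 40% API+driver under DX9
dx10_app = app_share(0.20)   # 20% API+driver under DX10

# Relative increase in CPU time available to the game under this assumption.
relative_gain = dx10_app / dx9_app

print(f"DX9 application share:  {dx9_app:.0%}")
print(f"DX10 application share: {dx10_app:.0%}")
print(f"Relative gain:          {relative_gain:.2f}x")
```

Under that assumption, the application would get roughly a third more CPU time per frame, which is where the headroom for extra objects and effects would come from.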