Summary: By now we've all heard of the various anti-aliasing techniques employed by the latest video cards. Multi-sampling is currently the implementation of choice for ATI and NVIDIA, but what about fragment level anti-aliasing such as that used by the Matrox Parhelia? In this article, Dave explores the advantages and pitfalls of Fragment AA. See how it works in today's article!
It is often true that many paths can reach a destination. This is true in life, whether the goal is financial success or success with members of the opposite sex: different people apply different approaches, and all are potentially successful. But this isn’t a site about dating and finances, now is it? We are here to discuss hardware and games, so how does this relate? A given result in chip design can be achieved by a variety of means. Pixel shaders, for example, can be approached in many different ways. Not only that, but each method can be configured in very specific ways for die size considerations and operation optimization. However, in this discussion we will not be looking at pixel shader architecture. Rather, today we will discuss the use of coverage masks in fragment level anti-aliasing algorithms.
How does taking different roads relate to anti-aliasing? Just as with the pixel shader example, anti-aliasing can be achieved with a variety of techniques. With each method the end result is effectively the same, yet each carries its own set of advantages and disadvantages. While this article focuses on fragment level anti-aliasing algorithms, it is important for us to understand multi-sampling implementations as well. With such systems becoming standard on all NVIDIA hardware, and ATI now using them in their high-end RADEON 9500 and RADEON 9700 boards, multi-sampling is quickly becoming the anti-aliasing technique of choice in the graphics industry. Understanding this approach will allow comparisons with fragment level approaches, and thus a judgment of which implementation is most desirable.
Multi-Sampling Explained
Multi-Sampling algorithms share many characteristics of super-sampling. Both render a scene to a high-resolution buffer and filter down to achieve the anti-aliasing effect. Yet, that is where the similarities end.
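As a rough illustration of the "filter down" step that both techniques share, the sketch below averages each 2x2 block of a high-resolution buffer into one output pixel. The function name `box_downsample`, the grayscale list-of-lists buffer, and the plain box filter are assumptions made for illustration; actual hardware may use different sample patterns and filters.

```python
def box_downsample(hi_res, factor=2):
    """Average each factor x factor block of samples into one output pixel.

    hi_res is a list of rows of grayscale values; its width and height are
    assumed to be multiples of factor.
    """
    out_h = len(hi_res) // factor
    out_w = len(hi_res[0]) // factor
    out = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            total = 0.0
            # Sum every sample that belongs to this output pixel.
            for dy in range(factor):
                for dx in range(factor):
                    total += hi_res[y * factor + dy][x * factor + dx]
            row.append(total / (factor * factor))
        out.append(row)
    return out


# A hard vertical edge that runs through the middle of the right-hand pixels.
hi_res = [
    [0, 0, 0, 255],
    [0, 0, 0, 255],
    [0, 0, 0, 255],
    [0, 0, 0, 255],
]
low_res = box_downsample(hi_res)
print(low_res)  # [[0.0, 127.5], [0.0, 127.5]]
```

The right-hand output pixels come out at half intensity because half of their samples fall on each side of the edge, which is the softened edge that anti-aliasing is after.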
Displaying the final image
The first step in multi-sampling is to treat the scene as double in size both horizontally and vertically (thus, the picture is four times larger in total). A triangle edge mask is used to locate those pixels that fall along each triangle edge. Edge pixels take a unique color sample for every sub-pixel, writing four separate color and Z values to their respective buffers. Pixels not located on a triangle edge, on the other hand, share the same original color value across all sub-pixels, writing four identical color values to the color buffer. Every depth (Z) value is calculated independently.
Fragment Level Algorithms
Fragment level algorithms do not work on a sub-pixel level, but rather on a fragment level. A sub-pixel is effectively an entire pixel of its own, whereas a fragment is simply a segment of a complete pixel. A sub-pixel will store a full color and Z value, where a fragment will only store information regarding a segment of a complete pixel.
When rendering a scene with fragment level anti-aliasing (such as that used by the Matrox Parhelia-512), one basically performs all operations normally. The scene is rendered and all is well. The variation is that an additional stage is added to the pixel pipeline. This stage lays a coverage mask over each pixel, with each section of the mask being a fragment. When the mask is laid over an edge pixel, each fragment is tested to see whether it falls within the triangle in question. If the fragment is found inside the triangle, it is assigned a value of one; if the fragment is outside of the edge, the assigned value is zero.
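To make the coverage mask concrete, here is a minimal sketch of how such a mask could be computed for one pixel. It assumes a 4x4 grid of fragments (16 in total), samples each fragment at its center, and uses a standard edge-function test to decide inside versus outside. These choices, and the names `edge`, `inside`, and `coverage_mask`, are ours for illustration; Matrox has not published how the Parhelia evaluates its mask.

```python
def edge(ax, ay, bx, by, px, py):
    """Signed area test: positive when (px, py) lies to the left of edge a->b."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def inside(tri, px, py):
    """True when the point is inside (or on the boundary of) the triangle."""
    (ax, ay), (bx, by), (cx, cy) = tri
    w0 = edge(ax, ay, bx, by, px, py)
    w1 = edge(bx, by, cx, cy, px, py)
    w2 = edge(cx, cy, ax, ay, px, py)
    # Accept either winding order.
    return (w0 >= 0 and w1 >= 0 and w2 >= 0) or (w0 <= 0 and w1 <= 0 and w2 <= 0)


def coverage_mask(tri, pixel_x, pixel_y, grid=4):
    """Return a grid*grid bit mask; bit i is 1 when fragment i is inside tri."""
    mask = 0
    for row in range(grid):
        for col in range(grid):
            # Sample at the center of each fragment cell (an assumption).
            px = pixel_x + (col + 0.5) / grid
            py = pixel_y + (row + 0.5) / grid
            if inside(tri, px, py):
                mask |= 1 << (row * grid + col)
    return mask


# A triangle whose right-hand edge cuts pixel (0, 0) vertically at x = 0.5.
tri = [(0.5, -5.0), (0.5, 5.0), (-5.0, 0.0)]
mask = coverage_mask(tri, 0, 0)
covered = bin(mask).count("1") / 16   # fraction of the pixel inside the triangle
print(hex(mask), covered)             # 0x3333 0.5
```

In each row of the mask the two left-hand fragments are set and the two right-hand fragments are clear, so the pixel is reported as half covered.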
With the fragment level data having been determined for a pixel, this data is carried along the pipeline as the pixel is rendered. When the pixel is completed, it is written as a normal pixel, without any anti-aliasing. The fragment data is written to a fragment buffer and stored for later use. Filtering must take place after completion of the scene, and this is where the fragment data comes into play. With numerous filter types existing, the exact method used is entirely up to the engineers designing the system. Any filter shape can be used, such as a box, X, or crossed shape, or even the ever popular quincunx pattern. The filtering process works by noting what percentage of the pixel is within the triangle and what percentage is outside of it. Which side(s) of the pixel fall inside and outside the triangle must be considered as well in order to filter properly. With the filtering information available, the pixel in question is blended with neighboring pixels. While this is technically blurring, and thus not ideal, one might consider it a “smart blur”: it operates only along edges, so the rest of the image is not distorted, and it blends by a calculated amount with specifically chosen neighboring pixels. We will discuss the level of quality actually delivered later in this article.
Specific Techniques
There are a variety of ways to implement a fragment level anti-aliasing algorithm. For example, Matrox has its proprietary implementation (which it simply calls FAA), and Bitboys has its MatrixAA. Other implementations exist, though none have seen the light of day in hardware. With that said, we will discuss a couple of implementations so as to better understand how the different algorithms function.
Our first implementation operates by buffering on a scanline basis. It is likely that Matrox does something similar, since it provides a 128-pixel buffer on its chip. As each scanline is rendered one at a time, the edge pixels are located and written to an on-chip buffer, thus creating a fragment list. This list stores the location of the edge pixels, as well as additional Z, color, and alpha data. One can only speculate what Matrox does with the additional data stored within the fragment buffer, as the company will not reveal it for patent reasons. The additional Z data is likely for the portion of the pixel that falls outside the original triangle. When calculating the percentage of a pixel within a triangle, it is logical that the section within the triangle will have a different depth value than the section outside of it. Without this additional data, the edge could mistakenly read the neighboring triangle, yielding false coverage information and producing artifacts. The reason for storing the additional color and alpha data is less well understood. Perhaps it is there to provide additional blending information, or perhaps the anti-aliasing filter is unable to access the depth and color buffers, requiring all edge data (color, alpha, Z) to be stored in the fragment buffer. Whatever the case, each edge pixel’s data is stored within the fragment buffer.
Potential Setbacks
Now this presents an interesting situation. First and foremost, the question of buffer size becomes an issue. What happens if a certain scanline has more than 128 edge pixels in it? Perhaps Matrox has implemented the ability to write out a portion of the scanline, move it to external memory, flush the cache, and then resume the scanline.
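The spill-and-resume idea speculated about above can be sketched as follows. This is purely hypothetical: the `EdgeBuffer` class, its fields, and the flush policy are our own illustration of a fixed-capacity buffer that is written out to external memory when it fills, not a description of what the Parhelia actually does.

```python
class EdgeBuffer:
    """Fixed-capacity on-chip list of edge pixels for one scanline (hypothetical)."""

    def __init__(self, capacity=128):
        self.capacity = capacity
        self.entries = []    # edge pixels currently held on chip
        self.external = []   # stands in for external memory
        self.flushes = 0     # number of times the buffer was written out

    def add(self, x, mask):
        # When the buffer is full, spill it before accepting the new entry.
        if len(self.entries) == self.capacity:
            self.flush()
        self.entries.append((x, mask))

    def flush(self):
        # Move everything held on chip to external memory and start over.
        if self.entries:
            self.external.extend(self.entries)
            self.entries = []
            self.flushes += 1


buf = EdgeBuffer(capacity=128)
for x in range(300):       # a scanline with 300 edge pixels
    buf.add(x, 0x3333)     # arbitrary 16-bit coverage mask
buf.flush()                # end of scanline: write out what is left
print(buf.flushes, len(buf.external))   # 3 300
```

Every flush in the middle of a scanline is a trip to external memory, which is exactly the kind of stall that makes buffer size a concern.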
Alternatively, one might store coverage data for the entire scene in local memory. Such a method would place a coverage mask over every available pixel, storing the data for each pixel even if no edge exists. This likely increases buffer storage requirements compared with Matrox’s approach of storing only edge pixels, as every pixel requires a stored value. This alternate implementation offers a couple of key advantages. First and foremost, every pixel has coverage data available, thus solving the issue of certain edges remaining aliased. This is done by reading back the pixel value and recalculating the edge. Additionally, there is no concern about buffer overflows, which would result in either additional aliased pixels or pipeline stalls.
Performance
Fragment level algorithms hold a key performance advantage over multi-sampling and super-sampling algorithms. Four-sample super-sampling renders four times as many pixels, which cuts effective fill-rate by roughly 75% and raises bandwidth demands to match, and multi-sampling requires nearly the same level of bandwidth. Fragment level algorithms, by contrast, require no additional fill-rate, nor do they consume such relatively high levels of bandwidth.
Quality
Examining the quality of a fragment level algorithm is rather tricky. Super-sampling’s quality increase is very linear in that if you set an anti-aliasing level you know exactly what to expect. On the other hand, fragment level algorithms can provide a variety of different results.
The last obstacle of fragment level algorithms is that of blurring. While generally not a problem, it can become one when triangles near the size of a pixel are used. For example, if a group of 1-2 pixel triangles are joined together and the filter is applied over all of them, texture detail will be lost due to each pixel being an edge and thus requiring filtering.
SIDEBAR: Tuan took some screenshots of 16x Fragment AA in his review, but to be honest the best way to judge AA quality really is to sit down and look at it. Also keep in mind that it can be subjective.
Final Thoughts
The two approaches being used at present are multi-sampling and fragment level algorithms. Certain low-end parts continue to provide super-sampling exclusively, as it requires no additional hardware, but with the next generation of graphics processors we can expect it to be phased out, with all parts adopting multi-sampling or fragment level algorithms.
© Copyright 2003 FS Media, Inc.