Equipment in the FA lab
Alongside its state-of-the-art automated testing equipment, the failure analysis lab also houses a full-scale chemistry lab where NVIDIA can do a composition analysis. This part of the lab isn’t a relic; it’s a fully functional lab, and they even test their emergency eyewash station on a weekly basis like they are supposed to. They also have some precision sanding wheels that let them remove silicon one layer at a time.
Although the Agilent test devices do a great job with analysis, Howard’s team also has several other tools. One of those is a QFI InfraScope, which helps NVIDIA isolate more complex failures, an increasingly common need as ICs themselves grow more complex. Sometimes, NVIDIA has to test their chips in an actual functioning environment. To do that, NVIDIA runs their chips in a production system. Since they need to run the GPUs without a heatsink, NVIDIA relies on four Peltier elements to keep things cool.
At this point, they can use the InfraScope to take a thermal image of the chip to identify problem spots. They can also do a dynamic analysis by using a laser to heat up individual groups of transistors to help isolate the defective part.
Recently, NVIDIA added an Advantest T2000 to their arsenal. It was originally purchased to improve validation of the RSX chip inside the PlayStation 3 (Sony fabs use this instead of Agilent’s product). Howard was so impressed by the product that he’s starting to use it for other chips as well.
As I’m sure you’ve noticed, we’ve taken pictures of several of NVIDIA’s older chips. That’s because NVIDIA receives older chips that have failed in the field for analysis. By understanding the mechanism of failure, NVIDIA can improve the design of newer ASICs. For that reason, NVIDIA’s Silicon Failure Analysis lab keeps a library of every chip that has been manufactured, along with boards for them. These boards map every pin of the IC to a conductor on the board itself, which lets NVIDIA interface their chips with the testing equipment.
We could go on and on about all the cool toys they have in the lab, such as the tools they use to evaluate ESD durability (through better chip design, NVIDIA’s modern GPUs are more resistant to static electricity than their earlier GPUs) or the inverted infrared microscopes that allow them to see through flip chips, but it’s time to move on to the next part of the facility.