FiringSquad: Home of the Hardcore Gamer - Games, Hardware, Reviews and News

  
NVIDIA 8800GTX EXPOSED AND EXPLOITED
by: indigo196 (258) | Posted in cluster Round 3 Editors Challenge Sponsored by Intel
Subject: http://www.gpgpu.org
Posted 74 months ago (edited 74 months ago) in category DEFAULT
Average rating: 86% (64 users)

» MEDIA (9)
GPU vs CPU Floating Point Operations
GPU vs CPU
BLAS SGEMM
2D Complex FFT
European Option Pricing Black-Scholes
7900 GPU Diagram
8800 GPU Diagram
thread processor
CUDA software stack

Introduction
The G80 series of cards produced by NVIDIA provides raw processing power that was previously available only in server clusters or mainframe computers. This power is most often used to produce visually stunning and detailed environments for games, but recent work by companies such as RapidMind, PeakStream and Havok shows that these GPUs can be used for a wide variety of math-intensive computing. I appreciate gorgeous graphics as much as the next gamer, but I would love to see advances made in AI, environmental physics and other elements that improve the immersive quality of the games I play.

General-Purpose computation on GPUs (GPGPU) has recently come to the forefront of technical news, despite getting its start back in the late 1970s.[1] In fact, some experts have labeled GPGPU as one of the “5 Disruptive Technologies To Watch in 2007”.[2] Companies such as PeakStream, Acceleware and RapidMind have achieved astonishing results on GeForce 7900 series cards, the predecessors to the G80 series, with implementations running up to 120x faster than CPU code. Havok announced a partnership with NVIDIA in early 2006 to produce Havok FX, which leverages Shader Model 3.0 class GPUs to simulate collisions of thousands of objects in real time on the GPU instead of the CPU.

To demonstrate the potential that the 8800GTX holds to improve games, I have included some detailed information on the GPGPU achievements that were made using the 7900 series of cards. These achievements are astounding in their own right, but when you compare the architecture of the 7900GTX to that of the 8800GTX you may find it hard to contain your enthusiasm. The fact that these examples are all non-gaming applications should make even the most dedicated gamer proud that their hobby may assist man in solving medical mysteries.

Example: Acceleware
Acceleware was established in 2004 and provides solutions that leverage the power of GPUs to speed up compute-intensive work. Its intended markets are cell phone manufacturing, energy, seismic, biomedical, fluid dynamics, pharmaceutical, industrial and military companies. It created a solution for Boston Scientific that sped up the company's simulations by a factor of 25 compared to CPU-based simulations.[3] These simulations allow Boston Scientific “to investigate the influence and mutual dependency of several design variables”.[3] The result is better MRI devices, which in turn improve the ability of doctors to diagnose patients.

Example: RapidMind
RapidMind is a company based in Waterloo, Canada, that is built on over five years of advanced research and development. The company was formed in 2004 to commercialize Sh, a research project started at the University of Waterloo. Sh is a library that acts as a language embedded in C++ and allows programmers to use GPUs for general purpose computations. RapidMind has taken the knowledge gained from the development of Sh and created the RapidMind Development Platform, which aims to make parallel programming as easy as single-threaded, single-core programming. To show the strength of its solution, RapidMind produced three benchmarks: a BLAS SGEMM routine, a 2D complex-to-complex FFT routine and a quasi-Monte Carlo evaluation of the Black-Scholes option pricing model. These benchmarks were run on a 7900 GT based GPU and on high-end workstation or server-class CPUs. The most impressive result was obtained in the European Option Pricing benchmark, which showed the RapidMind GPU implementation to be 120x faster than the original CPU code.[4] RapidMind itself claims that “RapidMind–enabled applications have achieved performance increases of 3x to 30x”.[5]
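To give a feel for why the option pricing benchmark suits a GPU so well, here is a minimal sketch of a quasi-Monte Carlo evaluation of a European call under Black-Scholes, written in plain Python. This is not RapidMind's benchmark code; the function names, the base-2 van der Corput sequence and the sample parameters are all my own assumptions for illustration. The point to notice is that every sample in the loop is independent of every other, which is exactly the kind of work that can be spread across many ALUs.

```python
import math
from statistics import NormalDist


def van_der_corput(i, base=2):
    """Return the i-th point (i >= 1) of the van der Corput low-discrepancy sequence."""
    q, denom = 0.0, 1.0
    while i > 0:
        i, digit = divmod(i, base)
        denom *= base
        q += digit / denom
    return q


def black_scholes_call(s0, k, r, sigma, t):
    """Closed-form Black-Scholes price of a European call, used here as a reference."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    cdf = NormalDist().cdf
    return s0 * cdf(d1) - k * math.exp(-r * t) * cdf(d2)


def qmc_call(s0, k, r, sigma, t, n_samples=4095):
    """Estimate the same price by averaging discounted payoffs over quasi-random samples."""
    inv_cdf = NormalDist().inv_cdf
    drift = (r - 0.5 * sigma ** 2) * t
    vol = sigma * math.sqrt(t)
    total = 0.0
    # Each iteration is independent: on a GPU, each one could be its own thread.
    for i in range(1, n_samples + 1):
        z = inv_cdf(van_der_corput(i))        # quasi-random standard normal draw
        s_t = s0 * math.exp(drift + vol * z)  # terminal asset price
        total += max(s_t - k, 0.0)            # call payoff
    return math.exp(-r * t) * total / n_samples


if __name__ == "__main__":
    # Hypothetical sample inputs: spot 100, strike 100, 5% rate, 20% volatility, 1 year.
    print("closed form:", black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0))
    print("quasi-MC   :", qmc_call(100.0, 100.0, 0.05, 0.2, 1.0))
```

The two printed prices should land close to each other. A real benchmark would use far more samples and a better sequence, but the shape of the work is the same.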

Example: Havok FX
Havok was founded in 1998 in Dublin, Ireland, and provides software and services for digital media creators in both the games and movie industries. At GDC06 Havok FX was announced jointly by Havok and NVIDIA. Havok FX is an add-on that allows programmers to leverage the power of GPUs supporting Shader Model 3.0 to produce stunning effects that behave correctly. At GDC06 NVIDIA claimed that “Havok FX running on a pair of GeForce 7900GTX graphics cards in SLI is more than ten times faster than software physics calculations running on a Pentium Extreme Edition 955”.[6] Havok FX was released in Q2 of 2006. The list of titles that use Havok software includes The Elder Scrolls IV: Oblivion, F.E.A.R. and Age of Empires III.

The G80 in perspective
All of the above examples were based on GPGPU running on GeForce 7900 graphics cards and the results are nothing short of astounding. GPGPU computation makes use of the ALUs in the GPU. The 7900 GT cards had 96 ALUs clocked at 450MHz [7900 GPU Diagram] while the 8800GTX has 128 ALUs clocked at 1.35GHz [thread processor]. Let that sink in slowly: 1.3x the number of ALUs, each running at 3x the speed. The GeForce 8800GTX actually divides those 128 processors into 16 multiprocessors [8800 GPU Diagram]. The 8800GTS has 96 ALUs clocked at 1.2GHz, grouped into 12 multiprocessors.
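The arithmetic behind that comparison can be checked in a few lines of Python. This is only a back-of-the-envelope sketch: the function name is made up, and "ALU-cycles per second" is a crude measure I am assuming for illustration. It ignores how much work each ALU does per clock, memory bandwidth and everything else that matters in a real benchmark.

```python
def peak_ratio(alus_new, clock_new_mhz, alus_old, clock_old_mhz):
    """Return (ALU count ratio, clock ratio, combined ALU-cycles-per-second ratio)."""
    alu_ratio = alus_new / alus_old
    clock_ratio = clock_new_mhz / clock_old_mhz
    return alu_ratio, clock_ratio, alu_ratio * clock_ratio


# Figures quoted above: 7900 GT = 96 ALUs at 450MHz, 8800GTX = 128 ALUs at 1350MHz.
alu_ratio, clock_ratio, combined = peak_ratio(128, 1350, 96, 450)
print(f"{alu_ratio:.2f}x ALUs, {clock_ratio:.1f}x clock, {combined:.1f}x ALU-cycles per second")

# Both G80 cards quoted above work out to the same number of ALUs per multiprocessor.
print(128 // 16, 96 // 12)
```

On paper, then, the 8800GTX has roughly four times the raw ALU throughput of the card the earlier results were achieved on.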

I found benchmarks published by Mike Houston of Stanford University that compare the NVIDIA 7900GTX (G71), NVIDIA 8800GTX (G80) and ATI X1900XTX (R580).[7] They are very technical, but they do show that the 8800GTX is more powerful than either the 7900GTX or the X1900XTX.

Thanks DirectX 10!
The reason for the explosion in usable shaders on the 8800GTX is what DX10 asks for: a unified shader model, geometry shaders and no more fixed-function components. This resulted in GPUs that are not divided up into ‘x’ number of vertex shaders and ‘y’ number of pixel shaders. The elimination of capability bits will also force vendors to produce cards that meet the same basic requirements, removing the variations in floating-point formats that existed under DX9. This consistency will reduce the confusion that developers faced in utilizing the previous generation of hardware.

CUDA: A New Architecture for GPU Computing
CUDA stands for Compute Unified Device Architecture and is a new hardware and software architecture that enables the GPU to be used as a data-parallel computing device without the need to map to the graphics API. CUDA is programmed through an extension of the C programming language, which should keep the learning curve for developers to a minimum. CUDA is available on the GeForce 8800 series and future products.
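To make "data-parallel" a little more concrete, here is a toy emulation in plain Python. It is not CUDA code and it does not use any CUDA API; the names `saxpy_kernel` and `launch` are hypothetical, and the loops run one after another where real hardware would run the threads side by side. What it borrows from CUDA is the idea: one small function (the kernel) is run once per thread, and each thread works out which element it owns from its block index and its thread index within the block.

```python
def saxpy_kernel(block_idx, thread_idx, block_dim, a, x, y, out):
    """One emulated thread: compute a single element of out = a * x + y."""
    i = block_idx * block_dim + thread_idx  # global element index for this thread
    if i < len(x):                          # the last block may have spare threads
        out[i] = a * x[i] + y[i]


def launch(kernel, n_blocks, block_dim, *args):
    """Emulate a kernel launch by visiting every (block, thread) pair in turn."""
    for block_idx in range(n_blocks):
        for thread_idx in range(block_dim):
            kernel(block_idx, thread_idx, block_dim, *args)


x = [float(i) for i in range(10)]
y = [1.0] * 10
out = [0.0] * 10

block_dim = 4                                   # threads per block (arbitrary choice)
n_blocks = (len(x) + block_dim - 1) // block_dim  # enough blocks to cover every element
launch(saxpy_kernel, n_blocks, block_dim, 2.0, x, y, out)
print(out)
```

Because no thread depends on another thread's result, the same kernel could in principle be handed to all 128 ALUs at once, which is the whole appeal.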

Game Development Potential
The GPGPU results from Acceleware and RapidMind, coupled with the work of Havok in the arena of games, prove that there is potential in harnessing the power of the GPU beyond making games visually stunning. Havok has already started to improve the implementation of physics in game environments, but that is only one part of a game. This next part is theoretical on my part: I will suggest areas in which some of today’s games could be improved by tapping into the power of data-parallel programming on the GPU.

Neverwinter Nights 2 and other single player games
The single player experience in Neverwinter Nights 2 is hampered by the poor AI that controls your companions. The path-finding AI works for most of the open areas, but fails miserably when your party is exploring dungeons, underground caverns or building interiors. Computer-controlled companions often get stuck on terrain or simply get lost, leaving you to be trounced by encounters created for a party of four. While you could simply pause the game and make individual adjustments, that process breaks the immersion of the game. The AI also has problems controlling spell-casters: it lets your companions burn through their offensive spells in situations that do not require them, and fails to have them use healing spells when party members are on the brink of death. Given the performance improvements that were shown above, I have to wonder how much more realistic the AI could have been if the developer had been able to make use of the computational power in the 8800GTX.

F.E.A.R. and other FPS games
F.E.A.R. is a game that relies heavily on the spooky factor. As a player you are immersed in creepy atmospheric effects such as steam, smoke and particles floating in beams of light. The AI in F.E.A.R. was some of the best seen in recent shooters, and so were the physics effects from shooting bad guys. Injury, death and environmental damage were handled elegantly. So why do I bring this game up? Simple. More could still be done. Imagine using the power of the GPU to generate a dynamic map as the result of chemical spills or burning liquids applied in real time. The immersion level in the game would be greatly increased.

World of Warcraft and other MMOs
Economy is always an issue in MMOs and no one ever seems happy about how an in-game economy is modeled. Certainly game developers struggle to achieve a realistic economic system that can react to unanticipated fluctuations caused by players. In this context I think about the RapidMind benchmark for European Option Pricing that ran 120x faster on a 7900 based GPU than the original benchmark did on a CPU. Apply this muscle to controlling the actions of NPC traders, and MMO economies could take on a complex life of their own, reacting to player-induced trading frenzies.

Ballistics Report

Pros
• The 8800GTX has 128 ALUs vs 96 ALUs on the 7900 GT and they are also clocked 3x higher
• NVIDIA has released the CUDA SDK to assist developers in exploiting the power of the GPU in GPGPU programming
• Companies like RapidMind, Acceleware and Havok are making it easier to implement GPGPU strategies
• GPUs have far outstripped CPUs in processing Floating Point Operations
• GPUs already have a large installed base, which dedicated add-on cards would still have to build
• GPUs would be cheaper to use on server-side implementations than buying server clusters

Cons
• GPGPU programming remains difficult and requires programmers to think differently about their applications
• The power of DX10 compatible parts is crucial to expanding GPGPU implementations, due to the explosion of shaders required to meet the specification, but the installed base of DX10 cards will be low in the near future

Final Verdict – 100% excitement about possibilities
GPGPU implementations show greatly improved processing capabilities over CPU solutions, and the introduction of DX10 compatible parts should widen that gap. Companies like RapidMind, Acceleware and Havok are making it easier for traditional programmers to leverage GPUs in their applications. NVIDIA and ATI, with CUDA and CTM respectively, are building tools to expose their GPUs to a greater extent to GPGPU programmers. The 8800GTX is a tremendous leap forward in computational power for GPGPU applications, both in the world of computer games and in real-world simulations. It gives me a warm fuzzy feeling knowing that the power of my GPU, which so often sits idle while I perform common tasks like reading Firingsquad.com, could be used in programs similar to Folding@home to help cure diseases.

[1] History of GPGPU -- http://www.gpgpu.org/data/history.shtml
[2] 5 Disruptive Technologies To Watch In 2007 by David Strom -- http://www.informationweek.com/internet/showArticle.jhtml?articleID=196800208
[3] Acceleware and Boston Scientific -- http://www.nvidia.com/object/acceleware_boston_scientific_success.html
[4] RapidMind GPU Evaluation -- http://rapidmind.net/case-gpu.php
[5] RapidMind -- http://rapidmind.net/product.php
[6] The Tech Report -- http://www.techreport.com/onearticle.x/9610
[7] Understanding GPUs Through Benchmarking -- http://www.cse.ohio-state.edu/~kerwin/GPGPUPerformance.pdf



25 User Comment(s) • 13 root comment(s)
indigo196 (258), Apr 03, 2007 - 09:10 am
Made a correction based on some more information that I consumed over lunch break.

Wisd85 (15), Apr 03, 2007 - 08:03 am
1 mistake in your article. 8800GTX does not have 128 pixel pipelines, they are all unified pipelines.

indigo196 (258), Apr 03, 2007 - 08:18 am | Edited on Apr 04, 2007 - 06:05 pm
Good catch... will correct that... meant to say unified shaders...

Also shaders are individual processing units (ALUs as I found out); pipelines are different.

Kerrick (217), Apr 03, 2007 - 05:06 am
It seems that while this line of thinking will be good for quite a few applications, it will not be good for gaming. How much of the GPU is left to do all the calculations for physics, pathing, AI, etc... if the card is already being worked to its max putting the intense graphics on the screen? There has to be a trade-off somewhere.

And economies in games like WoW? That is completely controlled by the people playing the game, not by the game itself. The computer doesn't set prices for things that are bought and sold between players. That's why you can play on 2 different servers and have completely different economies on them even though it's the same game.

but it is a well written article. One of the few I have seen in this contest.

indigo196 (258), Apr 03, 2007 - 07:26 am | Edited on Apr 03, 2007 - 07:33 am
Thanks for the compliment on the writing.

I think the potential is there to use GPUs for more than graphics at the high end -- there is really no point in getting FPS results that are north of 60 so I would love to see that extra power be used to produce better game play.

I was impressed with the information for Havok and ATI about running Physics processing using the GPU instead of on a dedicated card.

My thoughts on improving the economies of MMOs are that by utilizing some GPUs on the host servers it might be possible to have AI controlled vendors bid at auctions, and dynamic pricing of goods at the shops that are impacted by the "NPC world". The demand for goods is usually player based only which leads to a bulge in the economy at whatever point the majority of the players are at.

Knuckles (1565), Apr 03, 2007 - 06:22 am | Edited on Apr 03, 2007 - 07:15 am
Well, the companies are always trying to sell you SLI, so they can double the sale. So it may be better to buy another GPU than a physics processor.

wawatikigod (2), Apr 02, 2007 - 11:44 am
It's a very good article Buc, I learned more about the programs that run my favorite games. Great Job and good luck.

indigo196 (258), Apr 02, 2007 - 04:47 am | Edited on Apr 02, 2007 - 04:47 am
NOTE: The edit was to correct the t-i-t-l-e and one spelling error.
