ATI lovingly refers to the architecture behind RV770 as its Terascale graphics engine. The moniker is an obvious nod to its distinction as the first desktop graphics card to break the 1 TeraFlop mark (the Radeon HD 4870 is actually rated at 1.2 TeraFlops), and as you can see, the new GPU has some impressive specs:
Specifications
Unified Superscalar Shader Architecture
800 stream processing units
Dynamic load balancing and resource allocation for vertex, geometry, and pixel shaders
Common instruction set and texture unit access supported for all types of shaders
Dedicated branch execution units and texture address processors
128-bit floating point precision for all operations
Command processor for reduced CPU overhead
Up to 160 texture fetches per clock cycle
Up to 128 textures per pixel
DXTC and 3Dc+ texture compression
High resolution texture support (up to 8192 x 8192)
Physics processing support
Microsoft® DirectX® 10.1 support
Shader Model 4.1
32-bit floating point texture filtering
OpenGL 2.0 support
Dynamic Geometry Acceleration
High performance vertex cache
Programmable tessellation unit
Accelerated geometry shader path for geometry amplification
Memory read/write cache for improved stream output performance
Anti-aliasing features
Multi-sample anti-aliasing (2, 4 or 8 samples per pixel)
Up to 24x Custom Filter Anti-Aliasing (CFAA) for improved quality
Adaptive super-sampling and multi-sampling
Gamma correct
Super AA (ATI CrossFireX™ configurations only)
All anti-aliasing features compatible with HDR rendering
Texture filtering features
2x/4x/8x/16x high quality adaptive anisotropic filtering modes (up to 128 taps per pixel)
128-bit floating point HDR texture filtering
ATI Avivo™ HD Video and Display Platform
2nd generation Unified Video Decoder (UVD 2)
Enabling hardware decode acceleration of H.264, VC-1 and MPEG-2
Dual stream playback (or Picture-in-picture)
Hardware MPEG-1 and DivX video decode acceleration
Motion compensation and IDCT
ATI Avivo Video Post Processor
New enhanced DVD upconversion to HD
New automatic and dynamic contrast adjustment
Color space conversion
Chroma subsampling format conversion
Horizontal and vertical scaling
Gamma correction
Advanced vector adaptive per-pixel de-interlacing
De-blocking and noise reduction filtering
Detail enhancement
Inverse telecine (2:2 and 3:2 pull-down correction)
Bad edit correction
Full score in HQV (SD) and HQV (HD) video quality benchmarks
Two independent display controllers
Drive two displays simultaneously with independent resolutions, refresh rates, color controls and video overlays for each display
Full 30-bit display processing
Two integrated DVI display outputs
Primary supports 18-, 24-, and 30-bit digital displays at all resolutions up to 1920x1200 (single-link DVI) or 2560x1600 (dual-link DVI)
Two integrated 400MHz 30-bit RAMDACs
Each supports analog displays connected by VGA at all resolutions up to 2048x1536
DisplayPort™ output support
Supports 24- and 30-bit displays at all resolutions up to 2560x1600
HDMI output support
Supports all display resolutions up to 1920x1080
Integrated HD audio controller with up to 2 channel 48 kHz stereo or multi-channel (7.1) AC3 enabling a plug-and-play cable-less audio solution
Integrated AMD Xilleon™ HDTV encoder
Provides high quality analog TV output (component/S-video/composite)
Supports SDTV and HDTV resolutions
Underscan and overscan compensation
MPEG-2, MPEG-4, DivX, WMV9, VC-1, and H.264/AVC encoding and transcoding
Seamless integration of pixel shaders with video in real time
VGA mode support on all display outputs
ATI PowerPlay™
Advanced power management technology for optimal performance and power savings
Performance-on-Demand
Constantly monitors GPU activity, dynamically adjusting clocks and voltage based on user scenario
Clock and memory speed throttling
Voltage switching
Dynamic clock gating
Central thermal management – on-chip sensor monitors GPU temperature and triggers thermal actions as required
ATI CrossFireX™ Multi-GPU Technology
Scale up rendering performance and image quality with two GPUs
Integrated compositing engine
High performance dual channel bridge interconnect
956 million transistors on 55nm fabrication process
PCI Express 2.0 x16 bus interface
256-bit GDDR3/GDDR5 memory interface
Notes
The number that jumps out at you is obviously the stream processor count: up from 320 in R600 to 800 in RV770! This increase (along with several others) bumps the transistor count from 666 million in RV670 to 956 million in RV770. Illustrating RV770’s efficiency, ATI was able to pull this off with a die that’s only about 30% larger than RV670’s, even though both GPUs are made on the same 55-nm manufacturing process. All the key features found in RV670 remain, such as DirectX 10.1 support, the tessellation unit, and PCI Express 2.0. ATI has also added a tweaked unified video decoding engine with new capabilities, along with a new microcontroller that constantly monitors the temperature and activity of various blocks within the GPU. The microcontroller controls clock gating, clock speeds, and voltages to keep the GPU running at peak power efficiency.
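To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It takes the stream processor and transistor counts from the spec list above (800 SPs, 956 million transistors) and the RV670 figures quoted in the text. Two inputs are our own assumptions rather than anything stated above: a 750MHz core clock for the Radeon HD 4870, and counting one multiply-add per stream processor per clock as two floating point operations. Treat the result as an illustration of where the 1.2 TeraFlop rating plausibly comes from, not as ATI's official derivation.

```python
# Back-of-the-envelope figures for RV770 versus its predecessor.

STREAM_PROCESSORS = 800        # from the spec list
FLOPS_PER_SP_PER_CLOCK = 2     # assumption: one multiply-add counted as 2 FLOPs
CORE_CLOCK_GHZ = 0.75          # assumption: Radeon HD 4870 core clock (750MHz)


def peak_gflops(stream_processors, flops_per_clock, clock_ghz):
    """Theoretical peak throughput in GFLOPS."""
    return stream_processors * flops_per_clock * clock_ghz


def percent_increase(old, new):
    """Relative growth from old to new, in percent."""
    return (new - old) / old * 100.0


peak = peak_gflops(STREAM_PROCESSORS, FLOPS_PER_SP_PER_CLOCK, CORE_CLOCK_GHZ)
sp_growth = percent_increase(320, 800)          # stream processors
transistor_growth = percent_increase(666, 956)  # millions of transistors

print(f"Theoretical peak: {peak / 1000:.2f} TFLOPS")
print(f"Stream processor increase: {sp_growth:.1f}%")
print(f"Transistor increase: {transistor_growth:.1f}%")
```

Under those assumptions, the stream processor count grows far faster than the transistor budget, which in turn grows faster than the roughly 30% increase in die size mentioned above.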
You also can’t miss ATI’s nod to GPU-based physics in the specs listed above. ATI has partnered with archrival Intel (which owns Havok) to make this possible. While neither side has announced anything specific about implementation yet, we’ve been told that they’re actively looking into areas where it makes sense for the GPU to handle physics rather than the CPU. In those cases, RV770 would presumably take over those specific effects.
Basically, they’re not looking to replace the CPU for in-game physics anytime soon, but to complement it. We’ll have to wait for more details on which games might take advantage of the technology, though.