Abstract
We present a system architecture for the fourth generation of PC-class programmable graphics processing units (GPUs). The new pipeline features significant additions and changes to the prior-generation pipeline, including a new programmable stage capable of generating additional primitives and streaming primitive data to memory; an expanded, common feature set for all of the programmable stages; generalizations to vertex and image memory resources; and new storage formats. We also describe structural modifications to the API, runtime, and shading language that complement the new pipeline. We motivate the design with descriptions of frequently encountered obstacles in current systems. Throughout the paper, we present the rationale behind prominent design choices and the alternatives that were ultimately rejected, drawing on insights collected during a multi-year collaboration with application developers and hardware designers.
The Direct3D 10 system
SIGGRAPH '06: ACM SIGGRAPH 2006 Papers