A Single (Unified) Shader GPU Microarchitecture for Embedded Systems

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3793)

Abstract

We present and evaluate the TILA-rin GPU microarchitecture for embedded systems using the ATTILA GPU simulation framework. We use a trace from an execution of the Unreal Tournament 2004 PC game to evaluate and compare the performance of the proposed embedded GPU against a baseline GPU architecture for the PC. We evaluate the different elements that have been removed from the baseline GPU architecture to fit the restricted power, bandwidth and area budgets of embedded systems. The unified shader architecture we present processes vertices, triangles and fragments in a single processing unit, saving space and reducing hardware complexity. The proposed embedded GPU architecture sustains 20 frames per second on the selected UT 2004 trace.

This work has been supported by the Ministry of Science and Technology of Spain and the European Union (FEDER funds) under contracts TIC2001-0995-C02-01 and TIN2004-07739-C02-01.




Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Moya, V., González, C., Roca, J., Fernández, A., Espasa, R. (2005). A Single (Unified) Shader GPU Microarchitecture for Embedded Systems. In: Conte, T., Navarro, N., Hwu, Wm.W., Valero, M., Ungerer, T. (eds) High Performance Embedded Architectures and Compilers. HiPEAC 2005. Lecture Notes in Computer Science, vol 3793. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11587514_19

  • DOI: https://doi.org/10.1007/11587514_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-30317-6

  • Online ISBN: 978-3-540-32272-6

  • eBook Packages: Computer Science, Computer Science (R0)
