Large Matrix Multiplication on a Novel Heterogeneous Parallel DSP Architecture

  • Conference paper
Advanced Parallel Processing Technologies (APPT 2009)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5737)

Abstract

This paper introduces ePUMA, a novel master-multi-SIMD on-chip multi-core architecture for embedded signal processing, and describes the parallel architecture and its memory subsystem. We evaluate the performance of large matrix multiplication on this architecture and compare it with a SIMD-extended data-parallel architecture. We also examine how well the new architecture scales with the number of SIMD co-processors. The experimental results show that the ePUMA memory subsystem can effectively hide data access overhead. With its 8-way SIMD data path and multi-SIMD parallel execution, the ePUMA architecture achieves a 45x speedup on matrix multiplication over the conventional SIMD extension.

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sohl, J., Wang, J., Liu, D. (2009). Large Matrix Multiplication on a Novel Heterogeneous Parallel DSP Architecture. In: Dou, Y., Gruber, R., Joller, J.M. (eds) Advanced Parallel Processing Technologies. APPT 2009. Lecture Notes in Computer Science, vol 5737. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03644-6_32

  • DOI: https://doi.org/10.1007/978-3-642-03644-6_32

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-03643-9

  • Online ISBN: 978-3-642-03644-6

  • eBook Packages: Computer Science, Computer Science (R0)
