OpenMDSP: Extending OpenMP to Program Multi-Core DSPs

  • Regular Paper
  • Journal of Computer Science and Technology

Abstract

Multi-core digital signal processors (DSPs) are widely used in wireless telecommunication, core network transcoding, industrial control, and audio/video processing, among other fields. Compared with general-purpose multi-processors, multi-core DSPs normally have a more complex memory hierarchy, including on-chip core-local memory and non-cache-coherent shared memory. As a result, efficient multi-core DSP applications are very difficult to write. The current approach to programming multi-core DSPs relies on proprietary vendor software development kits (SDKs), which provide only low-level, non-portable primitives. While it is acceptable to write coarse-grained task-level parallel code with these SDKs, writing fine-grained data-parallel code with them is tedious and error-prone. We believe that a high-level, portable parallel programming model for multi-core DSPs is desirable. In this paper, we propose OpenMDSP, an extension of OpenMP designed for multi-core DSPs. The goal of OpenMDSP is to fill the gap between the OpenMP memory model and the memory hierarchy of multi-core DSPs. We propose three classes of directives in OpenMDSP: 1) data placement directives, which allow programmers to control the placement of global variables conveniently; 2) distributed array directives, which divide a whole array into sections and promote the sections into core-local memory to improve performance; and 3) stream access directives, which promote big arrays into core-local memory section by section during parallel loop processing while hiding the latency of data movement by using the direct memory access (DMA) engine of the DSP. We implement the compiler and runtime system for OpenMDSP on the Freescale MSC8156. The benchmarking results show that seven of nine benchmarks achieve a speedup of more than a factor of 5 when using six threads.
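The stream access idea described in the abstract can be illustrated with a minimal double-buffering sketch in plain C. This is not OpenMDSP syntax and not the authors' implementation: the names `stream_scale`, `dma_start`, `dma_wait`, and `SECTION` are hypothetical, and the DMA engine is modeled by a synchronous `memcpy` so that the example runs on any host. On real hardware, `dma_start` would return immediately and `dma_wait` would block until the transfer completes, which is what lets the computation on one section overlap the fetch of the next.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SECTION 256  /* hypothetical section size, in elements */

/* Hypothetical stand-ins for an asynchronous DMA engine.
   In this host-side model the copy completes immediately. */
static void dma_start(float *dst, const float *src, size_t count) {
    memcpy(dst, src, count * sizeof(float));
}
static void dma_wait(void) { /* nothing to wait for in this model */ }

/* Length of the section that starts at offset. */
static size_t section_len(size_t n, size_t offset) {
    size_t rest = n - offset;
    return rest < SECTION ? rest : SECTION;
}

/* Scale a large array by k, one section at a time, using two
   small buffers that play the role of core-local memory. */
void stream_scale(const float *in, float *out, size_t n, float k) {
    float local[2][SECTION];
    size_t offset = 0;
    int cur = 0;

    if (n == 0) return;
    assert(in != NULL && out != NULL);

    /* Prefetch the first section before the loop starts. */
    dma_start(local[cur], in, section_len(n, 0));

    while (offset < n) {
        size_t len = section_len(n, offset);
        size_t next = offset + len;

        dma_wait();  /* the current section is now in local[cur] */

        /* Start fetching the next section into the other buffer. */
        if (next < n)
            dma_start(local[1 - cur], in + next, section_len(n, next));

        /* With a real DMA engine, this loop would run while the
           next section is still being transferred. */
        for (size_t i = 0; i < len; i++)
            local[cur][i] *= k;

        /* Write the finished section back to the large array. */
        memcpy(out + offset, local[cur], len * sizeof(float));

        offset = next;
        cur = 1 - cur;
    }
}
```

Calling `stream_scale(in, out, 600, 2.0f)` walks the input in sections of 256, 256, and 88 elements. A directive-based model such as the one proposed would aim to generate this kind of buffering and transfer code from an annotated parallel loop instead of requiring it to be written by hand.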

Author information

Corresponding author

Correspondence to Jiang-Zhou He.

Additional information

This work was supported by the National High Technology Research and Development 863 Program of China under Grant No. 2012AA010901 and the National Natural Science Foundation of China under Grant No. 61103021.

Electronic supplementary material

ESM 1 (DOC, 24 kb)

About this article

Cite this article

He, JZ., Chen, WG., Chen, GR. et al. OpenMDSP: Extending OpenMP to Program Multi-Core DSPs. J. Comput. Sci. Technol. 29, 316–331 (2014). https://doi.org/10.1007/s11390-014-1433-x

  • DOI: https://doi.org/10.1007/s11390-014-1433-x
