Abstract
In recent years, embedded systems have entered the multicore era. As the number of cores in embedded systems keeps growing, it becomes more important to provide programming support that respects embedded system constraints while helping developers utilize multicore hardware. Although C still dominates embedded programming, C++ is gaining importance in parallel programming, which makes C++ support for embedded multicore systems promising. However, embedded systems usually have tight resource budgets, and C++ is commonly considered to produce code too large for embedded systems to afford. Therefore, in this paper we investigate the code size requirements of a C++ library and propose a layered design that provides code-size-aware library support. In addition, to utilize embedded multicore systems, we employ C++ language features to facilitate embedded multicore programming. With C++, we incorporate high-level abstractions and design patterns into the programming support to enhance the low-level programming APIs used to exploit DSPs, SIMD instructions, and DMAs on embedded multicore systems. Finally, we evaluate our C++ support with a Blur program and a JPEG program. Our results on a dual-DSP platform show speedups of 3.32 and 3.09 for the Blur and JPEG programs, respectively.

Notes
From now on, the execution time of the ChD stage is merged into that of the CST stage because it is much smaller than that of the other stages.
Acknowledgments
This work is supported in part by the National Science Council (NSC) under grant nos. 101-2220-E-007-001 and 101-2219-E-007-004 and by the Ministry of Economic Affairs (MOEA) under grant no. 101-EC-17-A-02-S1-202 in Taiwan.
Cite this article
Kuan, CB., Li, JJ., Chen, CK. et al. C++ Support and Applications for Embedded Multicore DSP Systems. J Sign Process Syst 75, 109–122 (2014). https://doi.org/10.1007/s11265-013-0750-6
DOI: https://doi.org/10.1007/s11265-013-0750-6