Language-Extension-Based Vectorizing Compiling Scheme on SDR-DSP

Ni, Xiaoqiang; Yang, Liu; Ma, Chiyuan

doi:10.1007/978-981-10-3159-5_2

Part of the book series: Communications in Computer and Information Science (CCIS, volume 666)

Included in the following conference series:

CCF National Conference on Computer Engineering and Technology

Abstract

In this paper we propose a Language-Extension-based Vectorizing Compiling Scheme (LEVCS) for a newly developed DSP. The DSP is designed mainly for Software-Defined Radio (SDR) and is called SDR-DSP. The SDR-DSP architecture combines the VLIW (Very Long Instruction Word) and SIMD (Single Instruction Multiple Data) styles. Vectorization is essential for exploiting the potential of SDR-DSP and achieving high performance. Because auto-vectorization techniques cannot satisfy the requirements of typical applications, LEVCS lets the programmer direct the vectorization. The C-extension programming language used in LEVCS is called SDR-DSP-C. LEVCS uses flexible data reorganization to make vectorization on SDR-DSP more efficient. We use LEVCS to vectorize five benchmark kernels: Fast Fourier Transform (FFT), Finite Impulse Response (FIR) filter, Infinite Impulse Response (IIR) filter, dot product (Dotprod), and sum of vectors (vecsum). Experimental results show that LEVCS is functionally correct and achieves speedups of 2.883 to 8.074 compared with TI DSPs.
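
To make the idea of programmer-directed vectorization concrete, the following is a minimal sketch of the Dotprod kernel written in portable C, not in SDR-DSP-C, whose syntax is not given in this abstract. The lane count `LANES` and the function names `dotprod_scalar` and `dotprod_lanes` are assumptions chosen for illustration. The inner loop over lanes stands in for what a single SIMD multiply-accumulate would do on vector hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed SIMD width for illustration only; not taken from the paper. */
#define LANES 4
static_assert(LANES > 0, "LANES must be positive");

/* Scalar reference: one multiply-accumulate per loop iteration. */
int64_t dotprod_scalar(const int32_t *a, const int32_t *b, size_t n)
{
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += (int64_t)a[i] * b[i];
    return sum;
}

/* Lane-wise version: keeps LANES independent partial sums, which is the
 * shape a SIMD unit can execute as one vector operation per block. */
int64_t dotprod_lanes(const int32_t *a, const int32_t *b, size_t n)
{
    int64_t acc[LANES] = {0};
    size_t i = 0;

    /* Main loop: process full blocks of LANES elements. */
    for (; i + LANES <= n; i += LANES)
        for (size_t l = 0; l < LANES; l++)
            acc[l] += (int64_t)a[i + l] * b[i + l];

    /* Horizontal reduction of the partial sums. */
    int64_t sum = 0;
    for (size_t l = 0; l < LANES; l++)
        sum += acc[l];

    /* Scalar tail for the elements that do not fill a whole block. */
    for (; i < n; i++)
        sum += (int64_t)a[i] * b[i];

    return sum;
}
```

Both functions return the same value for any input because integer addition is associative; with floating-point data the reordering of the sum could change rounding, which is one reason a compiler may decline to vectorize such a loop without direction from the programmer.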

Author information

Authors and Affiliations

School of Computer, National University of Defense Technology, Deya Street 109, Changsha, 410073, People’s Republic of China
Xiaoqiang Ni, Liu Yang & Chiyuan Ma

Corresponding author

Correspondence to Xiaoqiang Ni.

Editor information

Editors and Affiliations

National University of Defense Technology, Changsha, China
Weixia Xu
National University of Defense Technology, Changsha, China
Liquan Xiao
National University of Defense Technology, Changsha, China
Jinwen Li
National University of Defense Technology, Changsha, China
Chengyi Zhang
National University of Defense Technology, Changsha, China
Zhenzhen Zhu

About this paper

Cite this paper

Ni, X., Yang, L., Ma, C. (2016). Language-Extension-Based Vectorizing Compiling Scheme on SDR-DSP. In: Xu, W., Xiao, L., Li, J., Zhang, C., Zhu, Z. (eds) Computer Engineering and Technology. NCCET 2016. Communications in Computer and Information Science, vol 666. Springer, Singapore. https://doi.org/10.1007/978-981-10-3159-5_2

DOI: https://doi.org/10.1007/978-981-10-3159-5_2
Published: 09 December 2016
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-3158-8
Online ISBN: 978-981-10-3159-5
eBook Packages: Computer Science, Computer Science (R0)
