Applying Data Mapping Techniques to Vector DSPs

Journal of Signal Processing Systems

Abstract

Vector digital signal processors (DSPs) offer a good ratio of performance to power consumption, which makes them suitable for software-defined radio applications on mobile devices. These vector DSPs require input algorithms expressed as vector operations, and the performance of a vectorized algorithm depends to a great extent on how its data is distributed across the vector elements. Traditional vectorization algorithms focus on extracting parallelism from a program; we propose an analysis tool that focuses instead on selecting an efficient dynamic data mapping for vector DSPs. We transferred Garcia’s communication parallelism graph (Garcia et al., IEEE Trans Parallel Distrib Syst 12: 416–431, 2001), originally developed for distributed memory multiprocessor systems, to vector DSPs. By adapting the representation of two-dimensional data distributions and the cost models, we are able to determine a dynamic mapping of data onto the vector elements of the Embedded Vector Processor (EVP) (van Berkel et al., Proceedings of the 2004 software-defined radio technical conference SDR’04, 2004). Additionally, we propose a new, efficient two-step algorithm for processing the graph representation. We demonstrate the capabilities of our tool by describing the vectorization of some MIMO OFDM algorithms.
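
To illustrate what selecting a dynamic data mapping involves, the following minimal Python sketch picks one data layout per program phase so that the sum of the per-phase computation costs and the costs of remapping between consecutive phases is minimal. It is not the communication parallelism graph or the two-step algorithm described in the article; the function name `select_dynamic_mapping`, the layout names `row` and `col`, and all cost values are hypothetical and chosen only for this example.

```python
from typing import Dict, List, Tuple


def select_dynamic_mapping(
    compute_cost: List[Dict[str, float]],
    remap_cost: Dict[Tuple[str, str], float],
) -> Tuple[float, List[str]]:
    """Choose one layout per phase, minimizing compute plus remapping cost.

    compute_cost[i][layout] is the (assumed) cost of running phase i with
    its data stored in `layout`; remap_cost[(a, b)] is the (assumed) cost
    of reshuffling data from layout a to layout b between two phases.
    """
    # best[layout] = (cheapest total cost of a plan ending in layout, plan)
    best = {layout: (cost, [layout]) for layout, cost in compute_cost[0].items()}

    for phase in compute_cost[1:]:
        nxt = {}
        for layout, cost in phase.items():
            candidates = []
            for prev, (prev_cost, plan) in best.items():
                # Keeping the same layout is free; changing it costs a remap.
                move = 0.0 if prev == layout else remap_cost[(prev, layout)]
                candidates.append((prev_cost + move + cost, plan + [layout]))
            nxt[layout] = min(candidates, key=lambda c: c[0])
        best = nxt

    return min(best.values(), key=lambda c: c[0])


# Hypothetical costs for three phases and two candidate layouts.
phases = [
    {"row": 4.0, "col": 10.0},
    {"row": 9.0, "col": 3.0},
    {"row": 5.0, "col": 5.0},
]
remap = {("row", "col"): 2.0, ("col", "row"): 2.0}

total, layouts = select_dynamic_mapping(phases, remap)
print(total, layouts)  # 14.0 ['row', 'col', 'col']
```

With these made-up costs, either static layout costs 18, while switching from `row` to `col` after the first phase costs 14, which is the kind of trade-off a dynamic mapping is meant to capture.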

[Figures 1–11 are not reproduced here.]

References

  1. Garcia, J., Ayguade, E., & Labarta, J. (2001). A framework for integrating data alignment, distribution, and redistribution in distributed memory multiprocessors. IEEE Transactions on Parallel and Distributed Systems, 12, 416–431 (April).

  2. van Berkel, C. H., Heinle, F., Meuwissen, P. P. E., Moerman, K., & Weiss, M. (2004). Vector processing as an enabler for software-defined radio in handsets from 3G+WLAN onwards. In Proceedings of the 2004 software-defined radio technical conference SDR’04. Scottsdale, Arizona, U.S.A. (September).

  3. Rajagopal, S., Rixner, S., & Cavallaro, J. R. (2002). A programmable baseband processor design for software defined radios. In Proceedings of the 45th IEEE midwest symposium on circuits and systems conference, MWSCAS 2002 (pp. 413–416) (August).

  4. Schwoerer, L., & Moerman, K. (2006). Benchmarking MIMO OFDM algorithms on the EVP. In GSPx 2006. Santa Clara, CA, USA (October–November).

  5. Lorenz, M., Marwedel, P., Dräger, T., Fettweis, G., & Leupers, R. (2004). Compiler based exploration of DSP energy savings by SIMD operations. In ASP-DAC ’04: Proceedings of the 2004 conference on Asia South Pacific design automation (pp. 838–841). Piscataway: IEEE.

  6. Russell, R. M. (1978). The CRAY-1 computer system. Communications of the ACM, 21(1), 63–72.

  7. Raman, S. K., Pentkovski, V., & Keshava, J. (2000). Implementing streaming SIMD extensions on the Pentium III processor. IEEE Micro, 20(4), 47–57.

  8. Larsen, S., Rabbah, R., & Amarasinghe, S. (2005). Exploiting vector parallelism in software pipelined loops. In MICRO 38: Proceedings of the 38th annual IEEE/ACM international symposium on microarchitecture (pp. 119–129). Washington, DC: IEEE Computer Society.

  9. Wilson, R. P., French, R. S., Wilson, C. S., Amarasinghe, S. P., Anderson, J. M., et al. (1994). SUIF: An infrastructure for research on parallelizing and optimizing compilers. SIGPLAN Notices, 29(12), 31–37.

  10. Allen, R., & Kennedy, K. (1987). Automatic translation of FORTRAN programs to vector form. ACM Transactions on Programming Languages and Systems, 9(4), 491–542.

  11. Tarjan, R. (1972). Depth-first search and linear graph algorithms. SIAM Journal on Computing, 1(2), 146–160.

  12. Darte, A., & Vivien, F. (1996). On the optimality of Allen and Kennedy’s algorithm for parallelism extraction in nested loops. In Euro-Par ’96: Proceedings of the second international euro-par conference on parallel processing (pp. 379–388). London: Springer.

  13. Glossner, J., & Iancu, D. (2006). The Sandbridge SB3011 SDR platform. In Proceedings of the symposium on trends in communications (SympoTIC06). Bratislava, Slovakia.

  14. Jinturkar, S., Glossner, J., Kotlyar, V., & Moudgill, M. (2004). The Sandblaster automatic multithreaded vectorizing compiler. In 2004 global signal processing expo (GSPx) and international signal processing conference (ISPC). Santa Clara, California.

  15. Anderson, J. M., & Lam, M. S. (1993). Global optimizations for parallelism and locality on scalable parallel machines. In PLDI ’93: Proceedings of the ACM SIGPLAN 1993 conference on programming language design and implementation (pp. 112–125). New York: ACM.

  16. Ramanujam, J., & Sadayappan, P. (1991). Compile-time techniques for data distribution in distributed memory machines. IEEE Transactions on Parallel and Distributed Systems, 2(4), 472–482.

  17. Ozcan, E., & Onbasioglu, E. (2004). Genetic algorithms for parallel code optimization. In Proceedings of the 2004 IEEE congress on evolutionary computation (pp. 1375–1381). Portland: IEEE (June).

  18. Kennedy, K., & Kremer, U. (1998). Automatic data layout for distributed-memory machines. ACM Transactions on Programming Languages and Systems, 20(4), 869–916.

  19. Bixby, R. E., Kennedy, K., & Kremer, U. (1994). Automatic data layout using 0-1 integer programming. In PACT ’94: Proceedings of the IFIP WG10.3 working conference on parallel architectures and compilation techniques (pp. 111–122). Amsterdam: North-Holland.

  20. Garcia, J., Ayguade, E., & Labarta, J. (1996). Dynamic data distribution with control flow analysis. In Supercomputing ’96: Proceedings of the 1996 ACM/IEEE conference on supercomputing (CDROM) (p. 11). Washington, DC: IEEE Computer Society.

  21. Allen, J. R., Kennedy, K., Porterfield, C., & Warren, J. (1983). Conversion of control dependence to data dependence. In POPL ’83: Proceedings of the 10th ACM SIGACT-SIGPLAN symposium on principles of programming languages (pp. 177–189). New York: ACM.

  22. Li, J., & Chen, M. (1990). Index domain alignment: Minimizing cost of cross-reference between distributed arrays. In Proceedings of the 3rd symposium on the frontiers of massively parallel computation (October).

  23. Hart, P. E., Nilsson, N. J., & Raphael, B. (1968). A formal basis for the heuristic determination of minimum cost paths. IEEE Transactions on Systems Science and Cybernetics, 4(2), 100–107.

  24. Pugh, W. (1991). The omega test: A fast and practical integer programming algorithm for dependence analysis. In Supercomputing ’91: Proceedings of the 1991 ACM/IEEE conference on supercomputing (pp. 4–13). New York: ACM.

  25. Guo, Y., & McCain, D. (2005). Reduced QRD-M detector in MIMO-OFDM systems with partial and embedded sorting. In Global telecommunications conference (GLOBECOM ’05).

Acknowledgements

This work has been sponsored in part by the German Federal Ministry of Education and Research within the scope of the Wireless Gigabit With Advanced Multimedia Support (WIGWAM) project.

Author information

Corresponding author

Correspondence to Peter Westermann.

About this article

Cite this article

Westermann, P., Schwoerer, L. & Kaufmann, A. Applying Data Mapping Techniques to Vector DSPs. J Sign Process Syst Sign Image Video Technol 57, 57–72 (2009). https://doi.org/10.1007/s11265-008-0170-1

  • DOI: https://doi.org/10.1007/s11265-008-0170-1
