Abstract

In several digital signal processing algorithms, computational nodes are organized in consecutive stages, and data is reordered between these stages. Computing such algorithms in parallel with a reduced number of processing elements implies that several computational nodes are assigned to each element. As a drawback, the permutations become more complex and require data storage. In this paper, a systematic design methodology for stride permutation networks is derived. These permutations are represented with Boolean matrices, which are decomposed and mapped directly onto register-based networks. The resulting networks are regular and scalable, and they support any power-of-two stride. In addition, the networks reach the lower bound on the number of registers, which indicates area efficiency. Because the proposed methodology is systematic, it can be exploited in automated design generation.
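
As a rough illustration of the objects the abstract refers to, the sketch below builds a stride permutation and its 0/1 (Boolean) permutation matrix. It assumes the common textbook definition, in which a stride-by-S permutation of N elements reads the input at indices 0, S, 2S, ... and then 1, S+1, ..., and so on. The function names `stride_permutation`, `permutation_matrix`, and `apply_matrix` are hypothetical and do not come from the paper; the paper's decomposition and register-based mapping are not reproduced here.

```python
def stride_permutation(n, s):
    """Return source indices p such that y[i] = x[p[i]].

    Assumed definition (not taken from the abstract): stride-by-s
    permutation of n elements, with s dividing n.
    """
    if n <= 0 or s <= 0 or n % s != 0:
        raise ValueError("s must be a positive divisor of n")
    return [(i * s) % n + (i * s) // n for i in range(n)]


def permutation_matrix(perm):
    """Build the 0/1 matrix M with M[r][c] = 1 exactly when perm[r] == c."""
    n = len(perm)
    return [[1 if perm[r] == c else 0 for c in range(n)] for r in range(n)]


def apply_matrix(matrix, x):
    """Multiply a 0/1 permutation matrix by a vector of numbers."""
    return [sum(m * v for m, v in zip(row, x)) for row in matrix]


if __name__ == "__main__":
    x = list(range(8))
    p2 = permutation_matrix(stride_permutation(8, 2))
    p4 = permutation_matrix(stride_permutation(8, 4))
    print(apply_matrix(p2, x))                    # stride-by-2 reordering
    print(apply_matrix(p4, apply_matrix(p2, x)))  # stride 4 undoes stride 2
```

Under this definition, the stride-by-4 permutation of eight elements is the perfect shuffle, and applying it after the stride-by-2 permutation restores the original order, because the two strides multiply to N.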

Author information

Corresponding author

Correspondence to Tuomas Järvinen.

About this article

Cite this article

Järvinen, T., Salmela, P., Sorokin, H. et al. Stride Permutation Networks for Array Processors. J VLSI Sign Process Syst Sign Im 49, 51–71 (2007). https://doi.org/10.1007/s11265-006-0031-8

  • DOI: https://doi.org/10.1007/s11265-006-0031-8
