research-article

GPU-like on-chip system for decoding LDPC codes

Published: 10 March 2014

Abstract

Rapid prototyping is an important step in the development and verification of computationally demanding tasks in digital communication systems, such as Forward Error Correction (FEC) decoding. The goal is to replace time-consuming simulations based on abstract system models with real-time experiments under real-world conditions. A GPU-like architecture is a promising way to fully exploit the potential of FPGA-based acceleration platforms. This article proposes an application-specific GPU-like architecture and a complete compilation framework for decoding LDPC codes. The advantages of an application-specific GPU over current GPUs are detailed. Finally, real-time experiments demonstrate the potential of the GPU-like decoder for investigating both algorithmic and architectural issues.



      • Published in

        ACM Transactions on Embedded Computing Systems  Volume 13, Issue 4
        Regular Papers
        November 2014
        647 pages
        ISSN:1539-9087
        EISSN:1558-3465
        DOI:10.1145/2592905

        Copyright © 2014 ACM


        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 10 March 2014
        • Accepted: 1 October 2013
        • Revised: 1 May 2013
        • Received: 1 February 2013


        Qualifiers

        • research-article
        • Research
        • Refereed
