A fast and energy-efficient Hamming decoder for software-defined radio using graphics processing units

The Journal of Supercomputing

Abstract

The demand for scalable and fast error decoders has recently increased in software-defined radio-based communication systems. The Hamming code, a promising error-correcting code, offers acceptable accuracy; however, the computational complexity of its decoder limits its use in real-time communication. To address this issue, this paper proposes a fully parallel implementation of the (7, 4) Hamming code on a graphics processing unit (GPU) that exploits massive data parallelism and increases on-chip constant memory accesses. To further improve performance, the paper explores the impact of different thread/block configurations and selects the optimal ones, which occupy more hardware resources for performing parity checks, error detection and correction, and decoding of the received codeword. In addition, the proposed GPU-based Hamming decoder scales across different message sizes, including 355,907 bytes, 2,959,475 bytes, and 12,835,890 bytes. To verify the effectiveness of the GPU-based parallel Hamming decoder, the paper compares its performance with that of a multi-threaded central processing unit (CPU) approach executed on an Intel multi-core processor. Experimental results indicate that the proposed GPU-based decoder operates at least 15.13 times faster and reduces energy consumption by up to 913.17% compared to the multi-threaded CPU-based approach.
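
To make the three decoder stages named in the abstract concrete (parity check, error detection and correction, decoding), the following is a minimal sketch of textbook (7, 4) Hamming syndrome decoding. It is not the paper's GPU kernel. The bit layout (parity bits at positions 1, 2, and 4) and the function names `encode`, `decode`, and `decode_stream` are assumptions made for illustration; the paper's actual layout and kernel structure are not given on this page.

```python
# Illustrative sketch only: textbook (7, 4) Hamming code, assumed layout
# [p1, p2, d1, p4, d2, d3, d4] at positions 1..7. Not the paper's kernel.

def encode(nibble):
    """Encode 4 data bits (integer 0..15) into a list of 7 code bits."""
    d1, d2, d3, d4 = [(nibble >> shift) & 1 for shift in (3, 2, 1, 0)]
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]

def decode(code):
    """Return (nibble, syndrome); corrects at most one flipped bit."""
    # Parity check: XOR the 1-based positions of all set bits.
    # The result is 0 for a valid codeword, else the error position.
    syndrome = 0
    for position, bit in enumerate(code, start=1):
        if bit:
            syndrome ^= position
    # Error detection and correction: flip the bit the syndrome points at.
    corrected = list(code)
    if syndrome:
        corrected[syndrome - 1] ^= 1
    # Decoding: extract the data bits from positions 3, 5, 6, 7.
    d1, d2, d3, d4 = corrected[2], corrected[4], corrected[5], corrected[6]
    return (d1 << 3) | (d2 << 2) | (d3 << 1) | d4, syndrome

def decode_stream(codewords):
    """Decode each codeword independently of the others."""
    return [decode(code)[0] for code in codewords]
```

Because each codeword is decoded without reference to any other, the loop in `decode_stream` is the kind of data-parallel work that could be mapped to one GPU thread per codeword; how the paper actually assigns work to threads and blocks is not stated in the abstract.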

Acknowledgments

This work was supported by the 2014 Research Fund of the University of Ulsan.

Author information

Corresponding author

Correspondence to Jong-Myon Kim.

Cite this article

Kim, J., Kang, M., Islam, M.S. et al. A fast and energy-efficient Hamming decoder for software-defined radio using graphics processing units. J Supercomput 71, 2454–2472 (2015). https://doi.org/10.1007/s11227-015-1396-x

  • DOI: https://doi.org/10.1007/s11227-015-1396-x
