GPU-Based Low-Precision Detection Approach for Massive MIMO Systems

  • Conference paper
High Performance Computing (ISC High Performance 2023)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13948))

Abstract

Massive Multiple-Input Multiple-Output (M-MIMO) uses hundreds of antennas in mobile communications base stations to increase the amount of transmitted data and the number of connected devices in 5G and beyond. However, M-MIMO systems increase the complexity of recovering the transmitted data (the detection phase). To address this challenge, we leverage low-precision arithmetic in recent NVIDIA GPUs to improve the latency, scalability, and accuracy of M-MIMO detection. We propose a GPU tree-based detection algorithm that aggregates multiple tree levels and formulates the computation as a matrix multiplication followed by a square-norm calculation and a sorting (reduction) phase. This process is repeated until the last level of the detection tree is reached. The results show near-optimal data detection with a 10× speedup compared to a two-socket 28-core IceLake CPU implementation. We further deploy low-precision arithmetic operations. We show that moving from single-precision 32-bit floating-point arithmetic (FP32) to a half-precision 16-bit representation (FP16) does not affect accuracy while translating into an additional 1.7× speedup. Exploiting an 8-bit integer representation results in an acceptable error-rate degradation that can be compensated for by increasing the number of aggregated levels. Finally, we propose a multi-GPU version that computes the matrix multiplications of subsequent iterations in parallel. This operation represents more than 80% of the elapsed time for dense constellations. Results with four A100 GPUs show an additional 2.3× relative speedup compared to our single-GPU version. The achieved accuracy/scalability balance may accelerate the deployment of this technology and promote low-precision GPU computations within the wireless communication community.
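To make the iteration pattern described in the abstract concrete, the following is a minimal CPU sketch in NumPy, not the authors' GPU implementation. It assumes the usual model y = Hs + n with a QR decomposition of H, which the abstract does not spell out. The function name `tree_detect` and the parameters `levels` (number of aggregated tree levels) and `keep` (number of surviving paths) are hypothetical. Each iteration expands the surviving paths by every symbol combination of the aggregated levels, evaluates them with one matrix multiplication, computes squared norms, and sorts to keep the best candidates. Low-precision arithmetic and multi-GPU execution are not modeled here.

```python
import itertools
import numpy as np

def tree_detect(H, y, constellation, levels=2, keep=16):
    """Illustrative breadth-limited tree detector (assumed formulation)."""
    n = H.shape[1]
    Q, R = np.linalg.qr(H)            # H = QR with R upper triangular
    z = Q.conj().T @ y                # rotated received vector
    paths = np.zeros((n, 1), dtype=complex)  # surviving partial solutions, one per column
    cost = np.zeros(1)                # accumulated squared distance per path
    hi = n                            # the tree is explored from the last row of R upward
    while hi > 0:
        lo = max(hi - levels, 0)
        b = hi - lo                   # number of levels aggregated in this step
        # Every symbol combination for the aggregated levels, shape (b, m).
        combos = np.array(list(itertools.product(constellation, repeat=b))).T
        m = combos.shape[1]
        k = paths.shape[1]
        # Expand each surviving path with each combination, shape (n, k*m).
        cand = np.repeat(paths, m, axis=1)
        cand[lo:hi, :] = np.tile(combos, (1, k))
        # Matrix multiplication: rows lo..hi-1 of R against all candidates.
        resid = z[lo:hi, None] - R[lo:hi, lo:] @ cand[lo:, :]
        # Square-norm calculation added to the cost of the parent path.
        total = np.repeat(cost, m) + np.sum(np.abs(resid) ** 2, axis=0)
        # Sorting (reduction): keep only the best candidates.
        best = np.argsort(total)[:keep]
        paths, cost = cand[:, best], total[best]
        hi = lo
    return paths[:, 0]
```

In this sketch, raising `levels` widens the matrix multiplication (more candidate columns per step) and reduces the number of iterations, which is the kind of trade-off the abstract refers to when it mentions increasing the number of aggregated levels.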



Author information

Corresponding author

Correspondence to Adel Dabah.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Dabah, A., Ltaief, H., Rezki, Z., Alouini, S., Keyes, D. (2023). GPU-Based Low-Precision Detection Approach for Massive MIMO Systems. In: Bhatele, A., Hammond, J., Baboulin, M., Kruse, C. (eds) High Performance Computing. ISC High Performance 2023. Lecture Notes in Computer Science, vol 13948. Springer, Cham. https://doi.org/10.1007/978-3-031-32041-5_8

  • DOI: https://doi.org/10.1007/978-3-031-32041-5_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-32040-8

  • Online ISBN: 978-3-031-32041-5

  • eBook Packages: Computer Science, Computer Science (R0)
