GPU-Based Low-Precision Detection Approach for Massive MIMO Systems

Dabah, Adel; Ltaief, Hatem; Rezki, Zouheir; Alouini, Slim; Keyes, David

doi:10.1007/978-3-031-32041-5_8

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13948)

Included in the following conference series:

International Conference on High Performance Computing

Abstract

Massive Multiple-Input Multiple-Output (M-MIMO) uses hundreds of antennas in mobile communications base stations to increase the amount of transmitted data and the number of connected devices in 5G and beyond. However, M-MIMO systems increase the complexity of recovering the transmitted data (the detection phase). To address this challenge, we leverage low-precision arithmetic in recent NVIDIA GPUs to improve the latency, scalability, and accuracy of M-MIMO detection. We propose a GPU tree-based detection algorithm that aggregates multiple tree levels and formulates the computation as a matrix multiplication operation followed by a square-norm calculation and a sorting (reduction) phase. This process is repeated until reaching the last level of the detection tree. The obtained results show near-optimal data detection with a 10× speedup compared to a two-socket 28-core IceLake CPU implementation. We further deploy low-precision arithmetic operations. We show that moving from single-precision 32-bit floating-point arithmetic (FP32) to half-precision 16-bit representation (FP16) does not affect accuracy while translating into an additional 1.7× speedup. In addition, exploiting 8-bit integer representation results in an acceptable error rate degradation that can be compensated by increasing the number of aggregated levels. Finally, we propose a multi-GPU version that computes the matrix-multiplication operation of subsequent iterations in parallel. This operation represents more than 80% of the elapsed time for dense constellations. Results with four A100 GPUs show an additional 2.3× relative speedup compared to our single-GPU version. The achieved accuracy/scalability balance may accelerate the deployment of this technology and promote low-precision GPU computations within the wireless communication community.
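The loop described in the abstract (aggregate several tree levels, evaluate all candidates with one matrix multiplication, take square norms, then sort and reduce) can be illustrated with a small plain-Python sketch. It is not the authors' GPU implementation. It assumes the channel has already been reduced to an upper-triangular matrix `R` with a matching received vector `y` (for example via a QR decomposition), and the names `detect`, `matmul`, `levels`, and `keep` are hypothetical, introduced only for illustration. Keeping the best `keep` candidates is one possible reading of the sorting (reduction) phase.

```python
from itertools import product

def matmul(A, B):
    """Plain-Python product of A (m x k) and B (k x p), stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def detect(R, y, constellation, levels=2, keep=4):
    """Breadth-first tree search over an upper-triangular system y ~ R s.

    Each iteration aggregates `levels` tree levels, scores every extension
    with one matrix product plus square norms, and keeps the `keep` best.
    """
    n = len(y)
    survivors = [(0.0, ())]  # (accumulated metric, symbols for layers lo..n-1)
    lo = n
    while lo > 0:
        step = min(levels, lo)
        lo -= step
        # Extend every survivor with every combination of `step` new symbols.
        cands = [(m, combo + tail)
                 for m, tail in survivors
                 for combo in product(constellation, repeat=step)]
        # Candidate matrix S: one column per candidate sub-vector s[lo:n].
        S = [list(row) for row in zip(*(c for _, c in cands))]
        # Rows lo..lo+step-1 of R, restricted to columns lo..n-1.
        block = [R[i][lo:] for i in range(lo, lo + step)]
        P = matmul(block, S)  # step x number_of_candidates
        # Square norm of the residual for the newly covered rows.
        scored = []
        for k, (m, c) in enumerate(cands):
            dist = sum(abs(y[lo + r] - P[r][k]) ** 2 for r in range(step))
            scored.append((m + dist, c))
        # Sorting (reduction) phase: keep only the best candidates.
        scored.sort(key=lambda t: t[0])
        survivors = scored[:keep]
    return survivors[0]

# Noise-free toy example with a 4-point constellation (illustrative values).
qam4 = (1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j)
R = [[2, 1, 0, 1j],
     [0, 3, 1, 0],
     [0, 0, 2, 1],
     [0, 0, 0, 1]]
s_true = (1 + 1j, -1 - 1j, 1 - 1j, -1 + 1j)
y = [sum(R[i][j] * s_true[j] for j in range(4)) for i in range(4)]
metric, s_hat = detect(R, y, qam4, levels=2, keep=4)
```

In this sketch, raising `levels` enlarges the matrix product in each iteration, since every survivor is extended by all symbol combinations across the aggregated levels. On a GPU, that product is the step that would map onto FP32, FP16, or 8-bit integer matrix-multiplication hardware.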

Author information

Authors and Affiliations

Division of Computer, Electrical, and Mathematical Sciences and Engineering, King Abdullah University of Science and Technology, Thuwal, Jeddah, 23955, Saudi Arabia
Adel Dabah, Hatem Ltaief, Slim Alouini & David Keyes
University of California Santa Cruz, 1156 High Street, Santa Cruz, CA, 95064, USA
Zouheir Rezki

Corresponding author

Correspondence to Adel Dabah.

Editor information

Editors and Affiliations

University of Maryland, College Park, MD, USA
Abhinav Bhatele
NVIDIA, Helsinki, Finland
Jeff Hammond
Université Paris-Saclay, Gif-sur-Yvette, France
Marc Baboulin
CERFACS, Toulouse, France
Carola Kruse

About this paper

Cite this paper

Dabah, A., Ltaief, H., Rezki, Z., Alouini, S., Keyes, D. (2023). GPU-Based Low-Precision Detection Approach for Massive MIMO Systems. In: Bhatele, A., Hammond, J., Baboulin, M., Kruse, C. (eds) High Performance Computing. ISC High Performance 2023. Lecture Notes in Computer Science, vol 13948. Springer, Cham. https://doi.org/10.1007/978-3-031-32041-5_8

DOI: https://doi.org/10.1007/978-3-031-32041-5_8
Published: 10 May 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-32040-8
Online ISBN: 978-3-031-32041-5
eBook Packages: Computer Science, Computer Science (R0)