Implicit vs. Explicit Approximate Matrix Inversion for Wideband Massive MU-MIMO Data Detection

Journal of Signal Processing Systems

Abstract

Massive multi-user (MU) MIMO wireless technology promises improved spectral efficiency compared to that of traditional cellular systems. While data-detection algorithms that rely on linear equalization achieve near-optimal error-rate performance for massive MU-MIMO systems, they require the solution to large linear systems at high throughput and low latency, which results in excessively high receiver complexity. In this paper, we investigate a variety of exact and approximate equalization schemes that solve the system of linear equations either explicitly (requiring the computation of a matrix inverse) or implicitly (by directly computing the solution vector). We analyze the associated performance/complexity trade-offs, and we show that for small base-station (BS)-to-user-antenna ratios, exact and implicit data detection using the Cholesky decomposition achieves near-optimal performance at low complexity. In contrast, implicit data detection using approximate equalization methods results in the best trade-off for large BS-to-user-antenna ratios. By combining the advantages of exact, approximate, implicit, and explicit matrix inversion, we develop a new frequency-adaptive equalizer (FADE), which outperforms existing data-detection methods in terms of performance and complexity for wideband massive MU-MIMO systems.

Notes

  1. The methods proposed in this paper can easily be extended to other multi-carrier waveforms that support frequency-domain equalization [13], such as OFDM-based systems.

  2. Our results can be extended to user terminals with multiple antennas.

  3. A probabilistic convergence condition is provided in [6].

  4. We note that the parameter in [9] is \( \bar {\gamma }=({{B}}+{{U}})^{-1}\).

  5. The derivation of probabilistic convergence guarantees turns out to be non-trivial and is part of ongoing work.

  6. This set of base points can either be pre-assigned or varied on the fly depending on the channel conditions for better performance.

  7. While the processing latency is another important design parameter in practice, it typically depends on (i) the data dependencies of the algorithm used, (ii) the hardware architecture (parallel or serial), and (iii) the computing fabric (e.g., GPU, FPGA, or ASIC). Hence, we limit our results to complexity aspects only; a detailed latency analysis would require hardware designs and is left for future work.

  8. More efficient matrix-multiplication algorithms, such as Strassen’s algorithm, which scales with \(U^{2.8074}\), could be used [47]; the irregularity of such algorithms, however, renders efficient hardware designs difficult.

  9. This initialization scheme can also be used by replacing \({\mathbf{D}}^{-1}\) with an arbitrary matrix \({\mathbf{X}}\) that is close to the exact inverse \({\mathbf{A}}^{-1}\).

References

  1. Rusek, F., Persson, D., Lau, B.K., Larsson, E.G., Marzetta, T.L., Edfors, O., Tufvesson, F. (2013). Scaling up MIMO: Opportunities and challenges with very large arrays. IEEE Signal Processing Magazine, 30(1), 40–60.

  2. Marzetta, T.L. (2010). Noncooperative cellular wireless with unlimited numbers of base station antennas. IEEE Transactions on Wireless Communications, 9(11), 3590–3600.

  3. Nam, Y.-H., Ng, B.L., Sayana, K., Li, Y., Zhang, J., Kim, Y., Lee, J. (2013). Full-dimension MIMO (FD-MIMO) for next generation cellular technology. IEEE Communications Magazine, 51(6), 172–179.

  4. Huh, H., Caire, G., Papadopoulos, H.C., Ramprashad, S.A. (2012). Achieving massive MIMO spectral efficiency with a not-so-large number of antennas. IEEE Transactions on Wireless Communications, 11(9), 3226–3239.

  5. Ngo, H.Q., Larsson, E.G., Marzetta, T.L. (2013). Energy and spectral efficiency of very large multiuser MIMO systems. IEEE Trans. on Commun., 61(4), 1436–1449.

  6. Wu, M., Yin, B., Wang, G., Dick, C., Cavallaro, J.R., Studer, C. (2014). Large-scale MIMO detection for 3GPP LTE: Algorithm and FPGA implementation. IEEE J. Sel. Topic of Sig. Proc., 8(5), 916–929.

  7. Prabhu, H., Rodrigues, J., Edfors, O., Rusek, F. (2013). Approximative matrix inverse computations for very-large MIMO and applications to linear pre-coding systems. In Proceedings IEEE WCNC (pp. 2710–2715).

  8. Dai, L., Gao, X., Su, X., Han, S., Chih-Lin, I., Wang, Z. (2015). Low-complexity soft-output signal detection based on Gauss–Seidel method for uplink multiuser large-scale MIMO systems. IEEE Trans. on Vehicular Techn., 64(10), 4839–4845.

  9. Lu, Z., Ning, J., Zhang, Y., Xie, T., Shen, W. (2015). Richardson method based linear precoding with low complexity for massive MIMO systems. In Proceedings IEEE VTC.

  10. Yin, B., Wu, M., Cavallaro, J.R., Studer, C. (2014). Conjugate gradient-based soft-output detection and precoding in massive MIMO systems. In Proceedings IEEE GLOBECOM (pp. 3696–3701).

  11. Prabhu, H., Edfors, O., Rodrigues, J., Liu, L., Rusek, F. (2014). Hardware efficient approximative matrix inversion for linear pre-coding in massive MIMO. In IEEE International Symposium on Circuits and Systems (ISCAS) (pp. 1700–1703).

  12. Yin, B., Wu, M., Wang, G., Dick, C., Cavallaro, J.R., Studer, C. (2014). A 3.8 Gb/s large-scale MIMO detector for 3GPP LTE-advanced. In Proceedings IEEE ICASSP.

  13. Tunali, N.E., Wu, M., Dick, C., Studer, C. (2015). Linear large-scale MIMO data detection for 5G multi-carrier waveform candidates. In 49th Asilomar Conference on Signals, Systems and Computers.

  14. Burg, A., Borgmann, M., Wenk, M., Zellweger, M., Fichtner, W., Bölcskei, H. (2005). VLSI implementation of MIMO detection using the sphere decoding algorithm. IEEE Journal of Solid-State Circuits, 40(7), 1566–1577.

  15. Studer, C., Burg, A., Bölcskei, H. (2008). Soft-output sphere decoding: algorithms and VLSI implementation. IEEE J. Sel. Areas Commun., 26(2), 290–300.

  16. Wong, K., Tsui, C., Cheng, R., Mow, W. (2002). A VLSI architecture of a K-best lattice decoding algorithm for MIMO channels. In Proc IEEE ISCAS, (Vol. 3 pp. 273–276).

  17. Yang, S., & Hanzo, L. (2015). Fifty years of MIMO detection: The road to large-scale MIMOs. IEEE Commun. Surveys & Tutorials, 17(4), 1941–1988.

  18. Wu, M., Dick, C., Cavallaro, J.R., Studer, C. (2014). Iterative detection and decoding in 3GPP LTE-based massive MIMO systems. In Proc EUSIPCO (pp. 96–100).

  19. Cirkic, M., & Larsson, E.G. (2014). SUMIS: Near-optimal soft-in soft-out MIMO detection with low and fixed complexity. IEEE Trans. on Sig. Proc., 62(12), 3084–3097.

  20. Datta, T., Ashok Kumar, N., Chockalingam, A., Sundar Rajan, B. (2012). A novel MCMC algorithm for near-optimal detection in large-scale uplink multiuser MIMO systems. In Proc. IEEE ITA (pp. 69–77).

  21. Narasimhan, T.L., & Chockalingam, A. (2014). Channel hardening-exploiting message passing (CHEMP) receiver in large-scale MIMO systems. IEEE J. Sel. Topics of Sig. Proc., 8(5), 847–860.

  22. Jeon, C., Ghods, R., Maleki, A., Studer, C. (2015). Optimality of large MIMO detection via approximate message passing. In Proceedings IEEE ISIT (pp. 1227–1231).

  23. Ketonen, J., Karjalainen, J., Juntti, M., Hänninen, T. (2011). MIMO Detection in single carrier systems. In Proc EUSIPCO (pp. 654–658).

  24. Berardinelli, G., Manchon, C., Deneire, L., Sorensen, T., Mogensen, P., Pajukoski, K. (2009). Turbo receivers for single user MIMO LTE-A uplink. In Proc IEEE VTC (pp. 1–5).

  25. Okuyama, S., Takeda, K., Adachi, F. (2011). Iterative MMSE detection and interference cancellation for uplink SC-FDMA MIMO using HARQ. In Proc IEEE ICC (pp. 1–5).

  26. Studer, C., Fateh, S., Seethaler, D. (2011). ASIC Implementation of soft-input soft-output MIMO detection using MMSE parallel interference cancellation. IEEE Journal of Solid-State Circuits, 46(7), 1754–1765.

  27. Burg, A., Haene, S., Perels, D., Luethi, P., Felber, N., Fichtner, W. (2006). Algorithm and VLSI architecture for linear MMSE detection in MIMO-OFDM systems. In Proc IEEE ISCAS (pp. 4102–4105).

  28. Luethi, P., Burg, A., Haene, S., Perels, D., Felber, N., Fichtner, W. (2007). VLSI Implementation of a high-speed iterative sorted MMSE QR decomposition. In Proceedings IEEE ISCAS (pp. 1421–1424).

  29. Karkooti, M., Cavallaro, J.R., Dick, C. (2005). FPGA Implementation of matrix inversion using QRD-RLS algorithm. In Proceedings 44th Asilomar Conference on Signals, Systems and Computers (pp. 1625–1629).

  30. Bellis, S., Marnane, W., Fish, P. (1997). Alternative systolic array for non-square-root Cholesky decomposition. In IEEE Proceedings on Computers and Digital Techniques, (Vol. 144 pp. 57–64).

  31. Maslennikow, O., Lepekha, V., Sergiyenko, A., Tomas, A., Wyrzykowski, R. (2007). Parallel implementation of Cholesky \(LL^{T}\)-algorithm in FPGA-based processor. In Parallel processing and applied mathematics. Springer (pp. 137–147).

  32. Yang, D., Peterson, G.D., Li, H., Sun, J. (2009). An FPGA implementation for solving least square problem. In 17th IEEE Symposium on Field Programmable Custom Computing Machines (pp. 303–306).

  33. Stewart, G. (1998). Matrix Algorithms. Basic decompositions.

  34. Schulz, G. (1933). Iterative Berechnung der reziproken Matrix. Zeitschrift für Angewandte Mathematik und Mechanik, 13(1), 57–59.

  35. Burg, A. (2006). VLSI Circuits for MIMO communication systems. ETH Zürich: Ph.D. dissertation.

  36. Soleymani, F. (2012). A rapid numerical algorithm to compute matrix inversion. International Journal of Mathematics and Mathematical Sciences.

  37. Altman, M. (1960). An optimum cubically convergent iterative method of inverting a linear bounded operator in Hilbert space. Pacific Journal of Mathematics, 10(4), 1107–1113.

  38. Wu, M., Dick, C., Cavallaro, J.R., Studer, C. (2016). High-throughput data detection for massive MU-MIMO-OFDM using coordinate descent. IEEE Transactions on Circuits and Systems I: Regular Papers, 63(12), 2357–2367.

  39. Ben-Israel, A., & Cohen, D. (1966). On iterative computation of generalized inverses and associated projections. SIAM Journal on Numerical Analysis, 3(3), 410–419.

  40. Tse, D., & Viswanath, P. (2005). Fundamentals of wireless communication. Cambridge: Cambridge University Press.

  41. Li, Y., Cimini, Jr L.J., Sollenberger, N.R. (1998). Robust channel estimation for OFDM systems with rapid dispersive fading channels. IEEE Trans. on Commun., 46(7), 902–915.

  42. Sesia, S., Toufik, I., Baker, M. (2009). LTE, The UMTS Long Term Evolution: From Theory to Practice. New York: Wiley Publishing.

  43. Borgmann, M., & Bölcskei, H. (2004). Interpolation-based efficient matrix inversion for MIMO-OFDM receivers. In Proceedings 38th Asilomar Conference on Signals, Systems and Computers, (Vol. 2 pp. 1941–1947).

  44. Cescato, D., & Bölcskei, H. (2010). QR Decomposition of Laurent polynomial matrices sampled on the unit circle. IEEE Trans. on Info. Theory, 56(9), 4754–4761.

  45. Jeon, C., Li, Z., Studer, C. (2017). Approximate Gram-matrix interpolation for wideband massive MU-MIMO systems. arXiv:1610.00227.

  46. Farhang, A., Marchetti, N., Doyle, L.E., Farhang-Boroujeny, B. (2014). Filter bank multicarrier for massive MIMO. In Proceedings on 80th IEEE Vehicular Technology Conference (pp. 1–7).

  47. Strassen, V. (1969). Gaussian elimination is not optimal. Numerische Mathematik, 13(4), 354–356.

  48. 3rd Generation Partnership Project. (2013). Technical Specification Group Radio Access Network; Evolved Universal Terrestrial Radio Access (E-UTRA); Physical Layer Procedures (Release 10). 3GPP Organizational Partners TS 36.213 version 10.10.0.

  49. Hentilä, L., Kyösti, P., Käske, M., Narandzic, M., Alatossava, M. (2007). Matlab implementation of the WINNER phase II channel model ver 1.1. [Online]. Available: https://www.ist-winner.org/phase_2_model.html.

  50. Studer, C., & Bölcskei, H. (2010). Soft–input soft–output single tree-search sphere decoding. IEEE Transactions on Information Theory, 56(10), 4827–4842.

  51. Li, K., Sharan, R., Chen, Y., Goldstein, T., Cavallaro, J.R., Studer, C. (2017). Decentralized baseband processing for massive MU-MIMO systems. IEEE J. Emerging and Sel. Topics in Circuits and Systems (JETCAS), to appear.

  52. Li, K., Yin, B., Wu, M., Cavallaro, J.R., Studer, C. (2015). Accelerating massive MIMO uplink detection on GPU for SDR systems. In IEEE Dallas Circuits and Systems Conference (DCAS) (pp. 1–4).

Acknowledgements

The work of MW, BY, KL and JRC was supported by Xilinx Inc. and by the US National Science Foundation (NSF) under grants CNS-1717218, ECCS-1408370, CNS-1265332, and ECCS-1232274. The work of CS was supported by Xilinx Inc. and by the US NSF under grants ECCS-1408006, CCF-1535897, CAREER CCF-1652065, and CNS-1717559.

Author information

Corresponding author

Correspondence to Michael Wu.

Appendix A: Proofs and Derivations

A.1 Proof of Lemma 1

We use [33, Thm. 4.20], which establishes that for a given matrix \({\mathbf {P}}\in \mathbb {C}^{{{U}}\times {{U}}}\) for which \(\lim _{k\to \infty }{\mathbf {P}}^{k}={\mathbf {0}}_{{{U}}\times {{U}}}\), the matrix \({\mathbf {I}}_{{{U}}}-{\mathbf {P}}\) is invertible and \(({\mathbf {I}}_{{{U}}}-{\mathbf {P}})^{-1}=\sum\limits_{k = 0}^{\infty } {\mathbf {P}}^{k}\). As a consequence, by defining \(\widetilde {{\mathbf {A}}}_{0}^{-1}{\mathbf {A}}={\mathbf {I}}_{{{U}}}-{\mathbf {P}}\) and assuming that \(\lim _{k\to \infty } ({\mathbf {I}}_{{{U}}}-\widetilde {{\mathbf {A}}}^{-1}_{0}{\mathbf {A}})^{k}={\mathbf {0}}_{{{U}}\times {{U}}}\), we have

$$ {\mathbf{A}}^{-1}\widetilde{{\mathbf{A}}}_{0} = (\widetilde{{\mathbf{A}}}_{0}^{-1}{\mathbf{A}})^{-1} =\sum\limits_{k = 0}^{\infty} ({\mathbf{I}}_{{{U}}}-\widetilde{{\mathbf{A}}}_{0}^{-1}{\mathbf{A}})^{k}, $$

which can be rewritten as the accelerated Neumann series in Eq. 4 by right-multiplying both sides by \(\widetilde {{\mathbf {A}}}_{0}^{-1}\), since \(\widetilde {{\mathbf {A}}}_{0}^{-1}\) was assumed to be full rank.
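As an illustration of Lemma 1, the following minimal Python sketch sums the first \(K+1\) terms of the series \(\sum_{k}({\mathbf{I}}_{U}-\widetilde{{\mathbf{A}}}_{0}^{-1}{\mathbf{A}})^{k}\widetilde{{\mathbf{A}}}_{0}^{-1}\) and measures how far the product of the result with \({\mathbf{A}}\) is from the identity. The 3×3 Hermitian test matrix, the choice of \(\widetilde{{\mathbf{A}}}_{0}^{-1}\) as the inverse of the diagonal part of \({\mathbf{A}}\), and the helper names `matmul`, `neumann_inverse`, and `deviation_from_identity` are assumptions made only for this example; they are not taken from the paper.

```python
# Illustrative sketch only: the test matrix and helper names are assumptions.

def matmul(X, Y):
    """Multiply two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def neumann_inverse(A, A0_inv, K):
    """Return sum_{k=0}^{K} (I - A0_inv A)^k A0_inv as a list of rows."""
    n = len(A)
    M = matmul(A0_inv, A)
    P = [[(1.0 if i == j else 0.0) - M[i][j] for j in range(n)]
         for i in range(n)]
    term = [row[:] for row in A0_inv]   # k = 0 term: P^0 A0_inv
    total = [row[:] for row in term]
    for _ in range(K):
        term = matmul(P, term)          # next term: P^k A0_inv
        total = [[total[i][j] + term[i][j] for j in range(n)]
                 for i in range(n)]
    return total


def deviation_from_identity(X):
    """Largest entry-wise magnitude of X - I."""
    n = len(X)
    return max(abs(X[i][j] - (1.0 if i == j else 0.0))
               for i in range(n) for j in range(n))


# Hypothetical Hermitian, diagonally dominant matrix standing in for A.
A = [[4.0, 1 + 1j, 0.5],
     [1 - 1j, 5.0, -1j],
     [0.5, 1j, 3.0]]
n = len(A)

# Initial guess: inverse of the diagonal part of A.
D_inv = [[1.0 / A[i][i] if i == j else 0.0 for j in range(n)]
         for i in range(n)]

for K in (0, 2, 5, 20):
    approx = neumann_inverse(A, D_inv, K)
    print(K, deviation_from_identity(matmul(approx, A)))
```

For this diagonally dominant example, every row of \({\mathbf{I}}_{U}-\widetilde{{\mathbf{A}}}_{0}^{-1}{\mathbf{A}}\) has an absolute row sum of at most 0.5, so its powers vanish and the printed deviation shrinks toward zero as \(K\) grows.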

A.2 Proof of Lemma 4

We start by rewriting the residual error term as a function of \(\widetilde{{\mathbf {A}}}_{0}^{-1}\). We have the following identities:

$$\begin{array}{@{}rcl@{}} \mathbf{e}_{K_{\max}} &=& \tilde{{\mathbf{y}}}_{K_{\max}} - {\mathbf{A}}^{-1}{\mathbf{y}}^{\text{MF}} = (\widetilde{{\mathbf{A}}}^{-1}_{K_{\max}}- {\mathbf{A}}^{-1})\mathbf{y}^{\text{MF}}\\ &=& \left( -\sum\limits_{k=K_{\max}+ 1}^{\infty}({\mathbf{I}}_{{{U}}}-\widetilde{{\mathbf{A}}}^{-1}_{0}{\mathbf{A}})^{k}\widetilde{{\mathbf{A}}}^{-1}_{0}\right)\!{\mathbf{y}}^{\text{MF}}\\ &=& -({\mathbf{I}}_{{{U}}}-\widetilde{{\mathbf{A}}}^{-1}_{0}{\mathbf{A}})^{K_{\max}+ 1}{\mathbf{A}}^{-1}{\mathbf{y}}^{\text{MF}}. \end{array} $$

By using basic properties of induced norms, we get the following inequality:

$$\|\mathbf{e}_{K_{\max}}\| \leq \|{\mathbf{I}}_{{{U}}}-\widetilde{{\mathbf{A}}}_{0}^{-1}{\mathbf{A}}\|^{K_{\max}+ 1}\|\tilde{{\mathbf{y}}}\|, $$

where we define \(\tilde {{\mathbf {y}}} = {\mathbf {A}}^{-1}{\mathbf {y}}^{\text {MF}}\).
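The bound can be checked numerically with a short Python sketch. It evaluates the truncated series implicitly, that is, it applies the series to \({\mathbf{y}}^{\text{MF}}\) through the recursion \({\mathbf{x}} \leftarrow {\mathbf{x}} + \widetilde{{\mathbf{A}}}_{0}^{-1}({\mathbf{y}}^{\text{MF}} - {\mathbf{A}}{\mathbf{x}})\) instead of forming a matrix inverse. The test matrix, the test vector `x_exact`, the choice \(\widetilde{{\mathbf{A}}}_{0}^{-1}={\mathbf{D}}^{-1}\) with \({\mathbf{D}}\) the diagonal part of \({\mathbf{A}}\), and the use of the infinity norm as the induced norm are all assumptions made for illustration.

```python
# Illustrative sketch only: matrix, vector, and helper names are assumptions.

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def inf_norm_vec(v):
    """Vector max-norm."""
    return max(abs(x) for x in v)


def inf_norm_mat(M):
    """Induced infinity-norm: largest absolute row sum."""
    return max(sum(abs(m) for m in row) for row in M)


A = [[4.0, 1 + 1j, 0.5],
     [1 - 1j, 5.0, -1j],
     [0.5, 1j, 3.0]]
n = len(A)
d_inv = [1.0 / A[i][i] for i in range(n)]   # diagonal of A0_inv = D^{-1}

# P = I - A0_inv A
P = [[(1.0 if i == j else 0.0) - d_inv[i] * A[i][j] for j in range(n)]
     for i in range(n)]

# Pick the exact solution first so that A^{-1} y_mf is known without a solver.
x_exact = [1 + 1j, -1.0, 0.5j]
y_mf = matvec(A, x_exact)


def truncated_neumann_solve(K):
    """Apply sum_{k=0}^{K} P^k A0_inv to y_mf without forming an inverse."""
    x = [d_inv[i] * y_mf[i] for i in range(n)]          # k = 0 term
    for _ in range(K):
        Ax = matvec(A, x)
        x = [x[i] + d_inv[i] * (y_mf[i] - Ax[i]) for i in range(n)]
    return x


def error_and_bound(K):
    """Return (||e_K||, ||P||^(K+1) * ||A^{-1} y_mf||) in the infinity norm."""
    x = truncated_neumann_solve(K)
    err = inf_norm_vec([a - b for a, b in zip(x, x_exact)])
    bound = inf_norm_mat(P) ** (K + 1) * inf_norm_vec(x_exact)
    return err, bound


for K in range(0, 11, 2):
    print(K, *error_and_bound(K))
```

Each recursion step multiplies the current error by \({\mathbf{I}}_{U}-\widetilde{{\mathbf{A}}}_{0}^{-1}{\mathbf{A}}\), which is why the error after \(K_{\max}\) steps carries the power \(K_{\max}+1\) that appears in the bound.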

A.3 Derivation of Initialization 1

We start by noting that squaring both sides in Eq. 5 results in the equivalent sufficient condition \(\|{\mathbf {I}}_{{{U}}}-\widetilde {{\mathbf {A}}}^{-1}_{0}{\mathbf {A}}\|^{2}<1\). Furthermore, by assuming the spectral norm (which is a consistent norm), we have \(\|{\mathbf {I}}_{{{U}}}-\widetilde {{\mathbf {A}}}^{-1}_{0}{\mathbf {A}}\|^{2}\leq \|{\mathbf {I}}_{{{U}}}-\widetilde {{\mathbf {A}}}^{-1}_{0}{\mathbf {A}}\|^{2}_{F}\), which enables us to obtain a more restrictive sufficient condition that allows the design of efficient initializers:

$$ \|{\mathbf{I}}_{{{U}}}-\widetilde{{\mathbf{A}}}_{0}^{-1}{\mathbf{A}}\|^{2}_{F}<1. $$
(17)

The initialization method developed next (see Note 9) is of the form \(\widetilde {{\mathbf {A}}}^{-1}_{0} = {\mathbf {S}}{\mathbf {D}}^{-1}\), where \({\mathbf{S}}\) is a diagonal scaling matrix that is designed to meet condition (17).

Let \({\mathbf{W}}\) contain the diagonal part of \({\mathbf{D}}^{-1}{\mathbf{A}}\) and \({\mathbf{Q}}\) the off-diagonal part. We define

$$\begin{array}{@{}rcl@{}} f &=& \|{\mathbf{I}}_{{{U}}}-{\mathbf{S}}{\mathbf{D}}^{-1}{\mathbf{A}}\|_{F}^{2} = \|{\mathbf{I}}_{{{U}}}-{\mathbf{S}}({\mathbf{W}}+{\mathbf{Q}})\|_{F}^{2} \\ & =& \|{\mathbf{I}}_{{{U}}}-{\mathbf{S}}{\mathbf{W}}\|^{2}_{F}+\|{\mathbf{S}}{\mathbf{Q}}\|_{F}^{2}, \end{array} $$
(18)

and seek a diagonal scaling matrix \({\mathbf{S}}\) that minimizes \(f\).

We define the diagonal scaling matrix to have the form \({\mathbf{S}} = \alpha{\mathbf{I}}_{U}\), which leads to \(f = {\sum }_{i = 1}^{{{U}}}\left |1-\alpha \,W_{i,i}\right |^{2}+|\alpha |^{2}\|{\mathbf {Q}}\|_{F}^{2}\). We now find the optimum scaling parameter \(\alpha^{\text{opt}}\) by setting \(\partial f/\partial \alpha = 0\) and solving for \(\alpha\). Standard manipulations yield

$$\alpha^{\text{opt}}= \left( \sum\limits_{i = 1}^{{{U}}}{W_{i,i}^{*}}\right)\|{\mathbf{D}}^{-1}{\mathbf{A}}\|^{-2}_{F}, $$

where \(W_{i,i}^{*}\) are the complex conjugates of the diagonal entries of \({\mathbf{W}}\). Since \({\mathbf{W}} = {\mathbf{I}}_{U}\), we get \({\alpha ^{\text {opt}}= {{U}}\|{\mathbf {D}}^{-1}{\mathbf {A}}\|^{-2}_{F}}\). Consequently, the first initialization matrix is \(\widetilde {{\mathbf {A}}}_{0}^{-1} = \alpha ^{\text {opt}}{\mathbf {D}}^{-1}\).
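A small Python sketch of this first initializer follows. It assumes that \({\mathbf{D}}\) is the diagonal part of \({\mathbf{A}}\) (consistent with \({\mathbf{W}}={\mathbf{I}}_{U}\) above); the test matrix and the helper names `scaled_by_inverse_diagonal`, `frobenius_sq`, and `cost` are hypothetical and serve only to illustrate the formula for \(\alpha^{\text{opt}}\).

```python
# Illustrative sketch only: the test matrix and helper names are assumptions.

def scaled_by_inverse_diagonal(A):
    """Return D^{-1} A, where D is taken to be the diagonal part of A."""
    return [[a / row[i] for a in row] for i, row in enumerate(A)]


def frobenius_sq(M):
    """Squared Frobenius norm."""
    return sum(abs(m) ** 2 for row in M for m in row)


def cost(alpha, B):
    """f = ||I - alpha * B||_F^2 with B = D^{-1} A."""
    n = len(B)
    return sum(abs((1.0 if i == j else 0.0) - alpha * B[i][j]) ** 2
               for i in range(n) for j in range(n))


# Hypothetical Hermitian matrix standing in for A.
A = [[4.0, 1 + 1j, 0.5],
     [1 - 1j, 5.0, -1j],
     [0.5, 1j, 3.0]]

B = scaled_by_inverse_diagonal(A)
U = len(B)

# alpha_opt = U / ||D^{-1} A||_F^2
alpha_opt = U / frobenius_sq(B)

# Diagonal entries of the initializer A0_inv = alpha_opt * D^{-1}.
A0_inv_diag = [alpha_opt / A[i][i] for i in range(U)]

print("alpha_opt:", alpha_opt)
print("f at alpha_opt:", cost(alpha_opt, B))
print("f at alpha = 1 (plain D^{-1}):", cost(1.0, B))
```

Comparing the two printed values of \(f\) shows how much the scalar scaling tightens condition (17) relative to using \({\mathbf{D}}^{-1}\) alone for this example.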

A.4 Derivation of Initialization 2

For the second initialization method, we follow the derivation in Appendix A.3 but with a more general diagonal scaling matrix of the form \({\mathbf{S}} = \text{diag}(\alpha_{1},\ldots,\alpha_{U})\). We obtain

$$f = \sum\limits_{i = 1}^{{{U}}} f_{i}, \qquad f_{i} = \left|1-\alpha_{i}\,W_{i,i}\right|^{2}+|\alpha_{i}|^{2}\|{\mathbf{r}}_{i}\|_{2}^{2}, $$

where \({\mathbf{r}}_{i}\) corresponds to the \(i\)th row of \({\mathbf{Q}}\). To find the optimal scaling parameters \(\alpha_{i}\), \(i = 1,\ldots,U\), we set \({\partial f_{i}}/{\partial \alpha _{i}^{*}} = 0\) and solve for \(\alpha_{i}\). Standard manipulations yield

$$\alpha_{i}^{\text{opt}}=W_{i,i}^{*}/(|W_{i,i}|^{2}+\|{\mathbf{r}}_{i}\|^{2}_{2}), \,\, i = 1,\ldots,{{U}}, $$

which simplifies to \(\alpha_{i}^{\text{opt}} = 1/(1+\|{\mathbf{r}}_{i}\|_{2}^{2})\) since \(W_{i,i} = 1\) for all \(i\).

Consequently, the second initialization matrix is

$$\widetilde{{\mathbf{A}}}_{0}^{-1} = \text{diag}(\alpha_{1}^{\text{opt}},\ldots,\alpha_{{{U}}}^{\text{opt}}){\mathbf{D}}^{-1}. $$
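The per-row scaling can be compared with the uniform scaling of Appendix A.3 in a short Python sketch. As before, it assumes that \({\mathbf{D}}\) is the diagonal part of \({\mathbf{A}}\); the test matrix and the names `cost`, `row_off_sq`, `alphas`, and `alpha_uniform` are hypothetical and used only for illustration.

```python
# Illustrative sketch only: the test matrix and helper names are assumptions.

def cost(scales, B):
    """f = ||I - diag(scales) * B||_F^2 with B = D^{-1} A."""
    n = len(B)
    return sum(abs((1.0 if i == j else 0.0) - scales[i] * B[i][j]) ** 2
               for i in range(n) for j in range(n))


# Hypothetical Hermitian matrix standing in for A.
A = [[4.0, 1 + 1j, 0.5],
     [1 - 1j, 5.0, -1j],
     [0.5, 1j, 3.0]]

# B = D^{-1} A, with D taken to be the diagonal part of A.
B = [[a / row[i] for a in row] for i, row in enumerate(A)]
U = len(B)

# ||r_i||_2^2: squared norm of the off-diagonal part of row i of B.
row_off_sq = [sum(abs(B[i][j]) ** 2 for j in range(U) if j != i)
              for i in range(U)]

# Initialization 2: alpha_i = W_ii^* / (|W_ii|^2 + ||r_i||^2), with W_ii = B[i][i].
alphas = [B[i][i].conjugate() / (abs(B[i][i]) ** 2 + row_off_sq[i])
          for i in range(U)]

# Initialization 1 for comparison: one scalar alpha = U / ||B||_F^2.
alpha_uniform = U / sum(abs(b) ** 2 for row in B for b in row)

f_per_row = cost(alphas, B)
f_uniform = cost([alpha_uniform] * U, B)

# Diagonal entries of A0_inv = diag(alpha_1, ..., alpha_U) * D^{-1}.
A0_inv_diag = [alphas[i] / A[i][i] for i in range(U)]

print("per-row alphas:", alphas)
print("f (Initialization 2):", f_per_row)
print("f (Initialization 1):", f_uniform)
```

Because the uniform scaling is the special case \(\alpha_{1}=\cdots=\alpha_{U}\), the per-row scaling can never yield a larger value of \(f\); the cost of this extra freedom is one row norm per user.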

Cite this article

Wu, M., Yin, B., Li, K. et al. Implicit vs. Explicit Approximate Matrix Inversion for Wideband Massive MU-MIMO Data Detection. J Sign Process Syst 90, 1311–1328 (2018). https://doi.org/10.1007/s11265-017-1313-z
