Deep Unfolded Extended Conjugate Gradient Method for Massive MIMO Processing with Application to Reciprocity Calibration


Abstract

In this paper, we deep-unfold the standard iterative conjugate gradient (CG) algorithm for solving a linear system of equations. Instead of being adjusted by known rules, the parameters are learned via backpropagation to yield optimal results. The proposed unfolded CG (UCG) is further extended by substituting a matrix parameter for a scalar one, which increases the degrees of freedom per layer. Once training is complete, the UCG is shown to require far fewer layers than the number of iterations needed by the standard iterative CG. It is also shown to be very robust to noise and to outperform the standard CG in the low signal-to-noise ratio (SNR) region. A key merit of the proposed approach is that no explicit training data is dedicated to the learning phase: the optimization relies on the residual error, which is not explicitly expressed as a function of the desired data. As an example, the proposed UCG is applied to the reciprocity calibration problem encountered in massive MIMO (multiple-input multiple-output) systems.
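For context, the following is a minimal NumPy sketch of the standard iterative CG baseline that the paper unfolds; it solves \( Ax = b \) for a symmetric positive-definite \( A \). The function name, iteration budget, and tolerance are illustrative, not taken from the paper.

```python
import numpy as np

def conjugate_gradient(A, b, n_iter=100, tol=1e-10):
    """Standard iterative CG for A x = b with A symmetric positive definite."""
    x = np.zeros_like(b)
    r = b - A @ x                 # initial residual
    p = r.copy()                  # initial search direction
    rs_old = r @ r
    for _ in range(n_iter):
        Ap = A @ p
        alpha = rs_old / (p @ Ap)        # rule-based scalar step size
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs_old) * p    # rule-based direction update
        rs_old = rs_new
    return x
```

In exact arithmetic, CG converges in at most as many iterations as the matrix dimension; the UCG replaces the rule-based scalars above with learned parameters (including a learned matrix per layer) so that far fewer layers suffice.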



Acknowledgments

This work was funded by the Natural Sciences and Engineering Research Council of Canada, Prompt, and NUTAQ Innovation, with equipment provided by CMC Microsystems. The authors also thank the Chaire de recherche sur les signaux et l'intelligence des systèmes haute performance for technical support.

Author information

Corresponding author

Correspondence to Daniel Massicotte.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix 1

The reader should refer to the dependence tree shown in Fig. 1 and to Eqs. (19), (20), and (21) to follow the derivations below. The explicit forms of the gradient of the loss function \( e \) with respect to \( \alpha_k \) and \( \lambda_k \) are

$$ \overline{\frac{\partial e}{\partial {\alpha}_k}}=2{r}_{n+1}^T\overline{\frac{\partial {r}_{n+1}}{\partial {\alpha}_k}} $$
(A.1)
$$ \overline{\frac{\partial e}{\partial {\lambda}_k}}=2\sum \limits_{z=1}^{2M}{r}_{n+1}^{(z)}\overline{\frac{\partial {r}_{n+1}^{(z)}}{\partial {\lambda}_k}} $$
(A.2)

where \( \overline{\frac{\partial \left(\cdot \right)}{\partial \left(\cdot \right)}} \) denotes the total derivative and \( \overline{\frac{\partial {r}_{n+1}^{(z)}}{\partial {\lambda}_k}} \) is the total derivative of the zth element of \( r_{n+1} \) with respect to \( \lambda_k \), where \( k \in [1:n] \) in Eq. (A.1) and \( k \in [1:n-1] \) in Eq. (A.2). The formulas for the total derivatives of \( r_{n+1} \) with respect to \( \alpha_k \) and \( \lambda_k \) use

$$ \overline{\frac{\partial {r}_{n+1}}{\partial {\alpha}_n}}=\frac{\partial {r}_{n+1}}{\partial {\alpha}_n} $$
(A.3)

where the total derivative of \( r_{n+1} \) with respect to \( \alpha_n \) corresponds to the only possible path in Fig. 1 from \( r_{n+1} \) to \( \alpha_n \). Before going further, note that the allowed paths are those that follow the direction of the arrows in Fig. 1. Therefore, we have

$$ \overline{\frac{\partial {r}_{n+1}}{\partial {\alpha}_k}}=\overline{\frac{\partial {r}_{n+1}}{\partial {r}_{k+1}}}\frac{\partial {r}_{k+1}}{\partial {\alpha}_k} $$
(A.4)

where the chain rule is applied between the total derivative of \( r_{n+1} \) with respect to \( r_{k+1} \), which accounts for every possible path in Fig. 1 between \( r_{n+1} \) and \( r_{k+1} \), and the unique direct-path derivative between \( r_{k+1} \) and \( \alpha_k \), with \( k \in [1:n-1] \). In a similar way, we derive

$$ \overline{\frac{\partial {r}_{n+1}^{(z)}}{\partial {\lambda}_k}}\left(i,j\right)=\overline{\frac{\partial {r}_{n+1}}{\partial {p}_{k+1}}}\left(i,z\right)\frac{\partial {p}_{k+1}}{\partial {\lambda}_k}\left(i,j\right) $$
(A.5)

where the two indices in parentheses next to each term denote the row and column of the resulting matrix, with \( i, j, z \in [1:2M] \) and \( k \in [1:n-1] \). The left-hand side of Eq. (A.5) shows that there is a distinct derivative of each element of the vector \( r_{n+1} \) with respect to each \( (i,j) \) element of the matrix \( \lambda_k \). The form of Eq. (A.5) differs slightly from that of Eq. (A.4) because \( \lambda_k \) is a matrix rather than a scalar. We then continue with the total derivative of \( r_{n+1} \) with respect to \( p_n \)

$$ \overline{\frac{\partial {r}_{n+1}}{\partial {p}_n}}=\frac{\partial {r}_{n+1}}{\partial {p}_n} $$
(A.6)

which is simply the direct path in Fig. 1 between \( r_{n+1} \) and \( p_n \). The general form of the derivative of \( r_{n+1} \) with respect to \( p_k \), with \( k \in [2:n-1] \), is slightly more involved

$$ \overline{\frac{\partial {r}_{n+1}}{\partial {p}_k}}=\overline{\frac{\partial {r}_{n+1}}{\partial {p}_{k+1}}}\left(\frac{\partial {p}_{k+1}}{\partial {p}_k}+\frac{\partial {p}_{k+1}}{\partial {s}_k}\frac{\partial {s}_k}{\partial {r}_{k+1}}\frac{\partial {r}_{k+1}}{\partial {p}_k}\right)+\overline{\frac{\partial {r}_{n+1}}{\partial {r}_{k+1}}}\frac{\partial {r}_{k+1}}{\partial {p}_k} $$
(A.7)

where the chain rule is first applied between the total derivative of \( r_{n+1} \) with respect to \( p_{k+1} \), which accounts for every possible path in Fig. 1 between these two nodes, and the direct-path derivative between \( p_{k+1} \) and \( p_k \). Similarly, the chain rule is applied between the total derivative of \( r_{n+1} \) with respect to \( p_{k+1} \) and the direct-path derivatives going through \( p_{k+1} \), \( s_k \), \( r_{k+1} \), and \( p_k \), in that order. Finally, the same process is applied between the total derivative of \( r_{n+1} \) with respect to \( r_{k+1} \) and the unique direct-path derivative between \( r_{k+1} \) and \( p_k \). At this point, the only unknown we are left with is the total derivative of \( r_{n+1} \) with respect to any other \( r \). We begin with

$$ \overline{\frac{\partial {r}_{n+1}}{\partial {r}_n}}=\frac{\partial {r}_{n+1}}{\partial {r}_n}+\frac{\partial {r}_{n+1}}{\partial {p}_n}\frac{\partial {p}_n}{\partial {s}_{n-1}}\frac{\partial {s}_{n-1}}{\partial {r}_n} $$
(A.8)

where the first possible path is the direct one between \( r_{n+1} \) and \( r_n \), and the second goes through \( r_{n+1} \), \( p_n \), \( s_{n-1} \), and \( r_n \), in that order. We then continue with

$$ {\displaystyle \begin{array}{c}\overline{\frac{\partial {r}_{n+1}}{\partial {r}_{n-1}}}=\overline{\frac{\partial {r}_{n+1}}{\partial {r}_n}}\left(\frac{\partial {r}_n}{\partial {r}_{n-1}}+\frac{\partial {r}_n}{\partial {p}_{n-1}}\frac{\partial {p}_{n-1}}{\partial {s}_{n-2}}\frac{\partial {s}_{n-2}}{\partial {r}_{n-1}}\right)\\ {}\kern2.36em +\frac{\partial {r}_{n+1}}{\partial {p}_n}\frac{\partial {p}_n}{\partial {p}_{n-1}}\frac{\partial {p}_{n-1}}{\partial {s}_{n-2}}\frac{\partial {s}_{n-2}}{\partial {r}_{n-1}}\end{array}} $$
(A.9)

where the three possible paths between \( r_{n+1} \) and \( r_{n-1} \) are considered. Finally, we have the general case for \( k \in [2:n-2] \)

$$ {\displaystyle \begin{array}{c}\overline{\frac{\partial {r}_{n+1}}{\partial {r}_k}}=\overline{\frac{\partial {r}_{n+1}}{\partial {r}_{k+1}}}\left(\frac{\partial {r}_{k+1}}{\partial {r}_k}+\frac{\partial {r}_{k+1}}{\partial {p}_k}\frac{\partial {p}_k}{\partial {s}_{k-1}}\frac{\partial {s}_{k-1}}{\partial {r}_k}\right)\\ {}\kern2.24em +\frac{\partial {r}_{n+1}}{\partial {p}_n}\left(\prod \limits_{i=n}^{k+1}\frac{\partial {p}_i}{\partial {p}_{i-1}}\right)\frac{\partial {p}_k}{\partial {s}_{k-1}}\frac{\partial {s}_{k-1}}{\partial {r}_k}\\ {}\kern2.24em +\sum \limits_{j=1}^{n-k-1}\left(\overline{\frac{\partial {r}_{n+1}}{\partial {r}_{n-j+1}}}\frac{\partial {r}_{n-j+1}}{\partial {p}_{n-j}}\prod \limits_{z=n-j}^{k+1}\left(\frac{\partial {p}_z}{\partial {p}_{z-1}}\right)\right)\frac{\partial {p}_k}{\partial {s}_{k-1}}\frac{\partial {s}_{k-1}}{\partial {r}_k}\end{array}} $$
(A.10)

where all possible paths from \( r_{n+1} \) to \( r_k \) are selected. By differentiating Eqs. (19), (20), and (21), every symbolic derivative can be replaced by its explicit value

$$ \Big\{{\displaystyle \begin{array}{c}\frac{\partial {r}_{k+1}}{\partial {r}_k}=I,\frac{\partial {r}_{k+1}}{\partial {p}_k}=-{\alpha}_kA\\ {}\frac{\partial {r}_{k+1}}{\partial {\alpha}_k}=-A{p}_k,\frac{\partial {s}_{k-1}}{\partial {r}_k}=A\\ {}\frac{\partial {p}_{k+1}}{\partial {p}_k}={\lambda}_k,\frac{\partial {p}_{k+1}}{\partial {s}_k}=I\\ {}\frac{\partial {p}_{k+1}}{\partial {\lambda}_k}=\left(\begin{array}{ccc}{p}_k^{(1)}& \dots & {p}_k^{(2M)}\\ {}\vdots & \vdots & \vdots \\ {}{p}_k^{(1)}& \cdots & {p}_k^{(2M)}\end{array}\right)\in {\mathbb{R}}^{2M\times 2M}\end{array}} $$
(A.11)

where \( I \) is the \( 2M \times 2M \) identity matrix and \( \frac{\partial {p}_{k+1}}{\partial {\lambda}_k} \) is a matrix whose ith column has a constant value corresponding to the ith element of \( p_k \).
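Taken together, the partial derivatives in Eq. (A.11) pin down the layer recursion up to constants: \( r_{k+1} = r_k - \alpha_k A p_k \), \( s_k = A r_{k+1} \), and \( p_{k+1} = \lambda_k p_k + s_k \). The NumPy sketch below implements a forward pass and the residual loss under that reconstruction; since Eqs. (19), (20), and (21) are not reproduced in this appendix, treat it as an assumption-laden illustration rather than the authors' exact implementation.

```python
import numpy as np

def ucg_forward(A, b, alphas, lambdas):
    """Forward pass through n unfolded CG layers, returning the final
    residual r_{n+1} and the loss e = ||r_{n+1}||^2.

    Update rules reconstructed from Eq. (A.11) (an assumption):
        r_{k+1} = r_k - alpha_k * A p_k     # since dr_{k+1}/dp_k = -alpha_k A
        s_k     = A r_{k+1}                 # since ds_k/dr_{k+1} = A
        p_{k+1} = lambda_k p_k + s_k        # since dp_{k+1}/dp_k = lambda_k
    """
    n = len(alphas)              # n scalar steps; len(lambdas) == n - 1
    r = b.copy()                 # initial residual, assuming x_0 = 0
    p = r.copy()                 # initial search direction
    for k in range(n):
        r = r - alphas[k] * (A @ p)
        if k < n - 1:            # lambda_k is defined for k in [1 : n-1]
            p = lambdas[k] @ p + A @ r
    return r, float(r @ r)
```

Note that the loss depends only on \( A \), \( b \), and the parameters, which is why no labeled training data is needed: backpropagating \( e \) through this forward pass yields exactly the totals assembled in Eqs. (A.1)-(A.10).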

To obtain better and faster results during the backpropagation process, the adaptation steps for \( \alpha_k \) and \( \lambda_k \) can be set to different values, and a momentum technique is applied to \( \lambda_k \):

$$ {\lambda}_k\left(t+1\right)={\lambda}_k(t)-{\mu}_1\overline{\frac{\partial e}{\partial {\lambda}_k(t)}}+\beta \left({\lambda}_k(t)-{\lambda}_k\left(t-1\right)\right) $$
(A.12)
$$ {\alpha}_k\left(t+1\right)={\alpha}_k(t)-{\mu}_2\overline{\frac{\partial e}{\partial {\alpha}_k(t)}} $$
(A.13)

where \( \mu_1 \) and \( \mu_2 \) are the fixed adaptation steps, \( \beta \in [0,1] \) is the momentum coefficient, and \( t \) denotes the current iteration (epoch) of the backpropagation process.
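As a concrete illustration of Eqs. (A.12) and (A.13), a minimal update routine could look as follows. The function and argument names are hypothetical, and values for \( \mu_1 \), \( \mu_2 \), and \( \beta \) must be chosen by the user; the paper does not prescribe them here.

```python
def update_parameters(alphas, lambdas, grad_alphas, grad_lambdas,
                      lambdas_prev, mu1, mu2, beta):
    """One backpropagation step: a momentum update for each lambda_k
    (Eq. (A.12)) and a plain gradient step for each alpha_k (Eq. (A.13))."""
    new_lambdas = [lam - mu1 * g + beta * (lam - prev)               # Eq. (A.12)
                   for lam, g, prev in zip(lambdas, grad_lambdas, lambdas_prev)]
    new_alphas = [a - mu2 * g for a, g in zip(alphas, grad_alphas)]  # Eq. (A.13)
    return new_alphas, new_lambdas
```

Keeping the previous iterate `lambdas_prev` is what realizes the momentum term \( \beta(\lambda_k(t) - \lambda_k(t-1)) \) of Eq. (A.12).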


Cite this article

Sirois, S., Ahmed Ouameur, M. & Massicotte, D. Deep Unfolded Extended Conjugate Gradient Method for Massive MIMO Processing with Application to Reciprocity Calibration. J Sign Process Syst 93, 965–975 (2021). https://doi.org/10.1007/s11265-020-01631-1
