Elsevier

Signal Processing

Volume 89, Issue 1, January 2009, Pages 1-13
Estimating MIMO channel covariances from training data under the Kronecker model

https://doi.org/10.1016/j.sigpro.2008.06.014

Abstract

Many algorithms for transmission in multiple input multiple output (MIMO) communication systems rely on second order statistics of the channel realizations. The problem of estimating such second order statistics of MIMO channels, based on limited amounts of training data, is treated in this article. It is assumed that the Kronecker model holds. This implies that the channel covariance is the Kronecker product of one covariance matrix that is associated with the array and the scattering at the transmitter and one that is associated with the receive array and the scattering at the receiver. The proposed estimator uses training data from a number of signal blocks (received during independent fades of the MIMO channel) to compute the estimate. This is in contrast to methods that assume that the channel realizations are directly available, or possible to estimate almost without error. It is also demonstrated how methods that make use of the training data indirectly via channel estimates can be biased.

An estimator is derived that can, in an asymptotically optimal way, exploit not only the structure implied by the Kronecker assumption but also linear structure in the transmit and receive covariance matrices. The performance of the proposed estimator is analyzed, and numerical simulations illustrate the results and provide insight into the small sample behaviour of the proposed method.

Introduction

Employing multiple antennas at the receiver and at the transmitter in a wireless communications system allows for higher reliable throughput without increasing total transmitted power or the spectral usage [1], [2], [3]. An overview of research on such multiple input multiple output (MIMO) communications systems is given in [4].

Obviously, the characteristics of MIMO radio channels are critical to the design and analysis of algorithms for transmission. MIMO channel modelling is treated in [4], [5]. Typically, the (baseband) narrowband MIMO channel is modelled as $y(i) = H_t x(i) + n(i)$, where $H_t$ denotes the $m \times n$ channel matrix, $x(i)$ is the vector of transmitted symbols at time $i$ and $y(i)$ is the received vector at time $i$. A common simplifying modelling assumption is that $H_t$ is constant during the transmission of a block of symbols and that the elements of $H_t$ are stochastic, complex Gaussian, and independent between the signal blocks. This model leaves important aspects of the MIMO channel unmodelled, e.g., shadow fading, path loss, and slowly time varying properties, see, e.g., [6]. This article will, however, focus on the properties of $H_t$.

Typically, it is assumed that the receiver can estimate the (downlink) MIMO channel in each signal block. This is made possible by inserting known training data symbols in the symbol stream. In a fast fading scenario, the signal blocks where the channel can be assumed constant are relatively short; training data symbols must then be inserted in the data stream frequently in order to enable accurate channel knowledge at the receiver.

Achieving channel knowledge at the transmitter is more problematic. In a time division duplex (TDD) system, channel reciprocity may be used to compute the downlink channel from the uplink channel. Typically, however, channel knowledge at the transmitter requires feedback from the receiver; in a fast fading scenario, feeding the entire channel matrix back would require a prohibitively large bandwidth. In such a scenario, the transmitter might rely on channel statistics instead of on the actual channel realizations. The receiver then estimates the statistics and feeds them back to the transmitter, or, in a TDD setting, the estimation may be performed at the transmitter based on uplink data. The critical assumption made is that the channel statistics contain useful spatial information on the channel quality but change at a much slower rate than the actual channel.

The Kronecker model assumes that the channel covariance is structured according to $\operatorname{Cov}(\operatorname{vec}\{H_t\}) = A \otimes B$, where $A$ is an $n \times n$ transmit covariance matrix, and $B$ is an $m \times m$ receive covariance matrix [7], [8], [9]. Furthermore, $\otimes$ denotes the Kronecker matrix product and $\operatorname{vec}\{\cdot\}$ denotes the vectorization operator (see, e.g., [10]). Estimating such covariance matrices is useful in the design and analysis of algorithms for MIMO communications.
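As an illustration of what the Kronecker structure means in practice, the following sketch draws block-fading channel matrices whose vectorized covariance is $A \otimes B$ and checks this empirically. The matrices `A` and `B`, the sizes, and the helper names are hypothetical choices made for this example only; the construction $H_t = B^{1/2} G_t (A^{1/2})^T$ with i.i.d. Gaussian $G_t$ is a standard way of generating such channels and is not taken from the article.

```python
import numpy as np

def hermitian_sqrt(M):
    """Hermitian square root of a positive definite matrix via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(w)) @ V.conj().T

def draw_kronecker_channels(A, B, N, rng):
    """Draw N channel matrices H_t (m x n) with Cov(vec{H_t}) = A kron B."""
    n, m = A.shape[0], B.shape[0]
    A_half, B_half = hermitian_sqrt(A), hermitian_sqrt(B)
    # G_t has i.i.d. unit-variance circular complex Gaussian entries.
    G = (rng.standard_normal((N, m, n)) + 1j * rng.standard_normal((N, m, n))) / np.sqrt(2)
    # H_t = B^{1/2} G_t (A^{1/2})^T  implies  vec{H_t} = (A^{1/2} kron B^{1/2}) vec{G_t}.
    return B_half @ G @ A_half.T

rng = np.random.default_rng(0)
A = np.array([[1.0, 0.5], [0.5, 1.0]])        # hypothetical transmit covariance (n = 2)
B = np.array([[1.0, 0.3j], [-0.3j, 1.0]])     # hypothetical receive covariance (m = 2)
H = draw_kronecker_channels(A, B, 20000, rng)

# Column-major vec of every H_t, followed by the sample covariance of vec{H_t}.
h = H.transpose(0, 2, 1).reshape(len(H), -1)
R_hat = h.T @ h.conj() / len(H)
print("max deviation from A kron B:", np.max(np.abs(R_hat - np.kron(A, B))))
```

The deviation shrinks as the number of blocks grows, which is the idealized situation (channel realizations known exactly) that the article contrasts with estimation from noisy training data.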

Several signal processing results assume that channel statistics are known at the transmitter; the Kronecker model is then typically assumed [11], [12], [13]. Imposing the structure implied by the Kronecker assumption not only allows for a reduced complexity in the transmit and receive algorithms, but also reduces the number of parameters needed when feeding back channel statistics. Furthermore, imposing the structure leads to more accurate estimates of the channel covariance.

As mentioned above, in a typical communication system, the receiver estimates the channel statistics based on a number of channel realizations. The channel realizations are in turn estimated from training data. If the amount of training data available in each block is very large, or the signal-to-noise ratio (SNR) is high, then the channel estimates can be assumed to be identical to the true underlying channel and methods such as those of [14] or [15] can be used to calculate the covariance matrix estimate. In contrast, here we instead consider the case where the amount of training data is limited so that channel realizations cannot be assumed perfectly known.

It is important to transmit as little training data as possible in each signal block in order to preserve spectral efficiency. This means that the assumption of perfect channel knowledge at the receiver may be violated. In such a scenario, where the channel estimates cannot be assumed perfect, this should be taken into account in the design of the estimator of the channel statistics. The input to the estimator should be the training data, not the channel estimates. Here we present a new method for the estimation problem based on a covariance matching criterion. The method is non-iterative and thus has a fixed computational complexity. It is also asymptotically statistically efficient. Furthermore, it can in an efficient way take advantage of certain structure of the A and B matrices. If, for example, the antenna array at the receiver or transmitter is a uniform linear array (ULA), then the corresponding covariance matrices sometimes are assumed to have a Toeplitz structure.

The article is organized as follows: The data model assumed for the considered signal processing problem is given a detailed description in Section 2. Some initial approaches to the estimation problem are introduced for the purpose of illustration in Section 3. The Cramér–Rao bound for this data model is derived in Section 4. The covariance matching algorithm that is the main topic of this work is presented in Section 5.1. It is shown how the criterion leads to a weighted low rank approximation (WLRA) problem. This WLRA problem is the topic of Section 5.2. A practical approximative solution is given; the resulting estimate retains the asymptotic properties of the optimal solution. An expression for the asymptotic covariance of the covariance matching estimate is derived in Section 5.3. Practical issues related to implementation, including the computational complexity, are treated in Section 6. The problem of testing whether a given set of data is consistent with the proposed data model is treated in Section 7. The article is concluded with numerical experiments in Section 8 and a summary is given in Section 9.

Throughout the article, the following notation is used: $\Upsilon^{\dagger}$ denotes the Moore–Penrose pseudoinverse of the matrix $\Upsilon$. Also define the orthogonal projection matrix onto the (column) null-space of $X$ as $\Pi_X^{\perp} = I - XX^{\dagger}$. The $(i,j)$th element of the matrix $\Upsilon$ is denoted $[\Upsilon]_{ij}$. The superscript $*$ denotes conjugate transpose and $T$ denotes transpose. Also we denote the conjugate of a matrix by $\operatorname{conj}\{\Upsilon\} = \Upsilon^{T*}$. The notations $X^{-2} = X^{-1}X^{-1}$ and $X^{-T} = (X^{-1})^{T}$ are extensively used. The Hermitian square root of a positive definite (p.d.) matrix is denoted $X^{1/2}$. The notation $\dot{\Upsilon}_j$ denotes the element-wise derivative of the matrix $\Upsilon$ w.r.t. the parameter at the $j$th position in the parameter vector in question. The weighted quadratic matrix norm $\|\Upsilon\|_Q^2 = \operatorname{vec}^*\{\Upsilon\}\, Q \operatorname{vec}\{\Upsilon\}$ will be used frequently. Finally, $x_N = o_p(a_N)$ means that $\lim_{N \to \infty} x_N / a_N = 0$ in probability while $x_N = O_p(a_N)$ means that $x_N / a_N$ is bounded in probability. In this work the asymptotic results hold when the number of signal blocks, $N$, tends to infinity.

Section snippets

Data model

Denote the MIMO channel during signal block $t$ by $H_t$. Assume that the same sequence of $p$ training symbols $[x(i)]_{i=0}^{p-1}$ is sent once as part of each signal block. The corresponding received data in signal block $t$ can then be modelled as $y_t(i) = H_t x(i) + e_t(i)$, $i = 0, \ldots, p-1$. It is assumed that the number of training symbols is larger than the number of transmit antennas, $p > n$. This is necessary to ensure identifiability. The noise $e_t(i)$ is assumed to be zero mean complex Gaussian with covariance matrix Ees(k)e...

Direct approaches

The channel is typically estimated by least squares from the training data, similar to $\hat{H}_t = Y_t X^{\dagger}$, $t = 0, 1, \ldots, N-1$. When channel statistics are available, a Wiener estimate can also be used. In a scenario with high SNR or when the amount of training data is large, the estimates $\hat{H}_t$ can be assumed to be identical to $H_t$ and the methods in [14], [15] can be applied to estimate $R_H$. If the SNR is low or the amount of training data is limited, then some care must be taken. It is, for example, easy to see that
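To make the caveat about low SNR concrete, the sketch below forms least squares channel estimates from simulated training data and compares the sample covariance of $\operatorname{vec}\{\hat{H}_t\}$ with the true $R_H$. Several things here are assumptions of the example and not statements from the article: the received block is arranged as $Y_t = H_t X + E_t$ with an $n \times p$ training matrix $X$, the noise is spatially and temporally white with variance `sigma2`, and all sizes are hypothetical. Under those assumptions the estimation error $E_t X^{\dagger}$ adds the term $\sigma^2\,((XX^*)^{-T} \otimes I_m)$ to the covariance of the estimates.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, N, sigma2 = 2, 2, 50000, 0.5      # hypothetical sizes and noise variance

def cgauss(shape):
    """i.i.d. unit-variance circular complex Gaussian samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

X = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, -1.0]])        # hypothetical n x p training matrix, p = 3 > n
p = X.shape[1]
H = cgauss((N, m, n))                   # uncorrelated channels, so R_H = I in this toy case
Y = H @ X + np.sqrt(sigma2) * cgauss((N, m, p))   # assumed arrangement Y_t = H_t X + E_t

H_ls = Y @ np.linalg.pinv(X)            # least squares estimates Y_t X^+ for every block

# Sample covariance of the column-major vectorized channel estimates.
h = H_ls.transpose(0, 2, 1).reshape(N, -1)
R_naive = h.T @ h.conj() / N

R_H = np.eye(m * n)
bias_pred = sigma2 * np.kron(np.linalg.inv(X @ X.conj().T).T, np.eye(m))
print("deviation from R_H:            ", np.max(np.abs(R_naive - R_H)))
print("deviation after removing bias: ", np.max(np.abs(R_naive - R_H - bias_pred)))
```

The first deviation does not vanish as $N$ grows, while the second one does, which is one way to see why treating the channel estimates as exact can give a biased covariance estimate.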

Cramér–Rao lower bound

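In order to state the result of this section, consider the rearrangement function [20] $\operatorname{Rearr}(R_H) = [\operatorname{vec}\{R_H^{11}\} \;\cdots\; \operatorname{vec}\{R_H^{n1}\} \;\; \operatorname{vec}\{R_H^{12}\} \;\cdots\; \operatorname{vec}\{R_H^{n2}\} \;\cdots\; \operatorname{vec}\{R_H^{nn}\}]^{T}$, where $R_H^{kl}$ is the $(k,l)$th $m \times m$ block of $R_H$. This rearrangement function has the property $\operatorname{Rearr}(A \otimes B) = \operatorname{vec}\{A\} \operatorname{vec}^{T}\{B\}$. It is easy to see that a permutation matrix, $P_R$, can be defined such that $\operatorname{vec}\{R_H\} = P_R \operatorname{vec}\{\operatorname{Rearr}(R_H)\}$ for any matrix $R_H$ of compatible dimensions.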
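The rearrangement function and its permutation matrix can be sketched in a few lines of NumPy. The function names `rearr` and `rearr_permutation` are hypothetical, and the construction of $P_R$ by tracking index positions is just one convenient way to build it.

```python
import numpy as np

def vec(M):
    """Column-major vectorization."""
    return M.reshape(-1, order="F")

def rearr(R, n, m):
    """Rows are vec^T of the m x m blocks R^{kl}, ordered (1,1),...,(n,1),(1,2),...,(n,n)."""
    rows = [vec(R[k*m:(k+1)*m, l*m:(l+1)*m]) for l in range(n) for k in range(n)]
    return np.array(rows)

def rearr_permutation(n, m):
    """Permutation matrix P_R such that vec{R} = P_R vec{Rearr(R)}."""
    size = (n * m) ** 2
    # idx[i, j] is the position of R[i, j] inside vec{R}.
    idx = np.arange(size).reshape(n * m, n * m, order="F")
    # src[q] is the position in vec{R} of the q-th entry of vec{Rearr(R)}.
    src = vec(rearr(idx, n, m))
    P = np.zeros((size, size))
    P[src, np.arange(size)] = 1.0
    return P

rng = np.random.default_rng(2)
n, m = 3, 2                                    # hypothetical dimensions
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
R = np.kron(A, B)

print(np.allclose(rearr(R, n, m), np.outer(vec(A), vec(B))))   # Rearr(A kron B) = vec{A} vec^T{B}
P_R = rearr_permutation(n, m)
print(np.allclose(vec(R), P_R @ vec(rearr(R, n, m))))          # vec{R} = P_R vec{Rearr(R)}
```

The first check shows why the Kronecker structure turns into a rank one structure after rearrangement, which is what makes low rank approximation techniques applicable.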

Theorem 1

The covariance matrix of any unbiased estimator $\hat{R}_H$ of $A_0 \otimes B_0$ in the data model described in Section

Derivation of the estimator

As will be shown further on (Theorem 3) in the article, the estimates obtained by minimizing the covariance matching criterion $\bar{V}_C(\theta, \sigma^2) = \| \hat{R} - \sigma^2 I_{mp} - \Psi R_H(\theta) \Psi^* \|_{\hat{W}}^2$ w.r.t. $\theta_A$, $\theta_B$ and $\sigma^2$ are asymptotically efficient if $\hat{W} = W + o_p(1)$, $W = \frac{1}{N} (\operatorname{Cov} \hat{R})^{-1} = (R_0^{-T} \otimes R_0^{-1})$, i.e. $\hat{W}$ is a consistent estimate of $W$. This result is not surprising, especially in the light of the extended invariance principle [21], [22]. Covariance matching estimators have been proposed previously for different data models in signal processing,
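A weighting of the form $R_0^{-T} \otimes R_0^{-1}$ does not have to be built as a large Kronecker product in order to evaluate the weighted norm. Using $\operatorname{vec}\{XYZ\} = (Z^{T} \otimes X)\operatorname{vec}\{Y\}$, one gets $\|\Upsilon\|_W^2 = \operatorname{tr}(R_0^{-1} \Upsilon^* R_0^{-1} \Upsilon)$. The sketch below checks this identity numerically; the function names and the random test matrices are hypothetical, and the matrix `R0` is simply some Hermitian positive definite matrix standing in for the true data covariance.

```python
import numpy as np

def vec(M):
    """Column-major vectorization."""
    return M.reshape(-1, order="F")

def weighted_norm_sq(Y, Q):
    """||Y||_Q^2 = vec^*{Y} Q vec{Y}, evaluated literally."""
    y = vec(Y)
    return np.real(y.conj() @ Q @ y)

def weighted_norm_sq_fast(Y, R0):
    """Same value for Q = R0^{-T} kron R0^{-1}, without forming the Kronecker product."""
    Z1 = np.linalg.solve(R0, Y)             # R0^{-1} Y
    Z2 = np.linalg.solve(R0, Y.conj().T)    # R0^{-1} Y^*
    return np.real(np.trace(Z2 @ Z1))       # tr(R0^{-1} Y^* R0^{-1} Y)

rng = np.random.default_rng(3)
d = 4                                        # hypothetical dimension (plays the role of mp)
G = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
R0 = G @ G.conj().T + d * np.eye(d)          # some Hermitian positive definite matrix
Y = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))

R0_inv = np.linalg.inv(R0)
Q = np.kron(R0_inv.T, R0_inv)
print(weighted_norm_sq(Y, Q), weighted_norm_sq_fast(Y, R0))
```

The trace form costs on the order of $d^3$ operations, whereas forming the $d^2 \times d^2$ weighting explicitly is considerably more expensive in both time and memory.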

Implementing the covariance matching estimator

The approximate solution to the WLRA problem (38) that we suggested in Theorem 2 uses an initial value, $\bar{\theta}$, of the parameter vector. Several strategies are possible for choosing $\bar{\theta}$. One possibility is to use an optimal unweighted (setting $\Omega = I_{n_A + n_B}$ in (38)) rank one approximation of $\hat{\Phi}$ to get the initial values. In this initial step, $\hat{Q}$ could be set to identity. Then the expression for $\hat{\Phi}$ simplifies significantly as will be shown. Another possibility is of course to use the estimates obtained
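An unweighted rank one approximation can be computed with a truncated singular value decomposition. The sketch below applies this idea to the rearrangement of a generic covariance estimate, which gives the nearest Kronecker product in the Frobenius norm. This is only an illustration of the rank one step: the matrix $\hat{\Phi}$ of the article is defined in a part of the text not reproduced here, so the input used below, the function names, and the dimensions are all assumptions.

```python
import numpy as np

def vec(M):
    """Column-major vectorization."""
    return M.reshape(-1, order="F")

def rearr(R, n, m):
    """Rows are vec^T of the m x m blocks of R, in the order of vec{A}."""
    return np.array([vec(R[k*m:(k+1)*m, l*m:(l+1)*m])
                     for l in range(n) for k in range(n)])

def nearest_kronecker(R, n, m):
    """Unweighted rank one approximation of Rearr(R), mapped back to factors (A, B)."""
    U, s, Vh = np.linalg.svd(rearr(R, n, m))
    a = np.sqrt(s[0]) * U[:, 0]        # proportional to vec{A}
    b = np.sqrt(s[0]) * Vh[0, :]       # proportional to vec{B}
    # The factors are only determined up to a scalar c (A*c and B/c give the same
    # product), so they need not be Hermitian individually; their product is unique.
    return a.reshape(n, n, order="F"), b.reshape(m, m, order="F")

rng = np.random.default_rng(4)
n, m = 3, 2                                             # hypothetical dimensions
GA = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
GB = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
A0, B0 = GA @ GA.conj().T, GB @ GB.conj().T             # hypothetical true covariances
R_exact = np.kron(A0, B0)

A_init, B_init = nearest_kronecker(R_exact, n, m)
print(np.allclose(np.kron(A_init, B_init), R_exact))    # exact recovery without noise
```

With a noisy input the same call returns the closest Kronecker product in the unweighted sense, which is the kind of quantity that can serve as a starting point before a weighted refinement.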

Model validation

It is possible to use the minimum value of the criterion function from Section 5.1 in an algorithm for testing if a given set of data is consistent with the Kronecker product model (with a given structure on the transmit and receive covariances). An important question is then which statistical distribution this test statistic has. The answer is given in the following theorem:

Theorem 4

Let $\hat{\theta}$ be the minimizer of the criterion function $V_C(\theta) = V_C(\Phi)$ in (38). Then, under the data model described in Section 2

Simulation setup

Monte Carlo simulations are used to evaluate the small sample performance of the proposed estimator. Two matrices $A_0$ and $B_0$ are generated according to an exponential model $[A]_{kl} = 0.9^{|k-l|} \exp\{j(k-l)\pi/4\}$, $[B]_{kl} = 0.7^{|k-l|} \exp\{j(k-l)\pi/3\}$ and the corresponding $R_H(\theta_0)$ is calculated. This structure is Hermitian and Toeplitz, but the Toeplitz structure is not imposed in all experiments. It was assumed that training symbols were transmitted at one antenna at a time, while the other antennas were quiet. The
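The exponential model above is easy to reproduce. The sketch below builds the two matrices and confirms the Hermitian Toeplitz structure; the array sizes `n` and `m` are hypothetical, since the snippet does not state the dimensions used in the experiments, and the function name is made up for this example.

```python
import numpy as np

def exponential_covariance(size, rho, phase):
    """[M]_kl = rho^{|k-l|} exp{j (k-l) phase} for k, l = 0, ..., size-1."""
    k = np.arange(size)
    d = k[:, None] - k[None, :]              # matrix of index differences k - l
    return rho ** np.abs(d) * np.exp(1j * d * phase)

n, m = 4, 4                                   # hypothetical numbers of transmit/receive antennas
A0 = exponential_covariance(n, 0.9, np.pi / 4)
B0 = exponential_covariance(m, 0.7, np.pi / 3)
R_H0 = np.kron(A0, B0)                        # channel covariance under the Kronecker model

print("A0 Hermitian:", np.allclose(A0, A0.conj().T))
print("A0 Toeplitz: ", np.allclose(A0[1:, 1:], A0[:-1, :-1]))
print("smallest eigenvalue of R_H0:", np.linalg.eigvalsh(R_H0).min())
```

Because each entry depends only on the difference $k - l$, a Toeplitz parameterization of $A$ and $B$ needs far fewer real parameters than an unstructured Hermitian one, which is the kind of linear structure the estimator is designed to exploit.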

Conclusion

This article considers the problem of estimating second order channel statistics for MIMO channels. It is assumed that training data from a number of signal blocks is available as opposed to perfect observations of the channel realizations. The Cramér–Rao bound (CRB) associated with the data model is derived. The CRB is reached asymptotically by an estimator that employs a statistically optimal weighting in a covariance matching criterion where the structure of the problem is imposed. Computing the

References (33)

  • P. Stoica et al., On reparameterization of loss functions used in estimation and the invariance principle, Signal Processing (1989)
  • B. Ottersten et al., Covariance matching estimation techniques for array signal processing applications, Digital Signal Processing (1998)
  • I.E. Telatar, Capacity of multi-antenna Gaussian channels, Eur. Trans. Telecomm. (1999)
  • G. Foschini et al., On limits of wireless communications in a fading environment when using multiple antennas, Wireless Personal Comm. (1998)
  • L. Zheng et al., Diversity and multiplexing: a fundamental tradeoff in multiple-antenna channels, IEEE Trans. Inform. Theory (2003)
  • D. Gesbert et al., From theory to practice: an overview of MIMO space-time coded wireless systems, IEEE J. Selected Areas Comm. (2003)
  • K. Yu et al., Models for MIMO propagation channels: a review, Wireless Comm. Mobile Comput. (2002)
  • A. Molisch, A generic model for MIMO wireless propagation channels in macro- and microcells, IEEE Trans. Signal Process. (2004)
  • J. Kermoal et al., A stochastic MIMO radio channel model with experimental validation, IEEE J. Selected Areas Comm. (2002)
  • K. Yu et al., Modeling of wide-band MIMO radio channels based on NLoS indoor measurements, IEEE Trans. Vehicular Technol. (2004)
  • M. Bengtsson, P. Zetterberg, Some notes on the Kronecker model, EURASIP J. Wireless Comm. Network., submitted for...
  • R.A. Horn et al., Topics in Matrix Analysis (1991)
  • S. Da-Shan et al., Fading correlation and its effect on the capacity of multielement antenna systems, IEEE Trans. Signal Process. (2000)
  • C. Martin et al., Asymptotic eigenvalue distributions and capacity for MIMO channels under correlated fading, IEEE Trans. Signal Process. (2004)
  • E. Jorswieck et al., Channel capacity and capacity-range of beamforming in MIMO wireless systems under correlated fading with covariance feedback, IEEE Trans. Wireless Comm. (2004)
  • K. Werner, M. Jansson, Weighted low rank approximation and reduced rank linear regression, in: IEEE International...

Karl Werner is with Ericsson Research, Ericsson AB.