1 Introduction

The task of efficient MIMO detection for a large number of antennas arose already in LTE networks. However, this problem is even more crucial for the emerging 5G networks, where MIMO channels with 100 or even more transmitting and receiving antennas can be used [1, 2].

The main role of a MIMO detector (or MIMO equalizer) is to produce estimates of the probabilities of transmitted symbols and corresponding bits. It is an important part of a MIMO receiver. Since the introduction of V-BLAST [3, 4], dozens of algorithms have been proposed for MIMO detection. The algorithms differ in performance and complexity. Surveying this large set of detection approaches, one can observe that the Maximum Likelihood (ML) or Maximum a Posteriori (MAP) MIMO detector demonstrates the best (optimal) performance, but at the cost of very high complexity: the number of operations grows exponentially with the number of transmitting antennas. Such complexity makes ML hardly implementable, especially for a large number of transmitting and receiving antennas. However, ML detection represents only one class of algorithms. On the other hand, the complexity of Zero Forcing (ZF) and MMSE detectors is characterized by \(O(N^3)\), where N is the rank of the channel matrix. Of course, the complexity reduction is accompanied by performance degradation, which can reach 5 dB or more in the case of large-size MIMO systems. Many sub-optimal algorithms have been proposed in the literature, such as the Sphere Decoder (SD) [5] and the M-algorithm combined with QR Decomposition (QRD-M) [6]. Their performance is comparable with ML, but the complexity is significantly reduced. One can also find modifications of the MMSE algorithm, such as MMSE with Sorted QR Decomposition (MMSE SQRD) [7] and ordered layer cancellation algorithms [8,9,10,11]. Their performance is improved compared with MMSE at the expense of a relatively small complexity growth. Still, there is a performance gap (a few dB) between MMSE-based algorithms and ML or sub-ML. In [12] it was proposed to combine a ZF Ordered Successive Interference Cancellation (ZF OSIC) scheme with ML: ZF OSIC is applied to strong layers, while ML is used for the remaining weak layers. Such an approach is intended to provide a proper performance/complexity trade-off by combining schemes with different characteristics. Another efficient approach based on Parallel Interference Cancellation (PIC) together with MMSE detection is proposed in [13].

To avoid misunderstanding and possible criticism about the way we have classified the algorithms by their complexity and performance, we note that in some conditions, such as high signal-to-noise ratio (SNR), the complexity of sub-ML algorithms (e.g. SD or QRD-M) appears comparable with MMSE. However, if all possible SNRs are considered, one can observe that as the SNR becomes smaller, the sphere radius (or the parameter M in QRD-M) must be increased in order to keep the performance close to ML, and thus the complexity increases. On the other hand, when M is kept the same as for high SNR, the performance of sub-ML algorithms degrades seriously and can even fall below that of MMSE.

In this paper, we present a new detection scheme based on the MMSE solution. We call the scheme Turbo-MMSE because it utilizes the extraction of extrinsic information, as is done in turbo decoding. Our study shows that the complexity of Turbo-MMSE is comparable with MMSE, while the performance can be much better than MMSE and its improved modifications [9,10,11]. Moreover, the new scheme assumes parallel layer processing, resulting in reduced calculation time.

The remaining parts of this paper are organized as follows. In Sect. 2, we briefly present the classical model of the MIMO-OFDM system. In Sect. 3, we review the most efficient MMSE-based schemes proposed previously. In Sect. 4, we introduce the new Turbo-MMSE scheme and describe its differences from other MMSE-based schemes. In Sect. 5, we present simulation results and compare the performance of different MMSE-based schemes. Finally, we conclude the paper in Sect. 6.

2 System Description

A typical multi-carrier MIMO system is described by the following matrix equation in the frequency domain:

$$\begin{aligned} \mathbf {y} = \mathbf {H}\mathbf {x} + \varvec{\eta }, \end{aligned}$$
(1)

where \(\mathbf {x}={{({{x}_{1}},\ldots ,{{x}_{M}})}^{T}}\) is the transmitted signal vector, \(\mathbf {y}={{({{y}_{1}},\ldots ,{{y}_{N}})}^{T}}\) is the received signal vector, \(\varvec{\eta }={{({{\eta }_{1}},\ldots ,{{\eta }_{N}})}^{T}}\) is the additive noise vector, \(\mathbf {H}\) is the channel matrix of size \(N\times M\), and all variables are complex. Equation (1) describes signal transmission through a flat MIMO channel, which is the typical case for the MIMO-OFDM systems utilized in most broadband wireless standards. In the simulations we use the MIMO transmission schemes defined for LTE-A (see [14,15,16]) with the channel models defined in [17].
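
The transmission model (1) can be reproduced per sub-carrier by the following minimal Python/NumPy sketch; the i.i.d. Rayleigh channel, the QPSK alphabet and the simple SNR-to-noise-variance convention are illustrative assumptions, not the LTE-A configuration used in Sect. 5.

```python
import numpy as np

def simulate_flat_mimo(M=8, N=8, snr_db=20.0, seed=0):
    """One realization of y = H x + eta (Eq. 1) for a flat MIMO channel."""
    rng = np.random.default_rng(seed)
    # Illustrative i.i.d. Rayleigh channel of size N x M with unit average entry power.
    H = (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)
    # Illustrative QPSK symbols with unit average power (sigma_s^2 = 1).
    x = (rng.choice([-1.0, 1.0], M) + 1j * rng.choice([-1.0, 1.0], M)) / np.sqrt(2)
    # Complex AWGN; sigma_eta^2 is set from the target SNR (simplified convention).
    sigma_eta2 = 10.0 ** (-snr_db / 10.0)
    eta = np.sqrt(sigma_eta2 / 2.0) * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
    return H @ x + eta, H, x, sigma_eta2
```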

3 MMSE-Based Layer Cancellation MIMO Detectors

The MMSE-based Ordered Successive Interference Cancellation detector (MMSE OSIC) was introduced in [8]. Later, an improved version with soft output was defined in [9, 11]. MMSE OSIC performs interference cancellation layer by layer in the order defined by a layer ordering procedure. Layer ordering is done according to the maximal post-detection SNR or the maximal signal power.

Layer cancellation means subtraction of the signal transmitted on a layer from the received signal \(\mathbf {y}\). After cancellation of k layers, the received signal contains the data transmitted on the unprocessed layers, the additive noise \(\varvec{\eta }\), and also extra noise caused by improperly detected symbols on the canceled layers. The received signal with subtracted layers is given by

$$\begin{aligned} \tilde{\mathbf {y}}={{\mathbf {h}}_{k+1}}{{x}_{k+1}}+\underbrace{\sum \limits _{i=k+2}^{M}{{\mathbf {h}_{i}}{{x}_{i}}}}_{{\mathbf {I}_{u}}}+\underbrace{\sum \limits _{i=1}^{k}{{\mathbf {h}_{i}}({{x}_{i}}-{{{\overset{\scriptscriptstyle \smile }{x}}}_{i}})}}_{{\mathbf {I}_{c}}}+\varvec{\eta }, \end{aligned}$$
(2)

where \({\mathbf {I}_{u}}\) denotes the interference from undetected layers, \({\mathbf {I}_{c}}\) denotes the interference from detected and canceled layers due to decision errors, \({{x}_{i}}\) is the true symbol transmitted on layer i, and \({{\overset{\scriptscriptstyle \smile }{x}}_{i}}\) is the estimate of \({{x}_{i}}\). For simplicity, we assume that Eq. (2) corresponds to already sorted layers, i.e. the matrix columns \({\mathbf {h}_{1}},\ldots ,{\mathbf {h}_{M}}\) represent the layers sorted from best to worst. The solution for \(x_{k+1}\) on the current layer \(k+1\) is defined by the MMSE filter

$$\begin{aligned} \mathbf {g} = \mathbf {h}_{k+1}^{H}{{\left[ {\mathbf {H}_{k+2:M}}\mathbf {H}_{k+2:M}^{H}+\frac{1}{\sigma _{s}^{2}}{\mathbf {R}_{{{I}_{D}}}}+\frac{\sigma _{\eta }^{2}}{\sigma _{s}^{2}} {\mathbf {I}_{N}} \right] }^{-1}}, \end{aligned}$$
(3)

where \({\mathbf {H}_{k+2:M}}\) is the sub-matrix consisting of the columns corresponding to undetected layers, \( (\cdot )^H\) denotes Hermitian conjugation, \(\mathbf {R}_{I_{D}}\) is the covariance matrix defined by the decision errors of previously processed layers, \({\mathbf {I}_{N}}\) is the identity matrix, and \(\sigma _{s}^{2}\) and \(\sigma _{\eta }^{2}\) are the variances characterizing the signal and noise powers, respectively. The covariance matrix \({\mathbf {R}_{{{I}_{D}}}}\) is defined by

$$\begin{aligned} {\mathbf {R}_{{{I}_{D}}}}={\mathbf {H}_{1:k}}{\mathbf {Q}_{{{D}_{e}}}}\mathbf {H}_{1:k}^{H}, \end{aligned}$$
(4)

where \({\mathbf {H}_{1:k}}\) is the sub-matrix consisting of the columns corresponding to already detected layers, and \({\mathbf {Q}_{{{D}_{e}}}}\) is the covariance matrix characterizing the detected symbol errors (i.e. the differences between the true symbols and the symbol estimates obtained during detection). The elements of \({\mathbf {Q}_{{{D}_{e}}}}\) are defined as follows:

$$\begin{aligned} {{q}_{u,v}}=\mathrm {E}\left[ ({{x}_{u}}-{{{\overset{\scriptscriptstyle \smile }{x}}}_{u}}){{({{x}_{v}}-{{{\overset{\scriptscriptstyle \smile }{x}}}_{v}})}^{*}} \right] . \end{aligned}$$
(5)

The estimate \({{\overset{\scriptscriptstyle \smile }{x}}_{i}}\) is defined from the detected symbol probabilities in accordance with the soft symbol detection procedure:

$$\begin{aligned} {{\overset{\scriptscriptstyle \smile }{x}}_{i}}=\sum \limits _{p=1}^{{{N}_{c}}}{{{S}_{p}}\mathrm {Pr}({{x}_{i}}=}{{S}_{p}} \Big | {{{\hat{x}}}_{i}},{{\mu }_{i}}), \end{aligned}$$
(6)

where \({{N}_{c}}\) is the number of points in the QAM constellation, and \(\mathrm {Pr}({{x}_{i}}={{S}_{p}} \Big | {{{\hat{x}}}_{i}},{{\mu }_{i}})\) is the probability that the symbol \({{x}_{i}}\) transmitted on layer i corresponds to the constellation point \({{S}_{p}}\). The probability is defined by the unbiased MMSE estimate characterized by an equivalent Gaussian channel approximation:

$$\begin{aligned} {{\hat{x}}_{i}} = \mathbf {g}\tilde{\mathbf {y}}={{\mu }_{i}}x_i+{{\eta }_{i}} \end{aligned}$$
(7)

where the weight coefficient \({{\mu }_{i}}=\mathbf {g}{\mathbf {h}_{i}}\), and \({{\eta }_{i}}\) has Gaussian distribution \(\mathbb {N}(0,\sigma _{{{\eta }_{i}}}^{2})\) with zero mean and variance \(\sigma _{{{\eta }_{i}}}^{2}=({{\mu }_{i}}-\mu _{i}^{2})\sigma _{s}^{2}\).

Note that instead of the soft symbol \({{\overset{\scriptscriptstyle \smile }{x}}_{i}}\) it is possible to use the most probable symbol of the constellation. The elements \({{q}_{u,v}}\) of the covariance matrix are also defined by the probability model (7).
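
The per-layer soft symbol computation (6)–(7) can be sketched as follows; the function and its interface are illustrative, and the returned decision-error variance is our reading of the diagonal element of \({\mathbf {Q}_{{{D}_{e}}}}\) in (5).

```python
import numpy as np

def soft_symbol(x_hat, mu, sigma_s2, constellation):
    """Soft symbol for one layer per Eqs. (6)-(7); a sketch, not a reference implementation.

    x_hat         : biased MMSE output g @ y_tilde for this layer (complex scalar)
    mu            : weight coefficient mu_i = g @ h_i (assumed real-valued)
    sigma_s2      : signal power sigma_s^2
    constellation : 1-D array of QAM points S_p
    """
    # Equivalent Gaussian channel (7): x_hat = mu * x + eta_i with variance (mu - mu^2) * sigma_s^2.
    sigma_eta2 = (mu - mu ** 2) * sigma_s2
    # Unnormalized likelihoods Pr(x_i = S_p | x_hat, mu); constants cancel after normalization.
    metric = np.exp(-np.abs(x_hat - mu * constellation) ** 2 / sigma_eta2)
    prob = metric / metric.sum()
    x_soft = np.sum(constellation * prob)                        # Eq. (6)
    # Assumed diagonal element of Q_De in (5): the decision-error variance of this layer.
    q_ii = np.sum(np.abs(constellation - x_soft) ** 2 * prob)
    return x_soft, q_ii, prob
```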

Matrix inversion in (3) is necessary to obtain the MMSE filter for each transmission layer. A simplified version of the matrix \({\mathbf {Q}_{{{D}_{e}}}}\) was proposed in [10]: all its elements are equal to zero except the main diagonal. As shown by simulations, such a simplification leads to practically negligible performance degradation compared with the full covariance matrix defined by (5).
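
A possible sketch of the per-layer filter (3) with the covariance (4) under the diagonal simplification of [10] is given below; the function name and the assumption that layers are already sorted and canceled are ours.

```python
import numpy as np

def osic_layer_filter(H, k, q_diag, sigma_s2, sigma_eta2):
    """MMSE OSIC filter g for layer k+1 (0-based k), Eqs. (3)-(4), diagonal Q_De as in [10].

    H        : N x M channel matrix, columns already sorted from best to worst layer
    k        : number of layers already detected and canceled
    q_diag   : length-k vector of decision-error variances (diagonal of Q_De)
    """
    N = H.shape[0]
    h_cur = H[:, k]                       # column of the current layer k+1
    H_und = H[:, k + 1:]                  # undetected layers
    H_det = H[:, :k]                      # already detected (canceled) layers
    # R_ID = H_det diag(q_diag) H_det^H   (Eq. 4 with the diagonal simplification)
    R_ID = (H_det * q_diag) @ H_det.conj().T
    A = H_und @ H_und.conj().T + R_ID / sigma_s2 + (sigma_eta2 / sigma_s2) * np.eye(N)
    # g = h^H A^{-1}; since A is Hermitian, this equals (A^{-1} h)^H.
    return np.linalg.solve(A, h_cur).conj()
```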

It should be noted that in [9, 10] the authors compare their low-complexity matrix inversion scheme with a high-quality but rather complex scheme based on Singular Value Decomposition (SVD), which leads to the conclusion that the complexity growth is very small relative to the total complexity of the MMSE algorithm. If, however, the proposed matrix inversion scheme is compared with more computationally efficient approaches, for example the inversion of partitioned matrices (see [18]), one finds that the complexity of simplified MMSE OSIC is about 2–3 times the complexity of MMSE. This is, of course, only a rough estimate, and one can make more accurate calculations taking into account the extra operations necessary for layer sorting, soft symbol detection and layer cancellation.

Finalizing the overview of MMSE OSIC, we can conclude that the described algorithm provides improved BLock Error Rate (BLER) performance compared with MMSE, while its complexity increases by no more than 2–3 times. However, the improved performance is still rather far from optimal ML. In the next section, we present a new iterative algorithm which demonstrates improved performance, while its complexity is only 2–2.5 times that of MMSE.

4 New Iterative MIMO Detector Scheme with MMSE Kernel

The advantage of the proposed algorithm lies in Parallel Interference Cancellation (PIC), which is performed at each iteration for all transmitted layers simultaneously. Before describing the algorithm itself, we introduce the procedure for extracting extrinsic information.

Suppose we estimate some parameter x which has a Gaussian distribution and for which we have a priori information about its mean value \({{\bar{x}}_{pr}}\) and variance \(\sigma _{pr}^{2}\), i.e. \(x \sim \mathbb {N}({{\bar{x}}_{pr}},\sigma _{pr}^{2})\). Suppose additionally that we obtain a new measurement and derive a new estimate \({{\bar{x}}_{ms}}\):

$$\begin{aligned} {{\bar{x}}_{ms}} = x + \eta , \end{aligned}$$
(8)

where \(\eta \sim \mathbb {N}(0,\sigma _{ms}^{2})\) is the estimation error. Based on these data we can obtain the posterior Probability Density Function (PDF) of x:

$$\begin{aligned} {\mathrm {Pr}_{pos}}(x)={{C}_{1}}\exp \left( -\frac{{{(x-{{{\bar{x}}}_{pr}})}^{2}}}{2\sigma _{pr}^{2}}-\frac{{{(x-{{{\bar{x}}}_{ms}})}^{2}}}{2\sigma _{ms}^{2}} \right) , \end{aligned}$$
(9)

where \({{C}_{1}}\) is a normalization constant. Transforming the expression inside the exponent, we get:

$$\begin{aligned} \begin{aligned} {\mathrm {Pr}_{pos}}(x) = {{C}_{1}}\exp \left( -\frac{{{x}^{2}}-2x\frac{{{{\bar{x}}}_{pr}}\sigma _{ms}^{2}+{{{\bar{x}}}_{ms}}\sigma _{pr}^{2}}{\sigma _{ms}^{2}+\sigma _{pr}^{2}}}{2\frac{\sigma _{pr}^{2}\sigma _{ms}^{2}}{\sigma _{ms}^{2}+\sigma _{pr}^{2}}} \right) \exp \left( -\frac{{{{\bar{x}}}_{pr}}^{2}\sigma _{ms}^{2}+\bar{x}_{ms}^{2}\sigma _{pr}^{2}}{2\sigma _{ms}^{2}\sigma _{pr}^{2}} \right) . \end{aligned} \end{aligned}$$
(10)

It is easy to see that (10) is still a Gaussian PDF, with variance and mean value given by (11):

$$\begin{aligned} \begin{aligned} \sigma _{pos}^{2}&= \frac{\sigma _{pr}^{2}\sigma _{ms}^{2}}{\sigma _{pr}^{2}+\sigma _{ms}^{2}}, \\ {{{\bar{x}}}_{pos}}&= {{{\bar{x}}}_{pr}}\frac{\sigma _{ms}^{2}}{\sigma _{pr}^{2}+\sigma _{ms}^{2}}+{{{\bar{x}}}_{ms}}\frac{\sigma _{pr}^{2}}{\sigma _{pr}^{2}+\sigma _{ms}^{2}}. \end{aligned} \end{aligned}$$
(11)

Therefore,

$$\begin{aligned} {\mathrm {Pr}_{pos}}(x) = {{C}_{1}}\exp \left( -\frac{{{(x-{{{\bar{x}}}_{pos}})}^{2}}}{2\sigma _{pos}^{2}} \right) \exp \left( -\frac{{{{\bar{x}}}_{pr}}^{2}\sigma _{ms}^{2}+\bar{x}_{ms}^{2}\sigma _{pr}^{2}}{2\sigma _{pr}^{2}\sigma _{ms}^{2}} \right) \times \\ \exp \left( \frac{{{({{{\bar{x}}}_{pr}}\sigma _{ms}^{2}+{{{\bar{x}}}_{ms}}\sigma _{pr}^{2})}^{2}}}{2\sigma _{pr}^{2}\sigma _{ms}^{2}(\sigma _{pr}^{2}+\sigma _{ms}^{2})} \right) = {{C}_{2}}\exp \left( -\frac{{{(x-{{{\bar{x}}}_{pos}})}^{2}}}{2\sigma _{pos}^{2}} \right) . \end{aligned}$$
(12)

Correspondingly, when we solve the backward problem of extracting extrinsic information from the posterior PDF, we obtain:

$$\begin{aligned} \begin{aligned} \sigma _{ms}^{2}&=\frac{\sigma _{pr}^{2}\sigma _{pos}^{2}}{\sigma _{pr}^{2}-\sigma _{pos}^{2}}, \\ {{{\bar{x}}}_{ms}}&=-{{{\bar{x}}}_{pr}}\frac{\sigma _{pos}^{2}}{\sigma _{pr}^{2}-\sigma _{pos}^{2}}+{{{\bar{x}}}_{pos}}\frac{\sigma _{pr}^{2}}{\sigma _{pr}^{2}-\sigma _{pos}^{2}}. \end{aligned} \end{aligned}$$
(13)
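
A small numerical sketch of the forward combining (11) and the backward extrinsic extraction (13) illustrates that the two operations are mutually inverse; the numbers below are arbitrary.

```python
def combine_gaussians(x_pr, var_pr, x_ms, var_ms):
    """Posterior mean and variance of a Gaussian parameter, Eq. (11)."""
    var_pos = var_pr * var_ms / (var_pr + var_ms)
    x_pos = x_pr * var_ms / (var_pr + var_ms) + x_ms * var_pr / (var_pr + var_ms)
    return x_pos, var_pos

def extract_extrinsic(x_pr, var_pr, x_pos, var_pos):
    """Recover the 'measurement' (extrinsic) part from prior and posterior, Eq. (13)."""
    var_ms = var_pr * var_pos / (var_pr - var_pos)
    x_ms = -x_pr * var_pos / (var_pr - var_pos) + x_pos * var_pr / (var_pr - var_pos)
    return x_ms, var_ms

# Round trip with arbitrary numbers: extrinsic extraction undoes the combining.
x_pos, var_pos = combine_gaussians(x_pr=0.3, var_pr=1.0, x_ms=1.1, var_ms=0.5)
print(extract_extrinsic(0.3, 1.0, x_pos, var_pos))   # ~ (1.1, 0.5)
```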

From this point, we can continue with the description of iterative detection. The block diagram of the proposed algorithm is shown in Fig. 1.

Fig. 1. Iterative detection using MMSE kernel.

The MMSE detector computes the estimate \({{\hat{\mathbf {x}}}_{\text {MMSE}}}\) and the solution variance \(\varvec{\sigma }_{\text {MMSE}}^{2}\) based on the a priori data obtained in the previous iteration, where the a priori data are defined by the mean value \({{\bar{\mathbf {x}}}_{pr}}\) and variance \(\varvec{\sigma }_{pr}^{2}\). The MMSE detector produces the estimate according to:

$$\begin{aligned} \begin{aligned} {{{\hat{\mathbf {x}}}}_{\text {MMSE}}}&={{{\bar{\mathbf {x}}}}_{pr}}+{\mathbf {g}_{\text {MMSE}\_\Pr }}\left( \mathbf {y}-\mathbf {H}{{{\bar{\mathbf {x}}}}_{pr}}\right) , \\ {\mathbf {g}_{\text {MMSE}\_\Pr }}&= {\mathbf {V}_{pr}} {\mathbf {H}^{H}} {{\left( \mathbf {H}{\mathbf {V}_{pr}}{\mathbf {H}^{H}}+\sigma _{\eta }^{2}\mathbf {I}\right) }^{-1}}, \\ {\mathbf {V}_{pr}}&=diag(\varvec{\sigma }_{pr}^{2}), \\ \varvec{\sigma }_{\text {MMSE}}^{2}&=diag({\mathbf {V}_{\text {MMSE}}}) = diag({\mathbf {V}_{pr}}-{\mathbf {g}_{\text {MMSE}\_\Pr }}\mathbf {H}{\mathbf {V}_{pr}}). \end{aligned} \end{aligned}$$
(14)

Note that in (14) \(\varvec{\sigma }_{\text {MMSE}}^{2}\) and \(\varvec{\sigma }_{pr}^{2}\) are vectors, \({\mathbf {V}_{pr}}\) is a diagonal matrix containing the variances \(\varvec{\sigma }_{pr}^{2}\) of each layer of the estimated vector \(\mathbf {x}\), and correspondingly \(\varvec{\sigma }_{\text {MMSE}}^{2}\) is the vector of variances obtained from the diagonal of the matrix \({\mathbf {V}_{\text {MMSE}}}\). Treating the MMSE estimate as a posterior PDF, which is assumed to be Gaussian, and recalling (13), we can extract the extrinsic data to be used as a new observation in the next processing step. Thus, following (13), we define each component m of the vectors \({{\bar{\mathbf {x}}}_{ms}}\), \(\varvec{\sigma }_{ms}^{2}\):

$$\begin{aligned} \begin{aligned} {{{\bar{x}}}_{ms,m}}&=-\frac{\sigma _{\text {MMSE},m}^{2}}{\sigma _{pr,m}^{2}-\sigma _{\text {MMSE},m}^{2}}{{{\bar{x}}}_{pr,m}} + \frac{\sigma _{pr,m}^{2}}{\sigma _{pr,m}^{2}-\sigma _{\text {MMSE},m}^{2}}{{{\hat{x}}}_{\text {MMSE},m}}, \\ \sigma _{ms,m}^{2}&=\frac{\sigma _{pr,m}^{2}\sigma _{\text {MMSE},m}^{2}}{\sigma _{pr,m}^{2}-\sigma _{\text {MMSE},m}^{2}}. \end{aligned} \end{aligned}$$
(15)
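
A minimal sketch of the prior-aided MMSE step (14) followed by the extrinsic extraction (15) could look as follows; the explicit matrix inversion is kept only for clarity, and the function interface is an assumption.

```python
import numpy as np

def mmse_with_prior(y, H, x_pr, var_pr, sigma_eta2):
    """Prior-aided MMSE estimate (14) and per-layer extrinsic output (15); a sketch."""
    N = H.shape[0]
    V_pr = np.diag(var_pr)
    # g = V_pr H^H (H V_pr H^H + sigma_eta^2 I)^{-1}   (Eq. 14)
    G = V_pr @ H.conj().T @ np.linalg.inv(H @ V_pr @ H.conj().T + sigma_eta2 * np.eye(N))
    x_mmse = x_pr + G @ (y - H @ x_pr)
    var_mmse = np.real(np.diag(V_pr - G @ H @ V_pr))
    # Extrinsic ("new measurement") part of the estimate, Eq. (15)
    denom = var_pr - var_mmse                 # assumed positive in this sketch
    x_ms = (-var_mmse * x_pr + var_pr * x_mmse) / denom
    var_ms = var_pr * var_mmse / denom
    return x_ms, var_ms
```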

One can observe that (15) is equivalent to the unbiased MMSE estimation defined by (7). The soft QAM demapper produces Log-Likelihood Ratio (LLR) estimates based on the obtained \({{\bar{\mathbf {x}}}_{ms}}\), \(\varvec{\sigma }_{ms}^{2}\):

$$\begin{aligned}&L({{b}_{m,l}})=\ln \left( \frac{\sum \limits _{{{s}_{1}}\in s({{b}_{m,l}}=1)}{\exp (-\frac{{{({{{\bar{x}}}_{ms,m}}-{{s}_{1}})}^{2}}}{2\sigma _{ms,m}^{2}})}}{\sum \limits _{{{s}_{0}}\in s({{b}_{m,l}}=0)}{\exp (-\frac{{{({{{\bar{x}}}_{ms,m}}-{{s}_{0}})}^{2}}}{2\sigma _{ms,m}^{2}})}} \right) , \end{aligned}$$
(16)

where \(s(b_{m,l}=1,0)\) is the set of all symbols of the QAM constellation S of order L with bit \(b_{m,l} = 1,0\) in the l-th position of the m-th symbol, \(l \in (1,\ldots , L)\), \(m\in (1,\ldots , M)\).
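
For a single layer, the demapping rule (16) can be sketched as below; the bit labelling table bit_map is an assumed input that enumerates the bits of each constellation point.

```python
import numpy as np

def qam_llrs(x_ms, var_ms, constellation, bit_map):
    """Bit LLRs for one layer per Eq. (16); a sketch with an assumed bit labelling table.

    constellation : array of the N_c symbols S_p
    bit_map       : N_c x L array, bit_map[p, l] is bit b_l of symbol S_p
    """
    # Gaussian metric exp(-(x_ms - s)^2 / (2 sigma_ms^2)); |.| keeps it valid for complex inputs.
    metric = np.exp(-np.abs(x_ms - constellation) ** 2 / (2.0 * var_ms))
    llrs = np.empty(bit_map.shape[1])
    for l in range(bit_map.shape[1]):
        llrs[l] = np.log(metric[bit_map[:, l] == 1].sum() / metric[bit_map[:, l] == 0].sum())
    return llrs
```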

The LLR values computed in (16) are directed to the output when the iteration loop is over. Otherwise, the data are used as an input to the re-modulation (or re-mapping) processing module. The re-modulator produces soft symbol estimates based on the bit probabilities:

$$\begin{aligned} \begin{aligned} {{{\bar{x}}}_{ps,m}}&=\sum \limits _{S}{s({{b}_{m,1}},}\ldots ,{{b}_{m,L}})\prod \limits _{t=1}^{L}{\Pr ({{b}_{m,t}})}, \\ \sigma _{ps,m}^{2}&=\sum \limits _{S}{|s({{b}_{m,1}},}\ldots ,{{b}_{m,L}}){{|}^{2}}\prod \limits _{t=1}^{L}{\Pr ({{b}_{m,t}})}-|{{{\bar{x}}}_{ps,m}}{{|}^{2}}, \end{aligned} \end{aligned}$$
(17)

where \(s(b_{m,1},\ldots , b_{m,L})\) is the m-th QAM symbol and

$$\begin{aligned} \Pr (b_{m,t}) = \frac{\exp \left( (-1)^{b_{m,t}+1}\frac{L_{m,t}}{2}\right) }{\exp \left( \frac{L_{m,t}}{2}\right) + \exp \left( \frac{-L_{m,t}}{2}\right) }. \end{aligned}$$

Summation in (17) is performed over all constellation points (i.e. all bit combinations) of layer m. Note that (17) is equivalent to (5)–(6). If it is not the last iteration, we can calculate symbol probabilities instead of bit probabilities, in the same way as was done in OSIC. It is important that (17), as well as (5), can provide different variances for the in-phase and quadrature components of the soft QAM symbols. For this reason, the MMSE solutions at all iterations except the first one must be obtained for the real-valued signal equation. Therefore, the complex matrix \(\mathbf {H}\) is substituted by its real-valued representation

$$\begin{aligned} \mathbf {H}_{\mathbb {R}}=\left[ \begin{array}{cc} \mathrm {Re}(\mathbf {H}) &{} -\mathrm {Im}(\mathbf {H}) \\ \mathrm {Im}(\mathbf {H}) &{} \mathrm {Re}(\mathbf {H}) \end{array} \right] ,\quad \mathbf {y}_{\mathbb {R}}=\left[ \begin{array}{c} \mathrm {Re}(\mathbf {y}) \\ \mathrm {Im}(\mathbf {y}) \end{array} \right] ,\quad \mathbf {x}_{\mathbb {R}}=\left[ \begin{array}{c} \mathrm {Re}(\mathbf {x}) \\ \mathrm {Im}(\mathbf {x}) \end{array} \right] , \end{aligned}$$

and the received and transmitted vectors are replaced by their stacked real and imaginary parts. The same procedure should be done in the MMSE OSIC algorithm after cancellation of the first layer.
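
The re-modulation step (17), together with the bit probability expression above, can be sketched for one layer as follows; the inputs follow the same illustrative conventions as the demapper sketch after (16).

```python
import numpy as np

def remodulate(llrs, constellation, bit_map):
    """Soft symbol mean and variance from bit LLRs for one layer, Eq. (17); a sketch."""
    # Pr(b = 1) from the LLR expression given after Eq. (17); Pr(b = 0) = 1 - Pr(b = 1).
    p1 = 1.0 / (1.0 + np.exp(-llrs))
    p_bits = np.where(bit_map == 1, p1, 1.0 - p1)   # N_c x L matrix of per-bit probabilities
    p_sym = np.prod(p_bits, axis=1)                 # probability of each constellation point
    x_ps = np.sum(constellation * p_sym)
    var_ps = np.sum(np.abs(constellation) ** 2 * p_sym) - np.abs(x_ps) ** 2
    return x_ps, var_ps
```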

Considering \({{\bar{\mathbf {x}}}_{ps}}\) and \(\varvec{\sigma }_{ps}^{2}\) as a posterior PDF with a priori data \({{\bar{\mathbf {x}}}_{ms}}\), \(\varvec{\sigma }_{ms}^{2}\), we can extract the extrinsic data by applying (13):

$$\begin{aligned} \begin{aligned} {{{\bar{x}}}_{pr,m}}&=-\frac{\sigma _{ps,m}^{2}}{\sigma _{ms,m}^{2}-\sigma _{ps,m}^{2}}{{{\bar{x}}}_{ms,m}}+ \frac{\sigma _{ms,m}^{2}}{\sigma _{ms,m}^{2}-\sigma _{ps,m}^{2}}{{{\bar{x}}}_{ps,m}}, \\ \sigma _{pr,m}^{2}&=\frac{\sigma _{ps,m}^{2}\sigma _{ms,m}^{2}}{\sigma _{ms,m}^{2}-\sigma _{ps,m}^{2}}. \end{aligned} \end{aligned}$$
(18)

These data are used in the MMSE detection of the next iteration cycle. In the first iteration it is assumed that \(\sigma _{pr,m}^{2}=\sigma _{{{s}_{m}}}^{2}\) and \({{\bar{x}}_{pr,m}}=0\), so the MMSE filter works in the same way as in the conventional MMSE detector.

To avoid negative variances in (18), it is necessary to check the condition \(\sigma _{ps,m}^{2}<\sigma _{ms,m}^{2}\). If it does not hold, the iteration for that particular layer does not improve the accuracy, and the result should be discarded, i.e. in the next iteration the a priori data for layer m remain the same as in the iteration before. In most cases the proposed method, which we call “Turbo MMSE”, requires just two iterations; further iterations do not bring noticeable performance improvement. Each iteration of Turbo MMSE consists of the following steps: computing the MMSE filter, obtaining the MMSE estimate, extracting the unbiased MMSE estimate, performing re-modulation and extracting the a priori data, and subtracting the a priori data from the received vector. Taking into account that the first three steps are the same as in conventional MMSE, while the last two are quite simple and performed only once, we conclude that the complexity of the proposed scheme is 2–2.5 times that of MMSE.
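
Putting the pieces together, one possible organization of the whole loop is sketched below; mmse_with_prior, qam_llrs and remodulate are the illustrative helpers from the previous sketches, and the switch to the real-valued representation after the first iteration, as well as the separate in-phase/quadrature variances, is omitted for brevity.

```python
import numpy as np

def turbo_mmse(y, H, constellation, bit_map, sigma_s2, sigma_eta2, n_iter=2):
    """Illustrative Turbo-MMSE loop; two iterations are usually enough (see the text)."""
    M = H.shape[1]
    x_pr = np.zeros(M, dtype=complex)        # first iteration: zero a priori mean ...
    var_pr = np.full(M, float(sigma_s2))     # ... and full signal variance, as in plain MMSE
    llrs = np.zeros((M, bit_map.shape[1]))
    for _ in range(n_iter):
        # Steps 1-3: prior-aided MMSE filter/estimate and extrinsic output, Eqs. (14)-(15).
        x_ms, var_ms = mmse_with_prior(y, H, x_pr, var_pr, sigma_eta2)
        # Soft demapping per layer, Eq. (16).
        llrs = np.array([qam_llrs(x_ms[m], var_ms[m], constellation, bit_map)
                         for m in range(M)])
        # Steps 4-5: re-modulation (17) and extrinsic a priori data for the next pass (18).
        for m in range(M):
            x_ps, var_ps = remodulate(llrs[m], constellation, bit_map)
            if var_ps < var_ms[m]:           # otherwise keep the previous prior for layer m
                d = var_ms[m] - var_ps
                x_pr[m] = (-var_ps * x_ms[m] + var_ms[m] * x_ps) / d
                var_pr[m] = var_ps * var_ms[m] / d
    return llrs                              # LLRs of the last iteration go to the output
```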

5 Simulation Results

Figures 2, 3, and 4 demonstrate the performance of the developed algorithm. Simulations were performed for the LTE-A downlink model with an \(8 \times 8\) MIMO configuration. Three Modulation and Coding Schemes (MCS) are presented: MCS7 (4QAM, Coding Rate (CR) ), MCS15 (16QAM, CR ) and MCS21 (64QAM, CR ). We use the Extended Pedestrian A (EPA) channel model [17] with low correlation between channels and assume perfect channel knowledge at the receiver. 1000 channel realizations are used in the simulation, each with a duration equal to one sub-frame. The transmitted data are distributed over 100 resource blocks and the system bandwidth is 20 MHz (1200 sub-carriers). The number of iterations in the turbo decoder is 4. The transmission scheme assumes no automatic repeat request procedures.

Fig. 2. BLER Performance in \(8 \times 8\) LTE downlink, MCS7 (4QAM).

Fig. 3. BLER Performance in \(8 \times 8\) LTE downlink, MCS15 (16QAM).

Fig. 4. BLER Performance in \(8 \times 8\) LTE downlink, MCS21 (64QAM).

For comparison, we provide results for a few reference algorithms: MMSE, ML, and MMSE OSIC with soft output (using the full covariance matrix (5)). The ML algorithm was realized as the QRD-M scheme [6] with a very large number of symbol candidates (parameter \(M=1000\)). It can be seen that the proposed iterative procedure outperforms MMSE OSIC, and both schemes outperform MMSE. There is an expected gap to the ML algorithm, but taking into account that the proposed scheme has complexity comparable with MMSE and allows parallel layer processing, it can be considered an attractive performance/complexity trade-off.

6 Conclusion

In this paper, a new iterative detection approach based on an MMSE kernel and parallel interference cancellation is presented. The developed algorithm uses the principle of turbo processing at its core. Unlike other known turbo applications, however, we do not need to include the decoder in the feedback loop: the additional information is obtained through non-linear processing in the demodulator. We show that the proposed method demonstrates improved performance compared with conventional MMSE and MMSE-based OSIC methods, while its complexity is only 2–2.5 times that of MMSE.