
1 Introduction

Multiple-input multiple-output (MIMO) technology has become one of the most significant techniques in wireless communication systems. The attractions of MIMO are its high performance gain and a channel capacity that increases linearly with the number of transmitter and receiver antennas, owing to parallel data transmission over multipath channels in the same frequency band [1]. The key to the implementation of a MIMO system is the signal detection algorithm. Although many reliable signal detection methods for MIMO systems have been proposed, researchers still need more efficient detection schemes to improve the performance of current systems and to support the massive and super-massive MIMO systems expected in the future. Finding detection algorithms that better balance complexity against performance is therefore a hot topic in MIMO research.

Many signal detection algorithms for MIMO systems are available. The maximum likelihood (ML) detector is the optimal detector, minimizing the bit error rate (BER), but it also has the highest computational complexity [2]. Zero-forcing (ZF) and minimum mean square error (MMSE) algorithms are typical linear detection schemes with much lower complexity but worse BER performance than ML [3]. Zero-forcing with successive interference cancellation (ZF-SIC) [4] and MMSE with successive interference cancellation (MMSE-SIC) [5] improve ZF and MMSE, respectively, by cancelling the contribution of previously detected symbols to decrease the BER. The radial basis function network optimized by a quantum genetic algorithm (QGA-RBF) [6] and the quantum ant colony algorithm (QACA) [7] are intelligent optimization algorithms that shrink the search region containing the optimal solution, but their computational complexity grows with the population size of the QGA. Although many detection algorithms exist, few of them apply machine learning methods to MIMO signal detection, and none treats the problem as a pattern recognition or feature classification problem, even though this kind of problem transformation may become necessary for processing the high-dimensional received data of future massive MIMO systems.

Deep learning and machine learning algorithms have been successfully applied to feature extraction and classification tasks over the past decades, and the extreme learning machine (ELM) stands out for its fast training speed, good generalization and universal approximation capability [8, 9]. In this paper, a novel signal detection algorithm based on the extreme learning machine auto-encoder (ELM-AE) [10] is proposed. The proposed algorithm consists of two parts: a feature extractor based on the ELM-AE, which obtains useful feature representations of the input samples by separating the channel state information through the projection of the connection weights learnt in the unsupervised training of the ELM-AE, and a trained ELM classifier, which recognizes the original symbols corresponding to those features. Thanks to the fast learning speed and high classification accuracy of ELM, simulation results indicate that the proposed algorithm is more efficient than ML and QGA-RBF and outperforms the ZF, MMSE, ZF-SIC and MMSE-SIC algorithms.

The rest of this paper is organized as follows. Section 2 introduces the MIMO system model and the ELM and ELM-AE algorithms. The proposed signal detection algorithm is described in Sect. 3. Section 4 presents the simulation results. Finally, conclusions are drawn in Sect. 5.

2 MIMO System, ELM and ELM Auto-Encoder

2.1 MIMO System

This paper investigates a point-to-point MIMO system with N transmitter antennas and M receiver antennas, where N ≤ M. The structure of the MIMO wireless system is presented in Fig. 1.

Fig. 1.

The block diagram of the MIMO system. The system includes three parts: transmitter, channel and receiver. x denotes the transmitted signals, y the received symbols and H_s the channel state.

As shown in Fig. 1, the input-output relationship of the MIMO system can be expressed in vector form as follows:

$$ y = H_{s} \cdot x + n. $$
(1)

where \( \varvec{y} = \left[ y_{1}, y_{2}, \ldots, y_{M} \right]^{T} \) is the received signal vector, \( \varvec{x} = \left[ x_{1}, x_{2}, \ldots, x_{N} \right]^{T} \) is the corresponding transmitted vector, \( H_{s} \) denotes an M × N channel state matrix and n is the additive white Gaussian noise with zero mean and variance σ² [11].

In a MIMO system, the received symbol on every receiving antenna contains contributions from all transmitted symbols [12], as presented in Eq. (2):

$$ y_{k} = \sum\limits_{i = 1}^{N} {\left( {h_{k,i} \cdot x_{i} } \right)} + n_{k} $$
(2)

where h_{k,i} denotes the channel gain from the i-th transmitting antenna to the k-th receiving antenna, and n_k represents the additive white Gaussian noise at the k-th receiving antenna.

Signal detection in a MIMO system aims to obtain, from the received symbols y, the solution with the minimum difference from the source signal x.

Maximum likelihood (ML) detection is the optimal detection algorithm; it searches the entire solution space Φ to find the solution that minimizes the cost function

$$ \hat{x} = \mathop {\arg \min }\limits_{x \in \Phi} \left\{ \left\| y - H_{s} \cdot x \right\|^{2} \right\}. $$
(3)

On account of its high complexity, suboptimal detection schemes are required. Thus linear detection methods (ZF and MMSE) and nonlinear detection methods (QGA-RBF, QACA, etc.) have been designed based on optimal detection theory [7].
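To make the system model and the exhaustive ML search concrete, the following minimal sketch simulates Eq. (1) for a real-valued 4 × 4 BPSK link (the setting later used in Sect. 4) and implements the search of Eq. (3). The i.i.d. Gaussian channel, the SNR value and all function names are illustrative assumptions rather than part of the original formulation.

```python
import itertools

import numpy as np


def ml_detect(y, H, constellation=(-1.0, 1.0)):
    """Exhaustive ML search of Eq. (3): evaluate every candidate vector x in the
    solution space and keep the one minimizing ||y - H x||^2."""
    N = H.shape[1]
    best_x, best_cost = None, np.inf
    for cand in itertools.product(constellation, repeat=N):
        x = np.asarray(cand)
        cost = np.linalg.norm(y - H @ x) ** 2
        if cost < best_cost:
            best_x, best_cost = x, cost
    return best_x


# Illustrative 4 x 4 real-valued BPSK link following Eq. (1): y = H_s x + n
rng = np.random.default_rng(0)
N, M, snr_db = 4, 4, 10
H_s = rng.normal(size=(M, N))                  # assumed i.i.d. Gaussian channel gains
x = rng.choice([-1.0, 1.0], size=N)            # BPSK source symbols
sigma2 = 10 ** (-snr_db / 10)                  # noise variance for unit symbol energy
n = rng.normal(scale=np.sqrt(sigma2), size=M)  # AWGN with zero mean and variance sigma^2
y = H_s @ x + n
print(ml_detect(y, H_s), x)                    # detected vs. transmitted symbols
```

For BPSK the exhaustive search enumerates 2^N candidates, which is why suboptimal detectors become attractive as the number of antennas grows.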

2.2 Original Extreme Learning Machine

In order to train single-hidden-layer feedforward neural networks (SLFNs), Huang [8, 13] proposed a fast learning algorithm called the extreme learning machine (ELM). The structure of the original ELM is shown in Fig. 2.

Fig. 2.

Structure of ELM with additive hidden nodes: a_i is the total input of the i-th hidden neuron, and the dotted lines denote the multi-classification case. x_j is the j-th input data and O_{ji} represents the i-th output.

The first-layer parameters of the ELM are randomly generated and do not need to be fine-tuned, and the output weights are obtained by Eq. (8). Huang has proved that ELM has the same solution formula in the binary classification, multi-classification and regression cases [14]; thus ELM takes the generic form shown in Fig. 2 for both classification and regression. The original ELM model is described as follows:

  1. Define the parameters of the ELM. Suppose \( \{ (x_{k}, t_{k}) \,|\, x_{k} \in R^{d},\; t_{k} \in R^{m},\; k = 1, \ldots, N \} \) is the training set, where x_k is the k-th training vector, t_k is the corresponding target output (label), d and m denote the dimensions of the training samples and labels respectively, and N is the number of training samples. w_{ij} is defined as the random connection weight between the i-th input neuron and the j-th hidden neuron, and b_j is the bias of the j-th hidden neuron; both are randomly generated from a Gaussian distribution. g is the activation function of the hidden layer, typically chosen by the user.

  2. Calculate the output feature representation matrix H of the hidden layer:

    $$ H = \left[ {\begin{array}{*{20}c} {h_{1} \left( x \right)} \\ \vdots \\ {h_{L} \left( x \right)} \\ \end{array} } \right]^{T} = \left[ {\begin{array}{*{20}c} {h_{1} \left( {x_{1} } \right)} & \cdots & {h_{1} \left( {x_{N} } \right)} \\ \vdots & \ddots & \vdots \\ {h_{L} \left( {x_{1} } \right)} & \cdots & {h_{L} \left( {x_{N} } \right)} \\ \end{array} } \right]. $$
    (4)
    $$ {\text{Where}}\,\,\,\,h_{j} \left( {x_{n} } \right) = g\left( {\sum\limits_{i} {x_{n} \left( i \right) \cdot w_{i,j} } + b_{j} } \right). $$
    (5)

    and i = 1, …, d, j = 1, …, L, where L is the number of hidden neurons [15].

  3. Calculate the output weights. The goal of training is to obtain a weight matrix β which satisfies the equation:

    $$ T = \beta \cdot \, H $$
    (6)

    where

    $$ T = \left[ {\begin{array}{*{20}c} {t_{1} } \\ \vdots \\ {t_{N} } \\ \end{array} } \right]^{T} = \left[ {\begin{array}{*{20}c} {t_{11} } & \cdots & {t_{1N} } \\ \vdots & \ddots & \vdots \\ {t_{m1} } & \cdots & {t_{mN} } \\ \end{array} } \right] $$
    (7)

    is the target matrix (labels). Then we can obtain the output weight matrix

    $$ \beta = T\cdot{\text{ H}}^{\dag } . $$
    (8)

    where T = [t_1, …, t_N] and \( H^{\dag} \) is the Moore–Penrose generalized inverse of matrix H. Typically the MP inverse can be computed efficiently with the orthogonal projection method [15]: \( H^{\dag} = (H^{T} H)^{-1} H^{T} \) when \( H^{T} H \) is nonsingular, and \( H^{\dag} = H^{T} (H H^{T})^{-1} \) when \( H^{T} H \) is singular. According to [14], if a positive value C is added to the diagonal of \( H^{T} H \) or \( H H^{T} \), the solution becomes more stable and has better generalization performance, based on ridge regression theory [16]. Thus the modified β is

    $$ \beta = T \cdot H^{T} \left( {\frac{I}{C} + HH^{T} } \right)^{ - 1} . $$
    (9)
    $$ {\text{or}}\,\,\,\beta = T \cdot \left( {\frac{I}{C} + H^{T} H} \right)^{ - 1} H^{T} . $$
    (10)
  4. Apply the trained ELM to classification, regression, etc. The trained output weights β, together with the random connection weights w and biases b, are the parameters used in application. If the testing data set is \( \{ S_{i} \,|\, S_{i} \in R^{d}, i = 1, \ldots, N_{s} \} \), the corresponding output of the ELM, using the solution of Eq. (9), is as follows; a minimal code sketch of the whole training procedure is given after Eq. (11).

    $$ f\left( S \right) = h\left( S \right) \cdot \beta = g\left( {w \cdot S + b} \right) \cdot TH^{T} \left( {\frac{I}{C} + HH^{T} } \right)^{ - 1} . $$
    (11)
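As a concrete illustration of steps (1)–(4), the sketch below trains a minimal ELM with a ridge-regularized output weight solution in the spirit of Eq. (10). Samples are stored as rows, so the expressions are the transposed counterparts of Eqs. (6)–(10), and the function names, the tanh activation and the default values of L and C are assumptions made only for this example.

```python
import numpy as np


def elm_train(X, T, L=120, C=1.0, seed=0):
    """Minimal ELM of Sect. 2.2: random hidden layer, ridge-regularized output weights.

    X: (num_samples, d) input matrix, T: (num_samples, m) target/label matrix.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(size=(d, L))      # random input weights, Gaussian
    b = rng.normal(size=L)           # random hidden biases
    H = np.tanh(X @ W + b)           # hidden-layer outputs, Eq. (5)
    # Output weights: samples-as-rows form of Eq. (10), beta = (I/C + H^T H)^{-1} H^T T
    beta = np.linalg.solve(np.eye(L) / C + H.T @ H, H.T @ T)
    return W, b, beta


def elm_predict(X, W, b, beta):
    """Network output of Eq. (11) for test samples X (one sample per row)."""
    return np.tanh(X @ W + b) @ beta
```

For classification, the predicted class of a sample is simply the index of the largest network output, which is exactly how the classifier is used in the detector of Sect. 3.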

2.3 ELM Auto-Encoder

The auto-encoder (AE) is a representative unsupervised deep learning method; typically an AE is used for feature extraction from unlabeled input data [17], and it can reduce the redundancy of the input data. In addition, a multilayer or deep hierarchical structure can be built by stacking AEs on top of each other [18].

The ELM auto-encoder (ELM-AE) is an auto-encoder built on the random projection and fast learning speed of ELM; it can be seen as a special case of ELM in which the target output is the input itself [10]. The ELM-AE consists of three layers, as shown in Fig. 3.

Fig. 3.

The ELM-based auto-encoder consists of input, hidden and output layers; w and b are the random connection weights and biases, and a′ is the trained output connection weight matrix.

The working process of the ELM-AE is the same as that of the ELM shown in Fig. 2. There are connection weights w and a′ between adjacent layers and biases b in the hidden layer.

The input weights w and biases b of the ELM-AE are randomly generated in the same way as in ELM. As shown in Fig. 3, the input data x is first mapped to the L-dimensional ELM random feature space and then transformed into a more stable and generalized feature space by the activation function g. Through the unsupervised learning of the ELM, the output weights a′ form a more stable and generalized projection of the input data than w; therefore (a′)^T is used as the input weights of the feature projection layer instead of w, as shown in Fig. 4.

Fig. 4.

The schematic diagram of the proposed detector based on ELM-AE for the MIMO system
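A compact sketch of the ELM-AE weight learning just described (formalized in Eqs. (12)–(14) of Sect. 3) might look as follows. Samples are stored as columns so that the code mirrors Eq. (13) directly; the function name, the tanh activation and the default parameter values are assumptions made for illustration only.

```python
import numpy as np


def elm_ae_weights(X, L=120, C=1.0, seed=0):
    """ELM auto-encoder: learn output weights a' that reconstruct the input
    from the random hidden code (Eqs. (12)-(13)). X holds one sample per column."""
    rng = np.random.default_rng(seed)
    d = X.shape[0]
    W = rng.normal(size=(L, d))      # random input weights
    b = rng.normal(size=(L, 1))      # random hidden biases
    h = np.tanh(W @ X + b)           # random projection code, Eq. (12)
    a_prime = X @ h.T @ np.linalg.inv(np.eye(L) / C + h @ h.T)  # output weights, Eq. (13)
    return a_prime, b


# a = a_prime.T is then used as the input weight of the feature projection
# layer of the detector, Eqs. (14)-(15).
```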

3 ELM Auto-Encoder for MIMO Signal Detection

In this paper the signal detection problem of the MIMO system is regarded as a classification or pattern recognition problem, so it is reasonable to solve it with machine learning algorithms. A novel detector for MIMO systems, based on unsupervised feature learning and classification via an ELM-based auto-encoder and an ELM classifier, is proposed; its schematic diagram is presented in Fig. 4. The detection algorithm is designed as follows:

  1. Train the ELM-based auto-encoder. The ELM-based auto-encoder is a special case of ELM, so its training process is the same as that of ELM: first the input weights and biases of the hidden layer [W, b] are randomly initialized from a Gaussian distribution, then the codes and the weights used to reconstruct the input data are obtained:

    $$ {\text{Random projection code}}: \boldsymbol{h} = g\left( \boldsymbol{W} \cdot \boldsymbol{x} + \boldsymbol{b} \right). $$
    (12)
    $$ {\text{The output weights}}:\, a' = \boldsymbol{x \cdot h^{T}} \left( {{\mathbf{I}}/{\text{C }} + \boldsymbol{h \cdot h}^{T} } \right)^{ - 1} . $$
    (13)
  2. Feature projection. The output weight matrix a′ of the ELM-AE serves as its output, and its transpose a is set as the input weight of the feature projection layer of the proposed detector:

    $$ \boldsymbol{a} = \, \left( {\boldsymbol{a'}} \right)^{T} . $$
    (14)

    The output H of the feature layer consists of the feature representations of the input data; they are more stable and generalized than the random projection code of the ELM-AE:

    $$ \boldsymbol{H} = g\left( {\boldsymbol{a}} \cdot \boldsymbol{x} + \boldsymbol{b} \right). $$
    (15)
  3. Train the ELM classifier. The next step, shown by the dotted arrow in Fig. 4, is to train the original ELM classifier once the representations of the input data x have been obtained. The training samples of this ELM are the features H, and their training labels are associated with x; they are the target output of the detector. The class label of x could be either the transmitted signal corresponding to the received signal x of the MIMO system or the class number of that transmitted signal; in this paper the latter is used.

  4. Test the detector. The feature extractor and the classifier are obtained from the training above. The proposed detector is then ready to extract the features of input samples and to classify those features into the corresponding classes, from which the information of the transmitted symbols is reconstructed.
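Putting the four steps together, a minimal end-to-end sketch of the proposed detector could look like the following. It reuses the elm_ae_weights, elm_train and elm_predict sketches given in Sect. 2, maps each transmitted BPSK vector to a class index (2^N = 16 classes for the 4 × 4 system of Sect. 4), and should be read as an illustrative reconstruction under those assumptions, not as the authors' reference implementation.

```python
import itertools

import numpy as np


def train_detector(Y_train, X_train, L=120, C=1.0):
    """Steps (1)-(3): train the ELM-AE on received vectors, project them through
    a = (a')^T (Eqs. (14)-(15)), then train an ELM classifier on the features."""
    N = X_train.shape[1]
    candidates = np.array(list(itertools.product([-1.0, 1.0], repeat=N)))
    # Class index of each transmitted vector (its position in the candidate list).
    labels = np.array([int(np.argmin(np.abs(candidates - x).sum(axis=1))) for x in X_train])
    T = np.eye(len(candidates))[labels]                   # one-hot class labels
    a_prime, b_ae = elm_ae_weights(Y_train.T, L=L, C=C)   # step (1), Eqs. (12)-(13)
    a = a_prime.T                                         # step (2), Eq. (14)
    F = np.tanh(a @ Y_train.T + b_ae).T                   # feature representations, Eq. (15)
    W, b, beta = elm_train(F, T, L=L, C=C)                # step (3): ELM classifier
    return candidates, (a, b_ae), (W, b, beta)


def detect(Y, candidates, ae_params, elm_params):
    """Step (4): extract features of the received vectors and map each one back
    to the transmitted symbol vector of its predicted class."""
    a, b_ae = ae_params
    F = np.tanh(a @ Y.T + b_ae).T
    scores = elm_predict(F, *elm_params)
    return candidates[np.argmax(scores, axis=1)]
```

In symbol-by-symbol mode, detect is called on one received vector at a time; passing many received vectors as the rows of Y performs block detection, which is presumably where the additional speed-up reported in Sect. 4.2 comes from.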

4 Simulation and Performance Evaluation

Computer simulations are conducted to investigate the performance of the proposed ELM-AE detector. The simulations are based on a simplified 4 × 4 point-to-point BPSK-modulated MIMO system, and the channel state H_s is assumed to be known.

4.1 Parameter Selection

In this section, some simulations have been conducted to search for the best parameters of the proposed detector.

Figure 5(a) shows the testing results for the positive value C and the number of hidden neurons L. In this simulation C and L are taken from \( \{10^{-9}, 10^{-8}, \ldots, 10^{0}, \ldots, 10^{7}, 10^{8}, 10^{9}\} \) and {10, 20, …, 440, 450} respectively. 'TestingRate' denotes the difference between the bit error rate (BER) of the ML detector and that of the proposed detector. The results indicate that the BER performance is better when C ranges from \( 10^{-3} \) to \( 10^{1} \) and L is larger than 100; in the following simulations L is set to 120 and C is set to 1/snr, where snr denotes the signal-to-noise ratio.

Fig. 5.

(a) The error-rate performance of the proposed detector compared to ML detection at different points (C, L) when Eb/N0 = 7; (b) the BER performance for different numbers of training data; (c) activation function testing.

Figure 5(b) shows the BER performance for different numbers of training data. Based on these results, the training set size is set to 2400 × 4 in the following experiments, under the premise that the performance is guaranteed.

This paper selects tanh as the activation function of the ELM and ELM-AE: as shown in Fig. 5(c), the hyperbolic tangent function tanh outperforms the sine function sin, the hard-limit function hardlim and the sigmoid function sig when the number of hidden layers is 1, the number of hidden neurons is L = 120 and the positive value is C = 1/snr.
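The parameter search described above can be sketched as the simple grid loop below; evaluate_ber stands in for the simulation machinery of Sect. 4 (training the detector and measuring its error rate, or its BER gap to ML, at a fixed Eb/N0), so the snippet only illustrates the search procedure, and the grid values merely echo the ranges listed above.

```python
def select_parameters(evaluate_ber):
    """Grid search over the regularization value C and hidden-layer size L.

    evaluate_ber(C, L) is assumed to train the detector of Sect. 3 with the
    given parameters and return its measured error rate at a fixed Eb/N0.
    """
    C_grid = [10.0 ** p for p in range(-9, 10)]   # 10^-9, ..., 10^9
    L_grid = range(10, 451, 10)                   # 10, 20, ..., 450
    best = min(((evaluate_ber(C, L), C, L) for C in C_grid for L in L_grid))
    return best[1], best[2]                       # best (C, L) pair
```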

4.2 Comparisons Between ELM-AE Detector and Other Detection Methods

In order to verify the performance of the proposed ELM-AE detector, this section compares it with the traditional detection algorithms ML, ZF and MMSE in Fig. 6(a), and with several state-of-the-art detection algorithms such as ZF-SIC, MMSE-SIC and QGA-RBF in Fig. 6(b).

Fig. 6.

(a) The performance curves of the traditional algorithms (ML, ZF, MMSE) and the proposed method for MIMO signal detection; (b) the performance curves of several state-of-the-art algorithms (ZF-SIC, MMSE-SIC, QGA-RBF) and the proposed method for MIMO signal detection.

Figure 6 and Table 1 show the bit error rate, the mean detection time and the mean error rates of these algorithms relative to the ML detector. It is evident from Fig. 6(a) that the proposed detector outperforms the ZF and MMSE detectors and reaches a performance close to that of the optimal ML detector. Figure 6(b) indicates that the proposed detector outperforms ZF-SIC and MMSE-SIC, and exceeds QGA-RBF detection when the SNR is above 9 dB. When detecting the symbols one by one, the detection time of the proposed algorithm is 0.9402 s, about 20 times faster than ML; the proposed algorithm also supports block detection, which increases the detection efficiency by a further 4.4 times.

Table 1. Mean detection time and error rates relative to the ML detector.

In addition, channel estimation is not required in the proposed algorithm, since the channel state is contained in the output connection weights learnt during the unsupervised learning of the ELM-AE.

5 Conclusion

In this paper, the extreme learning machine based auto-encoder is applied to signal detection in MIMO systems. A signal detection scheme based on ELM-AE is proposed, in which channel estimation is embedded in the learnt connection weights of the ELM-AE, and simulations of the detection performance are presented and analyzed. The simulation results show that the proposed scheme not only performs better than many linear and nonlinear detection schemes but also has much lower complexity than these methods, and its performance is very close to that of the optimal detection algorithm. Massive MIMO has attracted wide attention because its channel capacity is much higher than that of conventional MIMO systems, but its detection complexity is also much higher, which is a major obstacle to its application. In future work we will apply this detection scheme to massive MIMO systems and explore the possibility of further reducing the detection complexity.