1 Introduction

With the rising popularity of software defined radio (SDR), there is a strong demand for automatic detection of the modulation type and the signal parameters. SDR is a radio communication system in which the traditional radio components (e.g. amplifiers, mixers, filters and demodulators) are replaced by software. Such a design produces a system that can receive and transmit different radio modulations based solely on the software’s specification. Nowadays, SDR is becoming the dominant technology in radio communications.

The concept of the software defined radio has been extensively studied since the late nineties. From the beginning there has been a strong demand from the military for software defined radios capable of intercepting and decoding various radio transmissions. The ability to automatically detect the modulation type is crucial for monitoring wireless communication but also for effective signal jamming.

Automatic Modulation Classification (AMC) is an approach to identify the modulation type and its parameters such as the carrier frequency or symbol rate. AMC systems were first introduced in military applications and soon afterwards appeared in commercial ones. In military applications involving electronic warfare, AMC enables real-time signal interception and processing. In civil applications it can be used, e.g., by amateur radio operators to automatically set the transceiver to the appropriate modulation and communication protocol.

The automatic modulation detection algorithms are crucial parts of adaptive modulation techniques which are nowadays used in almost all wireless mobile cellular systems. Assuming the transmitter has the capability to choose the modulation (based on the changing environment and detected interference) it can significantly influence the quality of ongoing transmission. In such cases, the receiver must be able to detect the dynamically changing modulation [15, 16].

2 State of the Art

The AMC approaches can be divided into three main classes: statistical methods, decision theoretic methods and feature based methods. The first AMC techniques were based on calculating time domain parameters such as the amplitude, instantaneous frequency or phase shift of the signal [5, 8]. The subsequent signal classification can be based on an automatic histogram separation method [8], where the histogram is calculated from the instantaneous phase estimate. Other techniques rely on a time domain analysis, calculating envelope characteristics of the signal [2].

The majority of the decision theoretic methods are based on the assumption that the likelihood function of the observed waveform provides all necessary information for signal detection and estimation of transmission parameters and modulation. In these approaches, the classification problem is formulated as a multiple-hypothesis problem, where the modulation is selected from a candidate list by maximizing the likelihood ratio [6].

The statistical approaches exploit various properties of the signal to gain information about the analysed signal. More advanced techniques employ higher order correlation, cumulant based methods or higher order statistics [4]. One of the biggest disadvantages of the statistical approaches is their lower resistance to noise [1].

Neural network approaches involve experiments with different configurations, learning techniques and selection of optimal input parameters. A popular approach is to use a multi-layer perceptron network [10, 14]; however, there is a possibility that the training procedure will get stuck in a local optimum of the cost function. This can be avoided by using a radial basis function network [9].

A more detailed description of algorithms can be found in [11, 16] and a recent survey of AMC approaches was published in [13].

The algorithm proposed here combines the statistically driven approaches with neural networks. We will explore the two neural networks mentioned in this section, specifically the radial basis function network and the feed forward network.

3 Signal Model

Suppose the incoming waveform r(t) contains a signal s(t) with additive white Gaussian noise n(t):

$$\begin{aligned} r(t) = s(t) + n(t). \end{aligned}$$
(1)

The modulation changes a sine wave s(t) to encode information. The equation representing the sine wave can be written as:

$$\begin{aligned} s(t) = \sqrt{2 S a(t)}\cos (\omega t + \theta (t) + \varPhi _K), \end{aligned}$$
(2)

where S is the power of the signal, a(t) is its amplitude, \(\omega \) denotes the carrier frequency, \(\theta (t)\) the carrier phase and \(\varPhi _K\) the modulation phase. For phase-shift keying (PSK) modulation, \(\varPhi _K\) is a value from the set of complex phases \(\lbrace \varPhi _1, \varPhi _2, \varPhi _3, \ldots \rbrace \), denoted as the constellation of the modulation. The aim of the automatic modulation classifier is to extract features such as the carrier frequency, phase offsets and the I/Q origin offset in the constellation plane, and to use them to classify the type and order of modulation of signals from noisy environments.
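The signal model of Eqs. (1) and (2) can be illustrated with a minimal sketch. It assumes a constant amplitude a(t) = 1, equally spaced PSK phases \(2\pi k/M\) and a fixed noise standard deviation; the function names `psk_signal` and `add_awgn` and all parameter values are hypothetical and are not taken from the described implementation.

```python
import math
import random


def psk_signal(symbols, order, power, carrier_hz, sample_rate,
               samples_per_symbol, theta=0.0):
    """Sample s(t) of Eq. (2) for a PSK symbol sequence.

    Assumptions (not from the paper): a(t) = 1 and the constellation
    consists of `order` equally spaced phases 2*pi*k/order.
    """
    constellation = [2.0 * math.pi * k / order for k in range(order)]
    omega = 2.0 * math.pi * carrier_hz
    amplitude = 1.0  # a(t), constant for plain PSK in this sketch
    scale = math.sqrt(2.0 * power * amplitude)
    samples = []
    for i, symbol in enumerate(symbols):
        phi_k = constellation[symbol]
        for j in range(samples_per_symbol):
            t = (i * samples_per_symbol + j) / sample_rate
            samples.append(scale * math.cos(omega * t + theta + phi_k))
    return samples


def add_awgn(signal, noise_std, rng):
    """Return r(t) = s(t) + n(t) of Eq. (1) with white Gaussian n(t)."""
    return [s + rng.gauss(0.0, noise_std) for s in signal]


rng = random.Random(0)
symbols = [rng.randrange(4) for _ in range(16)]
s = psk_signal(symbols, order=4, power=2.0, carrier_hz=1000.0,
               sample_rate=8000.0, samples_per_symbol=8)
r = add_awgn(s, noise_std=0.1, rng=rng)
mean_power = sum(x * x for x in s) / len(s)
```

With one carrier period per symbol, as in this example, the mean power of the noiseless samples equals the parameter S.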

4 Selection of Statistic Features

The selection of the proper statistic features plays the key role in the success of the proposed automatic modulation classification algorithm. We need to select those statistical features that are most selective for different aspects of the processed signal. It is also necessary to maintain the diversity of the selected features in order to be able to capture all possible variations in modulations.

The first discussed feature \(\gamma _{max}\) depends on the variation of the amplitude. It is defined as a maximum value of the spectral power density of the normalized and centered instantaneous amplitude:

$$\begin{aligned} \gamma _{max}= & {} \frac{\max |\mathrm {DFT}(A_{cn})|^2}{N}, \qquad A_{cn}[n] = \frac{A[n]}{\mu _{A}} - 1, \end{aligned}$$

where A is the vector of the instantaneous amplitude and \(\mu _{A}\) is the arithmetic mean of the instantaneous amplitude [16].

The maximum value of the spectral power density \(\gamma _{max}\) provides a significant differentiation characteristic between two groups of radio transmissions: those that use the amplitude to encode the data and those that use other means (e.g. frequency changes). For the first group (AM and ASK) this parameter gives significantly higher values than for the second. On the borderline between these two groups we find the QAM modulations, which combine both ASK and PSK features. For the modulations based on frequency changes the value of \(\gamma _{max}\) is almost zero.
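The definition of \(\gamma _{max}\) can be sketched as follows. The direct \(O(N^2)\) DFT is used only to keep the example dependency-free, and the two test envelopes (a constant one and a sinusoidally modulated one) are illustrative assumptions rather than data from the experiments.

```python
import cmath
import math


def dft(x):
    """Direct discrete Fourier transform, adequate for short examples."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
            for m in range(n)]


def gamma_max(amplitude):
    """Peak of |DFT(A_cn)|^2 / N for the normalized, centered amplitude."""
    n = len(amplitude)
    mu_a = sum(amplitude) / n
    a_cn = [a / mu_a - 1.0 for a in amplitude]
    return max(abs(c) ** 2 for c in dft(a_cn)) / n


N = 64
constant_envelope = [1.0] * N  # e.g. an ideal FM signal
am_envelope = [1.0 + 0.5 * math.cos(2.0 * math.pi * 4 * n / N) for n in range(N)]

g_const = gamma_max(constant_envelope)
g_am = gamma_max(am_envelope)
```

For the constant envelope the feature is zero, while the amplitude-modulated envelope concentrates its energy in one spectral line and yields a clearly non-zero value.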

The standard deviation of the absolute value of the normalized and centered instantaneous amplitude (\(\sigma _{aa}\)) should provide a similar discrimination as \(\gamma _{max}\). The value of \(\sigma _{aa}\) is given by

$$\begin{aligned} \sigma _{aa}= & {} \sqrt{\frac{1}{N} \left( \sum \limits _{n = 1}^{N}A_{cn}^2[n]\right) - \left( \frac{1}{N} \sum \limits _{n = 1}^{N}|A_{cn}[n]|\right) ^2}, \end{aligned}$$
(3)

where N is the number of complex samples and \(A_{cn}[n]\) is a vector of normalized centered instantaneous amplitude [3, 11, 16].

The ability of \(\sigma _{aa}\) to differentiate between the amplitude and frequency modulations is almost the same as that of \(\gamma _{max}\). As our model does not detect the number of states in the digital communication, \(\sigma _{aa}\) may be considered redundant. Because both features are amplitude dependent, it is possible that the neural network might have problems discriminating between AM and ASK, or possibly between ASK, PSK and QAM.
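Equation (3) translates directly into code. The sketch below is a plain transcription of the formula; the two-level and four-level envelopes are idealized, noise-free assumptions chosen so that the result can be checked by hand.

```python
import math


def sigma_aa(amplitude):
    """Standard deviation of |A_cn[n]| as defined in Eq. (3)."""
    n = len(amplitude)
    mu_a = sum(amplitude) / n
    a_cn = [a / mu_a - 1.0 for a in amplitude]
    mean_square = sum(v * v for v in a_cn) / n
    mean_abs = sum(abs(v) for v in a_cn) / n
    # Guard against tiny negative values caused by rounding.
    return math.sqrt(max(mean_square - mean_abs ** 2, 0.0))


two_level = [0.5, 1.5] * 8                 # idealized 2-ASK envelope
four_level = [0.25, 0.75, 1.25, 1.75] * 4  # idealized 4-ASK envelope

s2 = sigma_aa(two_level)
s4 = sigma_aa(four_level)
```

For the two-level envelope \(|A_{cn}[n]|\) is constant and the feature vanishes, whereas the four-level envelope gives 0.25. This sensitivity to the number of amplitude states is what the feature adds over \(\gamma _{max}\).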

The standard deviation of the absolute value of the non-linear component of the instantaneous phase can be used to detect phase changes. It separates the hypothetical modulation space into two groups, the first represented by FM, FSK, PSK and QAM and the second by AM, ASK and BPSK modulations. Feature \(\sigma _{ap}\) should separate QAM modulations, but our experiments showed that it classifies BPSK modulations in the same way. The standard deviation of the absolute value of the non-linear component is defined as

$$\begin{aligned} \sigma _{ap}= & {} \sqrt{\frac{1}{N_c} \left( \sum \limits _{A_n[n]> A_t}^{}\phi _{NL}^2[n]\right) - \left( \frac{1}{N_c} \sum \limits _{A_n[n] > A_t}^{}|\phi _{NL}[n]|\right) ^2}, \end{aligned}$$
(4)

where \(N_c\) is the number of complex samples above the noise level and \(\phi _{NL}[n]\) is a vector of non-linear component of instantaneous phase [3, 11].

The standard deviation of the direct phase \(\sigma _{dp}\) helps to distinguish BPSK from other modulations not employing phase changes. It gives additional information to \(\sigma _{ap}\). The standard deviation of the non-linear component of the direct instantaneous phase is calculated as

$$\begin{aligned} \sigma _{dp}= & {} \sqrt{\frac{1}{N_c} \left( \sum \limits _{A_n[n]> A_t}^{}\phi _{NL}^2[n]\right) - \left( \frac{1}{N_c} \sum \limits _{A_n[n] > A_t}^{}\phi _{NL}[n]\right) ^2}. \end{aligned}$$
(5)

Both characteristics \(\sigma _{ap}\) and \(\sigma _{dp}\) are suitable for separating digital modulations with phase changes from the analog frequency modulations, including analog phase modulation.
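The difference between Eqs. (4) and (5) is only the absolute value in the second term, which the following sketch makes explicit. It assumes that the non-linear phase \(\phi _{NL}[n]\), the normalized amplitude \(A_n[n]\) and the threshold \(A_t\) are already available; the idealized BPSK and QPSK phase sequences are illustrative assumptions.

```python
import math


def phase_features(phi_nl, a_n, a_t):
    """Return (sigma_ap, sigma_dp) of Eqs. (4) and (5).

    Only samples whose normalized amplitude exceeds a_t are used.
    """
    strong = [p for p, a in zip(phi_nl, a_n) if a > a_t]
    n_c = len(strong)
    if n_c == 0:
        raise ValueError("no samples above the amplitude threshold")
    mean_square = sum(p * p for p in strong) / n_c
    mean_abs = sum(abs(p) for p in strong) / n_c
    mean_direct = sum(strong) / n_c
    sigma_ap = math.sqrt(max(mean_square - mean_abs ** 2, 0.0))
    sigma_dp = math.sqrt(max(mean_square - mean_direct ** 2, 0.0))
    return sigma_ap, sigma_dp


half_pi, quarter_pi = math.pi / 2.0, math.pi / 4.0

# Idealized BPSK phases plus one weak sample that must be ignored.
bpsk_phase = [half_pi, -half_pi] * 4 + [3.0]
bpsk_amp = [1.0] * 8 + [0.1]
ap_bpsk, dp_bpsk = phase_features(bpsk_phase, bpsk_amp, a_t=0.5)

# Idealized QPSK phases, all above the threshold.
qpsk_phase = [quarter_pi, 3 * quarter_pi, -quarter_pi, -3 * quarter_pi] * 2
ap_qpsk, dp_qpsk = phase_features(qpsk_phase, [1.0] * 8, a_t=0.5)
```

For the BPSK sequence \(\sigma _{ap}\) is zero because the absolute phase is constant, while \(\sigma _{dp}\) equals \(\pi /2\). For the QPSK sequence both features are non-zero.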

The standard deviation of the absolute value of the normalized and centered instantaneous frequency \(\sigma _{af}\) is sensitive to frequency changes and is primarily used to differentiate between the binary and multi-state phase modulations. The formula for calculating \(\sigma _{af}\) is as follows

$$\begin{aligned} \sigma _{af}= & {} \sqrt{\frac{1}{N_c} \left( \sum \limits _{A_n[n]> A_t}^{}f_{N}^2[n]\right) - \left( \frac{1}{N_c} \sum \limits _{A_n[n] > A_t}^{}|f_{N}[n]|\right) ^2}, \\ f_N[n]= & {} \frac{f_m[n]}{f_s}, \qquad f_m[n] = f[n] - \mu _f, \end{aligned}$$
(6)

where \(N_c\) is the number of complex samples above the noise level, f[n] is the vector of instantaneous frequency and \(\mu _f\) is the arithmetic average of the instantaneous frequency [3, 11, 16]. Experiments showed that this feature can also be useful for recognizing PSK and QAM; however, it is not suitable for classifying analog frequency modulations and phase modulations.
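A minimal sketch of Eq. (6) is given below. It assumes that the mean \(\mu _f\) is taken over all samples of the window, which the text does not specify, and it uses idealized two-tone and four-tone frequency sequences purely as a hand-checkable illustration of how the feature reacts to the number of frequency states.

```python
import math


def sigma_af(freq_hz, a_n, a_t, sample_rate):
    """Standard deviation of |f_N[n]| as defined in Eq. (6).

    Assumption: mu_f is the mean over all samples of the window.
    """
    mu_f = sum(freq_hz) / len(freq_hz)
    f_n = [(f - mu_f) / sample_rate
           for f, a in zip(freq_hz, a_n) if a > a_t]
    n_c = len(f_n)
    if n_c == 0:
        raise ValueError("no samples above the amplitude threshold")
    mean_square = sum(v * v for v in f_n) / n_c
    mean_abs = sum(abs(v) for v in f_n) / n_c
    return math.sqrt(max(mean_square - mean_abs ** 2, 0.0))


FS = 8000.0
two_tone = [9000.0, 11000.0] * 8                    # two frequency states
four_tone = [7000.0, 9000.0, 11000.0, 13000.0] * 4  # four frequency states
amplitude = [1.0] * 16

af_two = sigma_af(two_tone, amplitude, a_t=0.5, sample_rate=FS)
af_four = sigma_af(four_tone, amplitude, a_t=0.5, sample_rate=FS)
```

With two symmetric frequency states \(|f_N[n]|\) is constant and the feature is zero; with four states it becomes 0.125 for the values chosen here.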

The final feature we have used as an input to the neural network is the spectrum symmetry around the carrier frequency. The symmetry is calculated according to the formulas

$$\begin{aligned} P= & {} \frac{P_L - P_U}{P_L + P_U}, \qquad f_{cn} = \frac{f_{c}N}{f_s} - 1, \\ P_L= & {} \sum \limits _{n = 1}^{f_{cn}}|X_c[n]|^2, \qquad P_U = \sum \limits _{n = 1}^{f_{cn}}|X_c[n + f_{cn} + 1]|^2, \end{aligned}$$

where \(f_s\) is the sampling rate, \(f_c\) is the carrier frequency and \((f_{cn} + 1)\) is the sample position corresponding to the carrier frequency.

For symmetric modulations such as ASK, AM or QAM this value is close to zero; however, for asymmetric FM or PM modulations it shows how the energy is shifted in the frequency spectrum based on the transmitted information.
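The spectrum symmetry can be sketched as below. The example assumes that \(X_c\) is the DFT of the windowed signal, that n counts DFT bins with bin 0 being the DC component, and that \(f_{cn} = f_c N / f_s - 1\) with \(f_c\) the carrier frequency, so that the carrier falls into bin \(f_{cn} + 1\). These indexing choices are one reading of the formulas, not a confirmed detail of the implementation.

```python
import cmath
import math


def dft(x):
    """Direct discrete Fourier transform, adequate for short examples."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
            for m in range(n)]


def spectrum_symmetry(samples, carrier_hz, sample_rate):
    """Return P = (P_L - P_U) / (P_L + P_U) around the carrier bin."""
    n = len(samples)
    f_cn = round(carrier_hz * n / sample_rate) - 1
    if f_cn < 1 or 2 * f_cn + 1 >= n:
        raise ValueError("carrier bin leaves no room for both sidebands")
    power = [abs(c) ** 2 for c in dft(samples)]
    p_l = sum(power[k] for k in range(1, f_cn + 1))             # below carrier
    p_u = sum(power[k + f_cn + 1] for k in range(1, f_cn + 1))  # above carrier
    return (p_l - p_u) / (p_l + p_u)


N, FS, FC = 64, 64.0, 16.0
# Double-sideband AM: sidebands at 14 Hz and 18 Hz around the 16 Hz carrier.
am = [(1.0 + 0.5 * math.cos(2.0 * math.pi * 2.0 * n / FS))
      * math.cos(2.0 * math.pi * FC * n / FS) for n in range(N)]
# Carrier plus a single component above it, i.e. an asymmetric spectrum.
upper_only = [math.cos(2.0 * math.pi * FC * n / FS)
              + 0.5 * math.cos(2.0 * math.pi * 18.0 * n / FS) for n in range(N)]

p_am = spectrum_symmetry(am, FC, FS)
p_upper = spectrum_symmetry(upper_only, FC, FS)
```

The symmetric AM spectrum gives a value near zero, while the signal with energy only above the carrier gives a value near -1.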

5 Neural Network Classification

The classification process is based on a neural network implemented with the Encog library [7]. The input vector is composed of the set of characteristic features introduced in the previous section. In the proposed detector we use two types of neural networks. The first one is the most commonly used type, the Feed Forward Network (FFN). In this network information moves in only one direction, from the input nodes through the hidden nodes to the output nodes, without loops or cycles. The second type is the more sophisticated Radial Basis Function Network (RBN). The output of this network is a linear combination of radial basis functions of the inputs and neuron parameters. In our application, the RBN should be able to handle the noisy signal better than the simpler Feed Forward Network [12].
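The structural difference between the two network types can be shown with a minimal forward pass. This is not the Encog API; the tanh activation, the Gaussian basis functions and all weights, centers and widths are hypothetical values chosen for illustration.

```python
import math


def ffn_forward(x, w_hidden, b_hidden, w_out, b_out):
    """One hidden layer: input -> tanh hidden units -> linear outputs."""
    hidden = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w_out, b_out)]


def rbf_forward(x, centers, widths, w_out):
    """Outputs are linear combinations of Gaussian radial basis functions."""
    phi = [math.exp(-sum((a - c) ** 2 for a, c in zip(x, center))
                    / (2.0 * width * width))
           for center, width in zip(centers, widths)]
    return [sum(w * p for w, p in zip(row, phi)) for row in w_out]


x = [0.0, 0.0]  # hypothetical two-element feature vector

ffn_out = ffn_forward(
    x,
    w_hidden=[[0.5, -0.3], [0.8, 0.1]], b_hidden=[0.0, 0.0],
    w_out=[[1.0, 2.0], [-1.0, 0.5]], b_out=[0.1, -0.2],
)
rbf_out = rbf_forward(
    x,
    centers=[[0.0, 0.0], [1.0, 0.0]], widths=[1.0, 1.0],
    w_out=[[1.0, 0.0], [0.0, 1.0]],
)
```

In the FFN every hidden unit responds to a weighted sum of the whole input, whereas each RBN unit responds only to inputs close to its center.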

For both models we have used a supervised learning method based on backpropagation. The training was performed on a fully artificial data set for a given signal-to-noise ratio. The goal was to train the neural network to distinguish between seven types of modulation (AM, FM, PM, ASK, FSK, PSK and QAM). Each modulation is represented by one class in the output layer. The training set was composed of approximately 10 000 input vectors for each modulation. Samples with different modulations were created in the GNU Radio Companion (GRC) framework. The total size of the set is more than 11.5 GB of binary data.

The essential parameter of the input vector is the window size, determining the number of complex samples used for the features calculation. The longer the time period is (by increasing the window size), the more reliable information we obtain from the statistical evaluation. For our experiments we have used the window size set to 16384 complex samples. The input samples were normalized before the training phase.
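The windowing and normalization steps can be sketched as follows. The text does not state which normalization was applied, so the per-feature min-max scaling shown here is an assumption, as are the helper names `split_windows` and `min_max_normalize`.

```python
def split_windows(samples, window_size):
    """Cut a sample sequence into non-overlapping, complete windows."""
    last_start = len(samples) - window_size
    return [samples[i:i + window_size]
            for i in range(0, last_start + 1, window_size)]


def min_max_normalize(vectors):
    """Scale every feature (column) to the range [0, 1].

    Assumption: min-max scaling; the paper only says the inputs
    were normalized.
    """
    columns = list(zip(*vectors))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [[(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, lo, hi in zip(vector, lows, highs)]
            for vector in vectors]


# The paper uses 16384 complex samples per window; 4 keeps the demo small.
windows = split_windows(list(range(10)), window_size=4)
normalized = min_max_normalize([[1.0, 10.0], [3.0, 20.0], [2.0, 30.0]])
```

Incomplete trailing windows are dropped in this sketch, so every feature vector is computed from the same number of samples.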

The Encog library provides an option for cross-validation of the trained model. For this step we have divided our generated data set into two parts: 70 % was used for the training phase and 30 % for the validation step. The number of repetitions was set to 5. The cross-validation should assess how the results of a statistical analysis will generalize to an independent data set. For the RBN we have obtained a validation error of 9.79 %, while the Feed Forward Network had a validation error of 21.75 %.
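One way to read this validation procedure is as a repeated random 70/30 hold-out, sketched below. This does not reproduce Encog's own routine; the majority-class model is a deliberately trivial placeholder for the trained network, and the labelled data are synthetic.

```python
import random
from collections import Counter


def split_70_30(samples, rng, train_fraction=0.7):
    """Shuffle a copy of the data and split it into training/validation."""
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def train_majority(train):
    """Placeholder model: always predict the most frequent training label."""
    label = Counter(lbl for _, lbl in train).most_common(1)[0][0]
    return lambda features: label


def validation_error(model, validation):
    """Fraction of validation samples that the model labels incorrectly."""
    wrong = sum(1 for features, lbl in validation if model(features) != lbl)
    return wrong / len(validation)


def repeated_holdout(samples, train_fn, repetitions=5, seed=0):
    """Average the validation error over several random 70/30 splits."""
    rng = random.Random(seed)
    errors = []
    for _ in range(repetitions):
        train, validation = split_70_30(samples, rng)
        errors.append(validation_error(train_fn(train), validation))
    return sum(errors) / len(errors), errors


# Synthetic labelled feature vectors: 75 labelled "AM", 25 labelled "FM".
data = [([float(i)], "AM" if i % 4 else "FM") for i in range(100)]
mean_error, errors = repeated_holdout(data, train_majority)
train_part, validation_part = split_70_30(data, random.Random(1))
```

Averaging over several splits reduces the dependence of the reported error on one particular partition of the data.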

6 Implementation Details

The implementation of the proposed algorithm was done in C# and comprised highly optimized signal processing algorithms from the GNU Radio Companion (GRC) libraries. The flow chart of the algorithm is depicted in Fig. 1.

The application loads the deserialized model of the neural network and the content of the input file containing the I/Q data into memory. If filtering is enabled, the signal is passed through a band pass filter. Data are loaded into the processing window and the DFT is calculated. The DFT is used to detect the signal presence and to obtain basic signal features. The calculation of the characteristic signal features (as described in Sect. 4) is performed in parallel and the results are passed into the neural network (see Sect. 5). Classification results are printed on the console and saved to a file for later analysis.
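The parallel feature calculation can be illustrated with a small sketch. The original implementation is written in C#; the Python version below only shows the structure, and the two feature functions are hypothetical stand-ins for the features of Sect. 4.

```python
from concurrent.futures import ThreadPoolExecutor


def mean_amplitude(window):
    """Stand-in feature: mean magnitude of the complex samples."""
    return sum(abs(z) for z in window) / len(window)


def peak_amplitude(window):
    """Stand-in feature: largest magnitude in the window."""
    return max(abs(z) for z in window)


FEATURES = {
    "mean_amplitude": mean_amplitude,
    "peak_amplitude": peak_amplitude,
}


def extract_features(window, features=FEATURES):
    """Submit every feature function to a worker pool and collect results."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fn, window)
                   for name, fn in features.items()}
        return {name: future.result() for name, future in futures.items()}


window = [3 + 4j, 0j, 6 + 8j]
feature_vector = extract_features(window)
```

Because the features are independent of each other, they can be computed concurrently on the same window. In CPython, threads mainly help when the work is done in native code, so a process pool may be preferable for pure Python feature functions.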

Fig. 1. Flow chart of the proposed algorithm

7 Learning Phase and Validation

During the first phase we have evaluated the success rate of the proposed classification algorithm on the artificial data set. The signals used for the evaluation were different from the ones used during the learning phase. The main parameters and carrier frequencies varied across the samples. The tests were performed on signals with different signal-to-noise ratios (SNR), ranging from noise-free signals down to an SNR of 15 dB. The noise was added to the originally generated noiseless samples. To simplify the evaluation, we have divided the noisy testing data into three groups by SNR, covering the range from 15 dB to 45 dB in 10 dB steps. The fourth tested group contains the samples without added noise.
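Adding noise to a noiseless sample so that a chosen SNR is reached can be sketched as follows. The paper generated its samples in GRC; this stand-alone version, including the function name `add_noise_at_snr` and the complex test tone, is an assumption used only to show the relation between SNR in dB and noise power.

```python
import cmath
import math
import random


def mean_power(samples):
    """Average power of a sequence of complex samples."""
    return sum(abs(z) ** 2 for z in samples) / len(samples)


def add_noise_at_snr(signal, snr_db, rng):
    """Add complex white Gaussian noise so that the SNR equals snr_db."""
    noise_power = mean_power(signal) / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power / 2.0)  # split evenly between I and Q
    return [z + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for z in signal]


rng = random.Random(42)
clean = [cmath.exp(2j * math.pi * 0.05 * n) for n in range(4096)]
noisy = add_noise_at_snr(clean, snr_db=15.0, rng=rng)

noise = [b - a for a, b in zip(clean, noisy)]
measured_snr_db = 10.0 * math.log10(mean_power(clean) / mean_power(noise))
```

The measured SNR of the generated window deviates slightly from the target because the noise power is estimated from a finite number of samples.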

The tests in the second phase were performed on real samples received by the SDR. The SNR values of the received signals were determined by the GRC module Frequency Xlating FIR Filter and low pass filter coefficients.

8 Experimental Results

Theoretically, the success rate of the classification performed by the FFN and RBN on the synthetic data set can be estimated by the cross-validation error estimate implemented in the Encog library. According to these estimates we may expect better performance from the RBN, since it had a success rate of more than 90 % during the learning phase. In the following sections we evaluate the success rate of both neural networks on the artificial data set and on the real world data set composed of signals received by the SDR receiver.

8.1 Synthetic Data Set

The results of the RBN classification of the synthetic noise-free signals are summarised in Table 1. At first glance, we can see that the neural network tends to classify most signals as the ASK modulation. If we look closer, we can see that this trend is characteristic for related modulations, mainly AM but also QAM and PSK, which is understandable. The detection of frequency modulations is, however, less accurate. FM and PM modulations are well detected; nevertheless, the FSK modulation was not detected at all. It seems that the proposed RBN is unable to classify this type of modulation.

Table 1. RBN classification accuracy (%) of pure signals without noise
Table 2. FFN classification accuracy (%) of pure signals without noise

Much better results were obtained from the FFN classification. Table 2 presents the results of the FFN on the same data set. We can see that, in contrast to the RBN, the FFN is able to correctly classify frequency and amplitude modulations. This model provides better classification results for FSK and PSK. The misclassification errors between PSK and QAM modulations are expected, for the same reason as already mentioned.

Table 3 shows the behaviour of the RBN classification performed on signals with added white noise. The success rate of the detection of amplitude modulations increased, but for remaining modulations the performance decreased. Small improvements are visible in the classification of FFN in Table 4. Added noise slightly improved the performance since it helped generalize inputs. The misclassification errors remained with the related modulations.

With increasing noise level we can see further degradation of the RBN classification, as illustrated in Table 5. The success rate for the majority of modulations fell to around 10 % and misclassifications appear even between non-related modulations. Such results indicate that the RBN will not be suitable for real world samples, as they would contain a significant amount of noise.

In contrast, the FFN retained good classification performance even in the presence of noise. Table 6 summarizes the classification accuracy for signals with SNR values of 15–25 dB. The model can still reliably detect the main classes of modulations and in most cases correctly classifies the exact modulation type. Additionally, misclassification happens only between related modulations.

Based on the results on the synthetic data set, we have decided to perform the experiments on the real world signals only with the FFN model.
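Per-modulation percentages of the kind discussed in this section can be tabulated as a row-normalized confusion matrix. The sketch below shows one way to compute such a table; the label lists are invented for illustration and do not reproduce any values from the paper's tables.

```python
def confusion_percent(true_labels, predicted_labels, classes):
    """Return table[true][predicted] as a percentage of each true class."""
    counts = {t: {p: 0 for p in classes} for t in classes}
    for t, p in zip(true_labels, predicted_labels):
        counts[t][p] += 1
    table = {}
    for t in classes:
        total = sum(counts[t].values())
        table[t] = {p: (100.0 * counts[t][p] / total if total else 0.0)
                    for p in classes}
    return table


classes = ["AM", "ASK", "FM"]
# Invented example: four AM windows, two ASK windows, two FM windows.
true_labels = ["AM", "AM", "AM", "AM", "ASK", "ASK", "FM", "FM"]
predicted = ["ASK", "ASK", "ASK", "AM", "ASK", "ASK", "FM", "AM"]

table = confusion_percent(true_labels, predicted, classes)
```

Each row sums to 100 %, so the diagonal entry is the success rate for that modulation and the off-diagonal entries show which modulations it is confused with.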

Table 3. RBN classification accuracy (%) of signal with noise (SNR 35–45 dB)
Table 4. FFN classification accuracy (%) of signal with noise (SNR 35–45 dB)
Table 5. RBN classification accuracy (%) of signal with noise (SNR 15–25 dB)
Table 6. FFN classification accuracy (%) of signal with noise (SNR 15–25 dB)

8.2 Real World Data Set

The real world signals were captured by the SDB522RT, a small USB DVB-T receiver compatible with the RTL-SDR library. Similar DVB-T dongles based on the Realtek RTL2832U can be used as a cheap SDR and are becoming extremely popular among electronics enthusiasts. The highest theoretically possible sample rate is 3.2 MS/s in the frequency range 52–2200 MHz, with a gap from 1100 MHz to 1250 MHz. As a source of FM modulated signals we have used local radio stations operating in the radio band from 88 to 108 MHz. The ASK and PSK modulations were found in signals transmitted in the range from 300 MHz to 320 MHz. QAM and QPSK modulations are used in digital transmissions of private security and public safety agencies around 391 MHz. The FSK modulation was not found in any received signals. The AM modulation was captured from local radio stations operating in the longwave and medium wave bands. Since the SDR is unable to receive signals below 52 MHz, an RF upconverter (Ham It Up v1.3 with a 125 MHz oscillator) was used to shift the signal into the receiving range of the SDR. Table 7 presents the results of the classification of these signals.

The AM radio transmissions were classified correctly in only 1 % of cases; they were instead classified as the related ASK and QAM modulations in 88 % and 11 % of cases, respectively. If we consider ASK as a special type of AM modulation with a finite number of possible amplitude values, this error is understandable. FM modulation was similarly misclassified as FSK (which is related to FM in a similar way, by using only a finite number of frequency changes). ASK was misclassified as AM mainly due to low SNR, as this classification is sensitive to noise. PSK modulation was classified correctly in 69 % of occurrences. QAM modulation was classified as ASK in 46 % of occurrences and as PSK in 54 %. This is natural, as QAM is a combination of ASK and PSK and its general detection is hard.

Generally we can see that the success rate is lower than the values obtained on the synthetic data set. We have analysed the reasons and found that some parameters had significantly different values from those obtained on the synthetic data sets. This was true especially of \(\gamma _{max}\), where the real world FM modulated signals generated values ten times lower than the ones used during the learning phase. Similar behaviour was observed with \(\sigma _{aa}\). We suspect that these two significant differences degraded the classification abilities of the neural network used.

Table 7. FFN classification accuracy (%) of samples captured by SDR receiver

9 Conclusion

In this paper we have presented an automatic modulation classification based on a neural network. The proposed approach is based on calculating specific statistical features of the input signal that are used as an input to the neural network. The selection of the proper statistical features was discussed in more detail. Two types of neural networks were evaluated, the Radial Basis Function Network and the Feed Forward Network. Both networks were trained using a synthetic data set and then evaluated on both synthetic and real world samples.

The experiments showed that the Feed Forward Network outperformed the Radial Basis Function Network, especially in cases where the signal contained a higher level of noise. For real world samples the classification success rate was lower, mainly due to the different characteristics of the signals. Several modulations captured by the SDR receiver had significantly different key characteristics from the signals we had artificially created for the learning phase.

What we see as the next logical step is to create a learning data set based on real world samples captured by the SDR. This task will be laborious but is necessary, as there is a lack of large, generally available data sets of this kind. By doing this we can provide a better set for training the neural network. In addition, we should reconsider the selection of statistical features, as some features did not provide enough information for unambiguous classification.