Abstract
Although a lot of research has been done on speaker identification in the presence of noise and channel variation, to the best of our knowledge, no work has been reported for aeronautical applications. In this paper, we aim to fulfill this goal by developing a Speaker Identification System (SIS) for future aeronautical communications systems. Furthermore, we present a novel feature extraction scheme based on multi-resolution analysis. The proposed features called SMFCC use Mel Frequency Cepstral Coefficients (MFCCs) features of stationary wavelet transform sub-bands. The extracted features are modeled using the i-vector approach, and support-vector machines are adopted as a back-end classifier. The performance of the proposed SIS is evaluated using two publicly available databases. Comparison of the proposed approach with the baseline MFCC feature extraction shows the feasibility and the robustness of the proposed method. Besides the noise reduction, the identification accuracy is improved by about 12% at higher signal-to-noise ratios and reaches 97.33% as compared to 88.33% using MFCC for ATCOSIM database.
Similar content being viewed by others
References
S. Abd El-Moneim, M.I. Dessouky, F.E. Abd El-Samie, M.A. Nassar, M. Abd El-Naby, Hybrid speech enhancement with empirical mode decomposition and spectral subtraction for efficient speaker identification. Int. J. Speech Technol. 18(4), 555–564 (2015)
N. Asbai, A. Amrouche, M. Debyeche, Performances evaluation of GMM-UBM and GMM-SVM for speaker recognition in realistic world, in Neural Inf. Process., ed. by B.-L. Lu, L. Zhang, J. Kwok (Springer, Berlin, 2011), pp. 284–291
J.G.P. Bernal, A.P. Guerrero, J.G. Close, A speaker verification system using SVM over a Spanish corpus, in 2009 Mexican International Conference on Computer Science, Sept 2009, pp. 381–386
J.P. Campbell, Speaker recognition: a tutorial. Proc. IEEE 85(9), 1437–1462 (1997)
J.P. Campbell, D.A. Reynolds, Corpora for the evaluation of speaker recognition systems, in 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), vol. 2, pp. 829–832, March 1999
M. Carey, E. Parris, H. Lloyd-Thomas, S. Bennett, Robust prosodic features for speaker identification, in Proceedings of ICSLP-96, November 1996
P.M. Chauhan, N.P. Desai, Mel Frequency Cepstral Coefficients (MFCCs) based speaker identification in noisy environment using wiener filter, in 2014 International Conference on Green Computing Communication and Electrical Engineering (ICGCCEE), Mar 2014, pp. 1–5
S.-H. Chen, H.-C. Wang, Improvement of speaker recognition by combining residual and prosodic features with acoustic features, in 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, May 2004, pp. 93–96
B.J. Chua, X.J. Li, H.D. Tran, Study of automatic biosounds detection and classification using SVM and GMM, in 2011 IEEE/NIH Life Science Systems and Applications Workshop (LiSSA), April 2011, pp. 155–158
C. Cortes, V. Vapnik, Support-vector networks. Mach. Learn. 20(3), 273–297 (1995)
S. Davis, P. Mermelstein, Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Trans. Acoust., Speech Signal Process. 28(4), 357–366 (1980)
N. Dehak, P.J. Kenny, R. Dehak, P. Dumouchel, P. Ouellet, Front-end factor analysis for speaker verification. IEEE Trans. Acoust., Speech Signal Process. 19(4), 788–798 (2011)
S. Dey, S. Barman, R.K. Bhukya, R.K. Das, B.C. Haris, S.R.M. Prasanna, R. Sinha, Speech biometric based attendance system, in 2014 Twentieth National Conference on Communications (NCC), Feb 2014, pp. 1–6
H. Ding, Z.-M. Tang, L.-H. Wei, Y.-P. Li, A study on speaker identification based on weighted LS-SVM. Autom. Control Comput. Sci. 43(6), 328–335 (2009)
M. Dutta, C. Patgiri, M. Sarma, K.K. Sarma, Closed-Set Text-Independent Speaker Identification System Using Multiple ANN Classifiers (Springer, Cham, 2015), pp. 377–385
M. Faundez-Zanuy, E. Monte-Moreno, State-of-the-art in speaker recognition. IEEE Aerosp. Electron. Syst. Mag. 20(5), 7–12 (2005)
J. Garofolo, L. Lamel, Fisher, J. Fiscus, D. Pallett, N. Dahlgren, V. Zue, Timit acoustic-phonetic continuous speech corpus. Linguistic Data Consortium (1993)
H. Gish, M. Schmidt, Text-independent speaker identification. IEEE Signal Process. Mag. 11(4), 18–32 (1994)
S.M. Govindan, P. Duraisamy, X. Yuan, Adaptive wavelet shrinkage for noise robust speaker recognition. Digit. Signal Process. 33, 180–190 (2014)
E. Haas, Aeronautical channel modeling. IEEE Trans. Veh. Technol. 51(2), 254–264 (2002)
M. Hagmüller, G. Kübin, Speech watermarking for air traffic control. Eurocontrol Experimental Centre, EEC Note 05/05 (2005)
M. Hébert, Text-Dependent Speaker Recognition (Springer, Berlin, 2008), pp. 743–762
HINDSIGHT, Nn\(^{\circ }\)2 communication. Technical report, EUROCONTROL (2006)
K. Hofbauer, H. Hering, G. Kübin, Speech watermarking for the VHF radio channel, in 4th EUROCONTROL Innovative Research Workshop, December 2005
K. Hofbauer, S. Petrik, H. Hering, The ATCOSIM corpus of non-prompted clean air traffic control speech, in Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC’08), Marrakech, Morocco, May 2008
Y. Hu, P. Loizou, Subjective comparison and evaluation of speech enhancement algorithms. Speech Commun. 49, 588–601 (2007)
A.A.A.A. Khalil, E.S.M. Saad, M.A. El-Nabi, F.E.A. El-Samie, Efficient speaker identification from speech transmitted over Bluetooth based system, in 2013 8th International Conference on Computer Engineering Systems (ICCES), Nov 2013, pp. 190–193
W.B. Kheder, D. Matrouf, P.-M. Bousquet, J.-F. Bonastre, M. Ajili, Fast i-vector denoising using map estimation and a noise distributions database for robust speaker recognition. Comput. Speech Lang. 45, 104–122 (2017)
S.G. Koolagudi, K. Sreenivasa Rao, R. Reddy, V.A. Kumar, S. Chakrabarti, Robust Speaker Recognition in Noisy Environments: Using Dynamics of Speaker-Specific Prosody (Springer, New York, 2012), pp. 183–204
K.A. Lee, A. Larcher, H. Thai, B. Ma, H. Li, Joint application of speech and speaker recognition for automation and security in smart home, in INTERSPEECH, 2011, pp. 3317–3318
K.A. Lee, B. Ma, H. Li, Speaker verification makes its debut in smartphone. IEEE Signal Processing Society Speech and Language Technical Committee Newsletter (2013)
Y. Lei, L. Burget, N. Scheffer, A noise robust i-vector extractor using vector Taylor series for speaker recognition, in2013 IEEE international conference on acoustics, speech and signal processing, May 2013, pp. 6788–6791
B.G. Nagaraja, H.S. Jayanna, Multilingual Speaker Identification with the Constraint of Limited Data Using Multitaper MFCC (Springer, Berlin, 2012), pp. 127–134
M. Neffe, V. Pham, H. Horst, G. Kubin, Speaker segmentation for air traffic control, in Lecture Notes in Artificial Intelligence (Springer, Berlin, 2007), pp. 177–191
NIST. The NIST year 2002 speaker recognition evaluation plan. National Institute of Standards and Technology of USA, February (2002). Available: http://www.nist.gov/speech/tests/spk/2002/doc/2002-spkrec-evalplan-v60.pdf
S. Qureshi, I. Masood, M. Hashmi, S. Hanninen, M. Sarwar, A. Jameel, Noise reduction of electrocardiographic signals using wavelet transforms. Elektron. Elektrotech. 20(3), 29–32 (2014)
D. Reynolds, W. Andrews, J. Campbell, J. Navratil, B. Peskin, A. Adami, Q. Jin, D. Klusacek, J. Abramson, R. Mihaescu, J. Godfrey, D. Jones, B. Xiang, The Supersid Project: exploiting high-level information for high-accuracy speaker recognition, in 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP ’03), vol. 4, Apr 2003, pp. IV–784
D.A. Reynolds, An overview of automatic speaker recognition technology, in 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 4, May 2002, pp. IV–4072–IV–4075
D.A. Reynolds, R.C. Rose, Robust text-independent speaker identification using Gaussian mixture speaker models. IEEE Trans. Speech Audio Process. 3(1), 72–83 (1995)
S. Sadjadi, J. Hansen, Mean Hilbert envelope coefficients (MHEC) for robust speaker and language identification. Speech Commun. 72, 138–148 (2015)
M. Sajatovic, et al., L-DACS1 system definition proposal: ddeliverable D2. Technical report version 1.0, Feb 2009
S. Sekkate, M. Khalil, A. Adib, An improved automatic aircraft identification system, in 2016 International Conference on Wireless Networks and Mobile Communications (WINCOM), Oct 2016, pp. 47–51
S. Sekkate, M. Khalil, A. Adib, Speaker identification: a way to reduce call-sign confusion events, in 2017 International Conference on Advanced Technologies for Signal & Image Processing, May 2017
U. Seljuq, F. Himayun, H. Rasheed, Selection of an optimal mother wavelet basis function for ECG signal denoising, in 17th IEEE International Multi Topic Conference 2014, Dec 2014, pp. 26–30
S. Selva Nidhyananthan, R. Shantha Selva Kumari, T. Senthur Selvi, Noise robust speaker identification using RASTA–MFCC feature with quadrilateral filter bank structure. Wirel. Pers. Commun. 91(3), 1321–1333 (2016)
A. Shafik, S.M. Elhalafawy, S.E.M. Diab, B.M. Sallam, F.E.A. El-Samie, A wavelet based approach for speaker identification from degraded speech. IJCNIS 1(3), 52–58 (2009)
Y. Shao, S. Srinivasan, D. Wang, Incorporating auditory feature uncertainties in robust speaker identification, in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2007, Honolulu, HI, USA, 15–20 Apr 2007, pp. 277–280
D. Snyder, D. Garcia-Romero, G. Sell, D. Povey, S. Khudanpur, X-vectors: robust DNN embeddings for speaker recognition, in 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2018), pp. 5329–5333
M. Stark, T.V. Pham, F. Pernkopf, G. Kubin, H. Hering, Speaker verification for air traffic control, in EUROCONTROL Innovative Research Workshop and Exhibition, Dec 2006
V.N. Vapnik, Statistical Learning Theory. Adaptive and Learning Systems for Signal Processing, Communications, and Control (Wiley, New York, 1998)
Voxforge database. Technical report
A. Vuppala, K.S. Rao, Speaker identification under background noise using features extracted from steady vowel regions. Int. J. Adapt Control Signal Process 27(9), 781–792 (2013)
L. Xu, R.K. Das, E. Yilmaz, J. Yang, H. Li, Generative x-vectors for text-independent speaker verification (2018). CoRR, arXiv:1809.06798
X. Zhao, Y. Shao, D. Wang, Casa-based robust speaker identification. IEEE Trans. Audio Speech Lang. Process. 20(5), 1608–1616 (2012)
X. Zhao, Y. Wang, D. Wang, Robust speaker identification in noisy and reverberant conditions. IEEE Trans. Audio Speech Lang. Process. 22(4), 836–845 (2014)
T.F. Zheng, Q. Jin, L. Li, J. Wang, F. Bie, An overview of robustness related issues in speaker recognition, inSignal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific, Dec 2014, pp. 1–10
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Sekkate, S., Khalil, M. & Adib, A. Speaker Identification for OFDM-Based Aeronautical Communication System. Circuits Syst Signal Process 38, 3743–3761 (2019). https://doi.org/10.1007/s00034-019-01026-z
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00034-019-01026-z