Speaker Identification for OFDM-Based Aeronautical Communication System

Circuits, Systems, and Signal Processing

Abstract

Although speaker identification in the presence of noise and channel variation has been studied extensively, to the best of our knowledge no work has been reported for aeronautical applications. In this paper, we aim to fill this gap by developing a Speaker Identification System (SIS) for future aeronautical communication systems. Furthermore, we present a novel feature extraction scheme based on multi-resolution analysis. The proposed features, called SMFCC, are Mel Frequency Cepstral Coefficients (MFCCs) computed on the sub-bands of a stationary wavelet transform. The extracted features are modeled using the i-vector approach, and support-vector machines are adopted as the back-end classifier. The performance of the proposed SIS is evaluated on two publicly available databases. Comparison with the baseline MFCC feature extraction shows the feasibility and robustness of the proposed method. Besides reducing noise, the proposed method improves identification accuracy by about 12% at higher signal-to-noise ratios, reaching 97.33% compared with 88.33% for MFCC on the ATCOSIM database.
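The idea of computing MFCCs on stationary wavelet transform sub-bands can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the Haar wavelet, the two decomposition levels, the frame and filterbank sizes, the concatenation of per-band coefficients, and the function names `haar_swt`, `mel_filterbank`, `mfcc`, and `smfcc_sketch` are all assumptions made for illustration. The i-vector modeling and SVM back end are not covered.

```python
import numpy as np

def haar_swt(x, levels=2):
    """Undecimated (stationary) Haar transform with circular boundaries.
    Returns [detail_1, ..., detail_L, approx_L], each as long as x."""
    approx = np.asarray(x, dtype=float)
    bands = []
    for j in range(levels):
        # Filter taps are spaced 2**j samples apart instead of downsampling.
        shifted = np.roll(approx, -(2 ** j))
        bands.append((approx - shifted) / np.sqrt(2))
        approx = (approx + shifted) / np.sqrt(2)
    return bands + [approx]

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters equally spaced on the mel scale."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        lo, mid, hi = bins[i], bins[i + 1], bins[i + 2]
        for k in range(lo, mid):
            fb[i, k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):
            fb[i, k] = (hi - k) / (hi - mid)
    return fb

def mfcc(x, sr, n_fft=256, hop=128, n_filters=20, n_ceps=12):
    """Frame-wise MFCCs (coefficient 0 included); returns (n_frames, n_ceps)."""
    x = np.asarray(x, dtype=float)
    n_frames = 1 + (len(x) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = x[idx] * np.hamming(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2 / n_fft
    log_mel = np.log(power @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    # DCT-II basis applied to the log mel energies.
    n = np.arange(n_filters)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), n + 0.5) / n_filters)
    return log_mel @ dct.T

def smfcc_sketch(x, sr, levels=2, **mfcc_kwargs):
    """Assumed combination rule: concatenate the MFCCs of every sub-band."""
    return np.hstack([mfcc(b, sr, **mfcc_kwargs) for b in haar_swt(x, levels)])

# Synthetic one-second test signal (hypothetical data, not from either database).
rng = np.random.default_rng(0)
sr = 8000
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 440 * t) + 0.1 * rng.standard_normal(sr)
features = smfcc_sketch(signal, sr)
print(features.shape)  # (61, 36)
```

With two levels the signal yields three sub-bands, so each of the 61 frames carries 3 × 12 = 36 coefficients. In practice a library such as PyWavelets would offer other wavelet families; Haar is used here only to keep the sketch dependency-free.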

Author information

Corresponding author

Correspondence to Sara Sekkate.

About this article

Cite this article

Sekkate, S., Khalil, M. & Adib, A. Speaker Identification for OFDM-Based Aeronautical Communication System. Circuits Syst Signal Process 38, 3743–3761 (2019). https://doi.org/10.1007/s00034-019-01026-z

  • DOI: https://doi.org/10.1007/s00034-019-01026-z
