Abstract
The diffusion of Device-to-Device (D2D) communications makes it possible to exploit the contributions of multiple Mobile Devices (MDs) to accomplish collaborative tasks. In this paper, we present a speaker recognition algorithm for MDs based on a multiple-observation approach. We propose various fusion and clustering algorithms aimed at efficiently exploiting the data coming from the MDs. Numerical results show that, in many cases, the multiple-observation approach significantly improves the accuracy of the considered speaker recognition algorithm.
Acknowledgments
This work has been partially funded by TIM S.p.A., Services Innovation Department, Joint Open Lab S-Cube, Italy, Milan.
Cite this article
Bisio, I., Lavagetto, F., Garibotto, C. et al. Speaker Recognition Exploiting D2D Communications Paradigm: Performance Evaluation of Multiple Observations Approaches. Mobile Netw Appl 22, 1045–1057 (2017). https://doi.org/10.1007/s11036-017-0876-z
DOI: https://doi.org/10.1007/s11036-017-0876-z