Method of Speech Signal Structuring and Transforming for Biometric Personality Identification

  • Conference paper
Data Stream Mining & Processing (DSMP 2020)

Abstract

This paper proposes a method for structuring and transforming a speech signal. The method comprises a segmentation method, a method for determining the fundamental tone of a vocal segment and, on that basis, the boundaries of its quasiperiodic oscillations, and a geometric transformation of those oscillations. The proposed segmentation uses statistical estimation of short-term energies, which permits an adaptive threshold and thus increases the accuracy of vocal segment detection. The proposed determination of the fundamental tone uses bandpass filtering and statistical estimation of local extrema, which reduces computational complexity and noise dependency and permits an adaptive threshold, thus increasing the accuracy of determining the fundamental tone and the boundaries of the quasiperiodic oscillations. The proposed geometric transformation maps the quasiperiodic oscillations of the vocal segment onto a single amplitude-time window, so that patterns of the vocal segment can be formed with its structure taken into account. A method for determining the structure of a model for transforming speech signal patterns is also proposed; it is based on a statistical evaluation of transformation quality and provides a high degree of compression and identification of the speech signal.
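
Two of the steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the threshold rule (mean plus `k` standard deviations of the frame energies), the non-overlapping framing, the linear interpolation, and all function and parameter names are assumptions made for illustration only.

```python
from statistics import mean, pstdev


def short_term_energies(signal, frame_len):
    """Energy of each non-overlapping frame of frame_len samples."""
    return [
        sum(x * x for x in signal[i:i + frame_len])
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]


def vocal_frames(signal, frame_len, k=0.0):
    """Flag frames whose energy exceeds an adaptive threshold.

    Assumption: threshold = mean + k * std of the frame energies. The
    statistical estimate actually used in the paper is not stated in
    the abstract.
    """
    energies = short_term_energies(signal, frame_len)
    threshold = mean(energies) + k * pstdev(energies)
    return [e > threshold for e in energies]


def to_unit_window(period, n_points):
    """Map one oscillation onto a fixed amplitude-time window.

    The output always has n_points samples (n_points >= 2) and its
    amplitude is rescaled to [0, 1]. Linear interpolation is an
    assumption, chosen for simplicity.
    """
    lo, hi = min(period), max(period)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat period
    out = []
    for j in range(n_points):
        # Fractional position of output sample j in the original period.
        pos = j * (len(period) - 1) / (n_points - 1)
        i = int(pos)
        frac = pos - i
        nxt = period[min(i + 1, len(period) - 1)]
        value = period[i] + frac * (nxt - period[i])
        out.append((value - lo) / span)
    return out
```

Because the threshold is derived from the energies of the recording itself, it adapts to the overall signal level rather than relying on a fixed constant. Mapping every oscillation to the same number of samples and the same amplitude range makes oscillations of different pitch period and loudness directly comparable as patterns.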

Author information

Corresponding author

Correspondence to Eugene Fedorov.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Fedorov, E., Utkina, T., Nechyporenko, O., Korpan, Y. (2020). Method of Speech Signal Structuring and Transforming for Biometric Personality Identification. In: Babichev, S., Peleshko, D., Vynokurova, O. (eds) Data Stream Mining & Processing. DSMP 2020. Communications in Computer and Information Science, vol 1158. Springer, Cham. https://doi.org/10.1007/978-3-030-61656-4_20

  • DOI: https://doi.org/10.1007/978-3-030-61656-4_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-61655-7

  • Online ISBN: 978-3-030-61656-4

  • eBook Packages: Computer Science, Computer Science (R0)
