A Cancelable Biometric Approach for Efficient Identification of Speakers from Encrypted Speech

Wireless Personal Communications

Abstract

Biometric systems that extract features from speech signals are widely deployed in security applications. This work presents a secure speaker identification system, termed a cancelable speaker identification system, that is also suitable for remote access applications. Two formulations of efficient single-key cryptosystems based on a chaotic map and the empirical mode decomposition (EMD) are presented for this purpose. The encryption provides a high degree of security for the speaker identification system. In the proposed cancelable scenario, cepstral features are extracted from the encrypted speech signals, and matching is performed with artificial neural networks (ANNs). In addition, several enhancement methods are applied at the receiver side to reduce the effect of noise before speaker identification. Because features are extracted only from encrypted speech, the speakers' original features are protected from attackers. Simulation results show that the proposed cancelable biometric system outperforms traditional biometric systems: using the two proposed cryptosystems, it achieves a recognition rate of up to 100% at signal-to-noise ratios (SNRs) of −20 dB and above while maintaining a high level of security.
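
To make the idea of extracting cepstral features from encrypted speech concrete, the following is a minimal sketch and not the authors' cryptosystem. It assumes a logistic map as the chaotic map, a permutation-only scrambler keyed by a single initial value, a plain real cepstrum as the feature, and a synthetic tone in place of real speech. None of these choices are stated in the abstract. The EMD stage, the enhancement methods, and the ANN matcher are omitted, and all function names and parameter values are hypothetical.

```python
import numpy as np

def logistic_permutation(n, key, r=3.99, burn_in=100):
    """Build a permutation of range(n) from a logistic-map orbit.

    `key` is the single secret key (initial state in (0, 1)); `r` and
    `burn_in` are illustrative assumptions, not values from the paper.
    """
    x = key
    for _ in range(burn_in):          # discard the transient part of the orbit
        x = r * x * (1.0 - x)
    orbit = np.empty(n)
    for i in range(n):
        x = r * x * (1.0 - x)
        orbit[i] = x
    return np.argsort(orbit, kind="stable")   # ranking the orbit gives the permutation

def encrypt(signal, key):
    """Scramble the samples with the key-dependent permutation."""
    return signal[logistic_permutation(len(signal), key)]

def decrypt(scrambled, key):
    """Invert the scrambling; only possible with the same key."""
    perm = logistic_permutation(len(scrambled), key)
    restored = np.empty_like(scrambled)
    restored[perm] = scrambled
    return restored

def real_cepstrum(frame, n_coeffs=13):
    """First `n_coeffs` real-cepstrum coefficients of one windowed frame."""
    spectrum = np.abs(np.fft.rfft(frame * np.hamming(len(frame))))
    return np.fft.irfft(np.log(spectrum + 1e-10))[:n_coeffs]

# Synthetic stand-in for one 512-sample speech frame (assumption).
rng = np.random.default_rng(0)
t = np.arange(512) / 8000.0
speech = np.sin(2 * np.pi * 220.0 * t) + 0.05 * rng.standard_normal(512)

# The same speaker enrolled under two different keys yields two different templates.
template_a = real_cepstrum(encrypt(speech, key=0.3141))
template_b = real_cepstrum(encrypt(speech, key=0.2718))
```

In this sketch a template is "cancelable" in the sense that a compromised `template_a` can be revoked by re-enrolling the same speech under a new key, which produces an unrelated `template_b`. The original features never need to be stored.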

Author information

Corresponding author

Correspondence to Heba A. El-Khobby.

About this article

Cite this article

Abd El-Wahab, B.S., El-Khobby, H.A., Elnaby, M.M.A. et al. A Cancelable Biometric Approach for Efficient Identification of Speakers from Encrypted Speech. Wireless Pers Commun 124, 1899–1921 (2022). https://doi.org/10.1007/s11277-021-08384-5

  • DOI: https://doi.org/10.1007/s11277-021-08384-5
