A high-performance speech perceptual hashing authentication algorithm based on discrete wavelet transform and measurement matrix

Multimedia Tools and Applications

Abstract

To address the shortcomings of existing speech authentication algorithms, namely poor robustness and discrimination, security vulnerabilities, low efficiency, and weak tamper detection and localization, this paper proposes a high-performance speech perceptual hashing authentication algorithm based on the Discrete Wavelet Transform (DWT) and a measurement matrix. First, the speech signal is preprocessed and decomposed by DWT, and the low-frequency wavelet coefficients are taken as the perceptual feature values. Then, a measurement matrix controlled by a chaotic map is applied to reduce the dimension of the feature values. Finally, the reduced feature values are passed through the hash construction step to generate the perceptual hash sequence. The measurement matrix serves as the secret key, which enhances the security of the proposed algorithm. Experimental results demonstrate that the proposed algorithm performs well in perceptual robustness, discrimination, time consumption, and security, and that it detects and localizes tampering with high accuracy.
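The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the wavelet, the chaotic map, the frame size, or the hash construction, so the Haar wavelet, the logistic map, the parameters frame_len, m, levels, mu and burn_in, and the median threshold used below are all assumptions chosen for illustration. The function names are hypothetical as well.

```python
import numpy as np

def haar_lowpass(x, levels=3):
    """Approximation (low-frequency) coefficients of a multi-level Haar DWT (assumed wavelet)."""
    a = np.asarray(x, dtype=float)
    for _ in range(levels):
        if a.size % 2:                      # pad odd lengths by repeating the last sample
            a = np.append(a, a[-1])
        a = (a[0::2] + a[1::2]) / np.sqrt(2.0)
    return a

def chaotic_matrix(rows, cols, x0, mu=3.99, burn_in=1000):
    """Measurement matrix filled from a logistic-map orbit (assumed map); x0 acts as the key."""
    x = x0
    for _ in range(burn_in):                # discard the transient
        x = mu * x * (1.0 - x)
    vals = np.empty(rows * cols)
    for i in range(vals.size):
        x = mu * x * (1.0 - x)
        vals[i] = x
    return (1.0 - 2.0 * vals).reshape(rows, cols)   # rescale from (0, 1) to (-1, 1)

def perceptual_hash(signal, key, frame_len=64, m=8, levels=3):
    """Binary hash: DWT features, keyed projection, then a median threshold (assumed)."""
    coeffs = haar_lowpass(signal, levels)
    n_frames = coeffs.size // frame_len
    frames = coeffs[: n_frames * frame_len].reshape(n_frames, frame_len)
    phi = chaotic_matrix(m, frame_len, key)
    reduced = frames @ phi.T                # each frame: frame_len values down to m values
    return (reduced > np.median(reduced)).astype(np.uint8).ravel()

def bit_error_rate(h1, h2):
    """Normalized Hamming distance between two equal-length hashes."""
    return float(np.mean(h1 != h2))

rng = np.random.default_rng(0)
speech = rng.standard_normal(16000)         # synthetic stand-in for one second of 16 kHz speech
noisy = speech + 0.01 * rng.standard_normal(16000)
h_ref = perceptual_hash(speech, key=0.3)
print(bit_error_rate(h_ref, perceptual_hash(noisy, key=0.3)))    # same content, same key
print(bit_error_rate(h_ref, perceptual_hash(speech, key=0.31)))  # same content, wrong key
```

In this sketch, a lightly perturbed copy of the signal should give a bit error rate close to 0, while a different signal or a different key should give a value near 0.5. That mirrors the robustness, discrimination, and key-dependence properties the abstract claims, without reproducing the paper's reported results.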



Acknowledgments

This work is supported by the National Natural Science Foundation of China (61363078), the Natural Science Foundation of Gansu Province of China (1606RJYA274), and the Open Research Fund of the National Mobile Communications Research Laboratory, Southeast University (2014D13). The authors would like to thank the anonymous reviewers for their helpful comments and suggestions.

Corresponding author

Correspondence to Qiu-yu Zhang.

About this article

Cite this article

Zhang, Qy., Qiao, Sb., Huang, Yb. et al. A high-performance speech perceptual hashing authentication algorithm based on discrete wavelet transform and measurement matrix. Multimed Tools Appl 77, 21653–21669 (2018). https://doi.org/10.1007/s11042-018-5613-5

  • DOI: https://doi.org/10.1007/s11042-018-5613-5
