Abstract
Existing speech authentication algorithms suffer from poor robustness and discrimination, security vulnerabilities, low efficiency, and weak tamper detection and localization. To address these problems, this paper proposes a high-performance speech perceptual hashing authentication algorithm based on the Discrete Wavelet Transform (DWT) and a measurement matrix. First, the speech signal is preprocessed and decomposed by DWT, and the low-frequency wavelet coefficients are taken as the perceptual feature. Then, a measurement matrix controlled by a chaotic map is applied to reduce the dimension of the feature. Finally, the reduced feature is used to generate the perceptual hash sequence through the hash construction step. The measurement matrix is designed as the secret key, which enhances the security of the proposed algorithm. Experimental results demonstrate that the proposed algorithm performs well in perceptual robustness, discrimination, time consumption, and security, and that it detects and localizes tampering with high accuracy.
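
The three stages named in the abstract (DWT feature extraction, key-controlled dimension reduction, hash construction) can be illustrated with a minimal sketch. The abstract does not specify the wavelet, the chaotic map, the framing, or the binarization rule, so everything concrete here is an assumption chosen for illustration: a Haar wavelet, a logistic map with parameter `mu`, fixed-count framing, and a median threshold on frame energy. The function names (`haar_approx`, `logistic_matrix`, `perceptual_hash`, `bit_error_rate`) are hypothetical and are not taken from the paper.

```python
import numpy as np


def haar_approx(signal, levels=3):
    """Approximation (low-frequency) coefficients of a multi-level Haar DWT.

    Assumption: the paper's wavelet is not given in the abstract; Haar is
    used here only because it is the simplest to write out.
    """
    a = np.asarray(signal, dtype=float)
    for _ in range(levels):
        if a.size % 2:
            a = np.append(a, a[-1])  # pad odd lengths by repeating the last sample
        a = (a[0::2] + a[1::2]) / np.sqrt(2.0)
    return a


def logistic_matrix(rows, cols, x0, mu=3.99, burn_in=1000):
    """Measurement matrix filled from a logistic-map sequence seeded by x0.

    Assumption: the logistic map stands in for the unspecified chaotic map;
    x0 in (0, 1) plays the role of the secret key.
    """
    x = x0
    for _ in range(burn_in):  # discard the transient part of the orbit
        x = mu * x * (1.0 - x)
    vals = np.empty(rows * cols)
    for i in range(rows * cols):
        x = mu * x * (1.0 - x)
        vals[i] = x
    return (1.0 - 2.0 * vals).reshape(rows, cols)  # map (0, 1) to (-1, 1)


def perceptual_hash(signal, key, n_frames=64, m=4, levels=3):
    """Binary hash: one bit per frame of low-frequency coefficients."""
    coeffs = haar_approx(signal, levels)
    frame_len = coeffs.size // n_frames
    if frame_len == 0:
        raise ValueError("signal too short for the requested number of frames")
    frames = coeffs[: frame_len * n_frames].reshape(n_frames, frame_len)
    phi = logistic_matrix(m, frame_len, key)  # key-controlled projection
    reduced = frames @ phi.T                  # shape (n_frames, m)
    energy = np.sum(reduced ** 2, axis=1)
    # Assumed binarization rule: compare each frame energy with the median.
    return (energy > np.median(energy)).astype(np.uint8)


def bit_error_rate(h1, h2):
    """Fraction of differing bits between two equal-length hashes."""
    return float(np.mean(h1 != h2))
```

In a scheme of this kind, authentication would compare the hash of a received signal with the stored hash and accept when the bit error rate falls below a threshold, while frames whose bits differ indicate where tampering may have occurred. The threshold is not stated in the abstract and would have to be chosen experimentally.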

Acknowledgments
This work is supported by the National Natural Science Foundation of China (61363078), the Natural Science Foundation of Gansu Province of China (1606RJYA274), and the Open Research Fund of National Mobile Communications Research Laboratory, Southeast University (2014D13). The authors would like to thank the anonymous reviewers for their helpful comments and suggestions.
Cite this article
Zhang, Qy., Qiao, Sb., Huang, Yb. et al. A high-performance speech perceptual hashing authentication algorithm based on discrete wavelet transform and measurement matrix. Multimed Tools Appl 77, 21653–21669 (2018). https://doi.org/10.1007/s11042-018-5613-5
DOI: https://doi.org/10.1007/s11042-018-5613-5