Abstract
With the rapid development of the mobile Internet, speech data has become high-dimensional and its feature space complex. Existing speech retrieval algorithms cannot deliver both efficient retrieval and privacy protection for speech data in massive applications. To address the loss of retrieval efficiency and accuracy caused by the high dimension and complex space of speech feature data, the need for verifiable content retrieval after speech has been attacked, and the security of speech storage and transmission, a security framework based on KNN Secure Hash (KNNSH) is proposed for verifiable speech retrieval. In this algorithm, the spectral centroid of speech is used as the only input feature, and KNN classification is then used to train on the speech vectors to obtain each speech centroid. Each speech centroid is assigned a specific key for the hyperchaotic Lorenz compressed sensing encryption algorithm (HL-CS), and the security framework is built on the revocable biometric template generated by combining the classification with that specific key. A binary hash vector is generated and then encrypted with HL-CS; the same encryption algorithm is used to encrypt the original speech. Experimental results show that, after classification, only one item needs to be matched in the intra-class matching process, which improves retrieval efficiency and accuracy and enables content verification of retrieved speech after content-preserving operations. Speech encryption effectively prevents the disclosure of plaintext and secures the speech storage and transmission process. The scheme has a large key space, sufficient to resist exhaustive attacks.
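The feature and hashing steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the frame length, the naive DFT, the median-threshold binarization, and all function names (`spectral_centroid`, `centroid_features`, `binary_hash`, `bit_error_rate`, `tone`) are assumptions made here for illustration, and the KNN classification and HL-CS encryption stages are left out entirely.

```python
import cmath
import math
import statistics


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency (Hz) of one frame, using a naive DFT."""
    n = len(frame)
    weighted, total = 0.0, 0.0
    for k in range(n // 2 + 1):  # non-negative frequency bins only
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))
        magnitude = abs(coeff)
        weighted += (k * sample_rate / n) * magnitude
        total += magnitude
    return weighted / total if total > 0 else 0.0


def centroid_features(signal, sample_rate, frame_len=64):
    """Spectral centroid of each non-overlapping frame of the signal."""
    return [spectral_centroid(signal[i:i + frame_len], sample_rate)
            for i in range(0, len(signal) - frame_len + 1, frame_len)]


def binary_hash(features):
    """Assumed binarization: 1 where a feature exceeds the median, else 0."""
    threshold = statistics.median(features)
    return [1 if f > threshold else 0 for f in features]


def bit_error_rate(hash_a, hash_b):
    """Normalized Hamming distance between two equal-length binary hashes."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b)) / len(hash_a)


def tone(freq_hz, sample_rate, n_samples):
    """Hypothetical test signal: a pure sine tone."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate)
            for t in range(n_samples)]


if __name__ == "__main__":
    rate = 8000
    # Four frames whose dominant frequency changes from frame to frame.
    signal = []
    for freq in (500, 1500, 1000, 2000):
        signal += tone(freq, rate, 64)
    stored = binary_hash(centroid_features(signal, rate))
    query = binary_hash(centroid_features(signal, rate))
    print(stored, bit_error_rate(stored, query))
```

In a retrieval setting, a query would be accepted as matching a stored item when the bit error rate between their hashes falls below a chosen threshold; the threshold value is a design parameter that this sketch does not fix.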
Acknowledgements
This work is supported by the National Natural Science Foundation of China (No. 61862041), the Science and Technology Program of Gansu Province of China (No. 21JR7RA120), and the Young Doctor Fund Project of the Gansu Provincial Department of Education (2022QB-033).
Ethics declarations
Conflict of Interests
The authors declare that they have no conflict of interests.
About this article
Cite this article
An, L., Huang, Yb. & Zhang, Qy. Verifiable speech retrieval algorithm based on KNN secure hashing. Multimed Tools Appl 82, 7803–7824 (2023). https://doi.org/10.1007/s11042-022-13387-w
DOI: https://doi.org/10.1007/s11042-022-13387-w