Abstract
Underwater acoustic voiceprint recognition, a key technology in biometric identification, has broad application prospects in areas such as marine resource development, underwater communication, and underwater safety monitoring. Conventional acoustic voiceprint recognition methods exhibit limitations in underwater environments, motivating a lightweight neural network approach tailored to underwater acoustic voiceprint recognition tasks. This paper introduces a novel lightweight voiceprint recognition model, the Echo Lite Voice Fusion Network (ELVFN), which incorporates depthwise separable convolution and a self-attention mechanism and significantly improves recognition performance by optimizing acoustic feature extraction and the hierarchical feature fusion strategy, while substantially reducing the model's computational complexity and parameter count. Comparative analyses with existing acoustic voiceprint recognition models corroborate the superior performance of our model across multiple underwater acoustic datasets. Experimental results demonstrate that ELVFN excels on various evaluation metrics, notably processing efficiency and recognition accuracy. Finally, we discuss the application potential and future development directions of the model, providing an efficient solution for underwater acoustic voiceprint recognition in resource-constrained environments.
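As a rough illustration of the kind of block the abstract describes, the following Python (PyTorch) sketch pairs a depthwise separable convolution with multi-head self-attention over acoustic feature frames. It is an assumption-laden approximation: the class names (DepthwiseSeparableConv, ELVFNBlock), layer sizes, and the residual fusion of the local and global branches are illustrative choices, not the published ELVFN architecture.

# Minimal sketch, assuming 1-D acoustic feature maps of shape (batch, channels, frames).
# Not the authors' implementation; it only illustrates combining depthwise separable
# convolution (local, low-cost feature extraction) with self-attention (global context).
import torch
import torch.nn as nn


class DepthwiseSeparableConv(nn.Module):
    """Depthwise convolution followed by a pointwise (1x1) channel projection."""

    def __init__(self, in_ch: int, out_ch: int, kernel_size: int = 3):
        super().__init__()
        self.depthwise = nn.Conv1d(in_ch, in_ch, kernel_size,
                                   padding=kernel_size // 2, groups=in_ch)
        self.pointwise = nn.Conv1d(in_ch, out_ch, kernel_size=1)

    def forward(self, x):  # x: (batch, channels, frames)
        return self.pointwise(self.depthwise(x))


class ELVFNBlock(nn.Module):
    """Hypothetical lightweight block: separable-conv features refined by self-attention."""

    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.conv = DepthwiseSeparableConv(channels, channels)
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, x):  # x: (batch, channels, frames)
        h = self.conv(x)                 # local feature extraction
        h_t = h.transpose(1, 2)          # (batch, frames, channels) for attention
        a, _ = self.attn(h_t, h_t, h_t)  # global context via self-attention
        out = self.norm(h_t + a)         # residual fusion of local and global cues
        return out.transpose(1, 2)


if __name__ == "__main__":
    feats = torch.randn(8, 64, 200)           # e.g. 64-dim acoustic features, 200 frames
    print(ELVFNBlock(64)(feats).shape)        # torch.Size([8, 64, 200])

The separable convolution keeps the parameter count low (one spatial filter per channel plus a 1x1 projection), while the attention layer supplies the long-range temporal context that plain convolutions miss; a full model would stack several such blocks before a pooling and classification head.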
Data Availability
The data that support the findings of this study are available from the corresponding author, Guan, upon reasonable request.
Acknowledgements
This work was supported by the National Natural Science Foundation of China (No. 62472220).
Author information
Authors and Affiliations
Contributions
Jiaqi Wu: Methodology, Experiment. Donghai Guan: Methodology, Writing. Weiwei Yuan: Experiment, Writing.
Corresponding author
Correspondence to Donghai Guan.
Ethics declarations
Competing Interests
No potential conflict of interest was reported by the authors.
Ethical and informed consent for data used
In this study, we used publicly available benchmark datasets.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wu, J., Guan, D. & Yuan, W. Echo lite voice fusion network: advancing underwater acoustic voiceprint recognition with lightweight neural architectures. Appl Intell 55, 112 (2025). https://doi.org/10.1007/s10489-024-06035-3