Echo lite voice fusion network: advancing underwater acoustic voiceprint recognition with lightweight neural architectures

Published in Applied Intelligence

Abstract

Underwater acoustic voiceprint recognition, a key biometric identification technology, offers broad application prospects in areas such as marine resource development, underwater communication, and underwater safety monitoring. Because conventional voiceprint recognition methods perform poorly in underwater environments, a lightweight neural network approach is needed to address underwater acoustic voiceprint recognition tasks effectively. This paper introduces a novel lightweight voiceprint recognition model, the Echo Lite Voice Fusion Network (ELVFN), which incorporates depthwise separable convolution and a self-attention mechanism, and significantly improves recognition performance by optimizing acoustic feature extraction and the hierarchical feature fusion strategy, while substantially reducing the model's computational complexity and parameter count. Comparative analyses with existing acoustic voiceprint recognition models confirm the superior performance of our model across multiple underwater acoustic datasets. Experimental results demonstrate that ELVFN leads on various evaluation metrics, notably processing efficiency and recognition accuracy. Finally, we discuss the model's application potential and future development directions, providing an efficient solution for underwater acoustic voiceprint recognition in resource-constrained environments.
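The parameter savings that motivate the use of depthwise separable convolution in lightweight models such as ELVFN can be illustrated with a quick parameter count. This is a generic sketch with hypothetical layer sizes, not figures taken from the paper:

```python
# Parameter counts for a single convolutional layer, ignoring biases.
# c_in/c_out are channel counts, k is the (square) kernel size.

def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    # A standard convolution learns one k x k x c_in filter per output channel.
    return k * k * c_in * c_out

def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    depthwise = k * k * c_in   # one k x k filter applied per input channel
    pointwise = c_in * c_out   # 1x1 convolution that mixes channels
    return depthwise + pointwise

# Hypothetical layer shape: 64 input channels, 128 output channels, 3x3 kernel.
std = standard_conv_params(64, 128, 3)          # 73728 parameters
dsc = depthwise_separable_params(64, 128, 3)    # 8768 parameters
print(std, dsc, round(std / dsc, 1))            # roughly an 8.4x reduction
```

The same factoring applies to compute: splitting spatial filtering (depthwise) from channel mixing (pointwise) shrinks both parameters and multiply-accumulate operations by roughly a factor of k², which is what makes such layers attractive for resource-constrained underwater deployments.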


Data Availability

The data that support the findings of this study are available from the corresponding author, Guan, upon reasonable request.


Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 62472220).

Author information

Authors and Affiliations

Authors

Contributions

Jiaqi Wu: Methodology, Experiment. Donghai Guan: Methodology, Writing. Weiwei Yuan: Experiment, Writing.

Corresponding author

Correspondence to Donghai Guan.

Ethics declarations

Competing Interests

No potential conflict of interest was reported by the authors.

Ethical and informed consent for data used

In this study, we used publicly available benchmark datasets.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Wu, J., Guan, D. & Yuan, W. Echo lite voice fusion network: advancing underwater acoustic voiceprint recognition with lightweight neural architectures. Appl Intell 55, 112 (2025). https://doi.org/10.1007/s10489-024-06035-3
