ABSTRACT
Voice is a convenient and popular way to interact with our digital world. Beyond translating speech to text, voice can also be used to identify speakers from their voice profile. To date, speaker identification has largely been limited to high-performance computing platforms because of the complexity of the underlying algorithms. In this work, we demonstrate that model complexity can be reduced by the required factor of roughly 10, making speaker identification feasible on embedded devices with limited resources. We further describe and discuss novel use cases, such as voice-based presence detection and authentication, that become feasible on this class of devices.