Research on Speaker Recognition Technology Based on Feature Model

Authors:
Haoyu Jiang, Northwest Minzu University, China
Hongzhi Yu, Northwest Minzu University, China

IPEC '22: Proceedings of the 3rd Asia-Pacific Conference on Image Processing, Electronics and Computers, April 2022, Pages 327–330
https://doi.org/10.1145/3544109.3544169

Published: 18 July 2022

ABSTRACT

Speaker recognition, also known as voiceprint recognition, identifies "who is speaking" from the voice. It is a biometric technology that determines a speaker's identity from the speaker-specific characteristics carried in the speech signal. Based on a survey of the speaker recognition literature and related technologies, this paper introduces the two main tasks of speaker recognition, speaker verification and speaker identification, and reviews several models from the development of the field. From the early Gaussian Mixture Model-Universal Background Model (GMM-UBM), through Joint Factor Analysis (JFA) and the i-vector model, to the more recent feature models combined with deep learning, recognition performance has steadily improved and the scenarios in which speakers can be recognized have become more complex. Finally, the paper summarizes speaker recognition technology and discusses prospects for future research.
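
The two tasks named in the abstract differ mainly in how a score is used: verification makes a one-to-one accept/reject decision against a claimed identity, while identification picks the best match among all enrolled speakers. The sketch below illustrates that difference only. It assumes each utterance has already been reduced to a fixed-length vector (as an i-vector or deep embedding model would do) and uses cosine similarity as the score; the function names, the toy 3-dimensional vectors, and the 0.7 threshold are hypothetical and are not taken from the paper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verify(test_emb, claimed_emb, threshold=0.7):
    """Speaker verification (1:1): accept or reject a claimed identity.

    The threshold is an illustrative assumption; in practice it is tuned
    on held-out data.
    """
    return cosine(test_emb, claimed_emb) >= threshold


def identify(test_emb, enrolled):
    """Speaker identification (1:N): return the best-scoring enrolled speaker."""
    return max(enrolled, key=lambda name: cosine(test_emb, enrolled[name]))


# Toy stand-ins for speaker vectors; real ones have hundreds of dimensions.
enrolled = {"alice": [0.9, 0.1, 0.2], "bob": [0.1, 0.8, 0.3]}
test_utterance = [0.85, 0.15, 0.25]

print(identify(test_utterance, enrolled))
print(verify(test_utterance, enrolled["alice"]))
print(verify(test_utterance, enrolled["bob"]))
```

Identification as written is closed-set: it always returns some enrolled speaker. An open-set variant would combine both steps and reject the best match when its score falls below the threshold.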

Published in

IPEC '22: Proceedings of the 3rd Asia-Pacific Conference on Image Processing, Electronics and Computers
April 2022
1065 pages
ISBN:9781450395786
DOI:10.1145/3544109

Copyright © 2022 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Biometric recognition
Feature model
Speaker identification
Speaker verification
Qualifiers
- research-article
- Research
- Refereed limited