ABSTRACT
For any system, reliability and construction cost have always been two major determinants of whether it can be used in daily life. In the field of voiceprint recognition, users are often forced to choose between accuracy and convenience. This paper examines the performance of two speaker verification models in different environments and asks whether a balance can be struck between cost and result. The Gaussian mixture model with a universal background model (GMM-UBM) and a deep-learning method are selected to represent two common approaches to speaker verification. Comparing the two models, we find that the deep-learning method depends more heavily on large training datasets: it performs worse than the GMM-UBM model when both are trained on the same small dataset containing only a few samples, whereas both methods reach nearly 100% accuracy when given a sufficiently large training set. Moreover, despite attempts to raise accuracy by tuning the configuration of both models, excellent performance appears only when large amounts of training data are available and little noise is present.
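To make the GMM-UBM approach mentioned above concrete, the following is a minimal, hedged sketch of the standard pipeline: train a universal background model on pooled background speech features, adapt it to a target speaker with a simplified MAP mean adaptation, and score a trial utterance by the log-likelihood ratio between the adapted model and the UBM. The synthetic Gaussian "features" stand in for MFCC frames, and the component count and relevance factor are illustrative assumptions, not values from the paper.

```python
# GMM-UBM verification sketch (illustrative, not the paper's implementation).
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Synthetic stand-ins for MFCC feature frames (rows = frames, cols = coefficients).
background = rng.normal(0.0, 1.0, size=(2000, 13))    # pooled "many speakers"
target_enroll = rng.normal(0.8, 1.0, size=(300, 13))  # enrollment data, one speaker
trial = rng.normal(0.8, 1.0, size=(100, 13))          # test utterance (same speaker)

# 1. Train the universal background model on the pooled background data.
ubm = GaussianMixture(n_components=8, covariance_type="diag", random_state=0)
ubm.fit(background)

# 2. Simplified MAP adaptation: shift each component mean toward the
#    enrollment data, weighted by that component's soft occupancy count.
resp = ubm.predict_proba(target_enroll)        # responsibilities (frames x components)
n_k = resp.sum(axis=0) + 1e-10                 # soft count per component
e_k = resp.T @ target_enroll / n_k[:, None]    # per-component enrollment mean
relevance = 16.0                               # assumed relevance factor
alpha = (n_k / (n_k + relevance))[:, None]

speaker = GaussianMixture(n_components=8, covariance_type="diag")
speaker.weights_ = ubm.weights_                # weights and covariances kept from UBM
speaker.covariances_ = ubm.covariances_
speaker.precisions_cholesky_ = ubm.precisions_cholesky_
speaker.means_ = alpha * e_k + (1 - alpha) * ubm.means_  # adapted means only

# 3. Verification score: mean per-frame log-likelihood ratio.
llr = speaker.score(trial) - ubm.score(trial)
print(f"LLR = {llr:.3f}")  # positive score -> accept the claimed identity
```

Because the trial frames come from the same distribution as the enrollment data, the adapted model should assign them a higher likelihood than the UBM does, giving a positive score; in practice the accept/reject threshold would be tuned on held-out trials.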