Higher order information set based features for text-independent speaker identification

Medikonda, Jeevan; Madasu, Hanmandlu

doi:10.1007/s10772-017-9472-7

Published: 27 November 2017

Volume 21, pages 451–461, (2018)

International Journal of Speech Technology

Abstract

In this paper, Type-2 Information Set (T2IS) features and Hanman Transform (HT) features are proposed as Higher Order Information Set (HOIS) based features for text-independent speaker recognition. Speech signals from different speakers, represented by Mel Frequency Cepstral Coefficients (MFCC), are converted into T2IS and HT features by taking into account the cepstral and temporal possibilistic uncertainties. The features are classified using the Improved Hanman Classifier (IHC), Support Vector Machine (SVM) and k-Nearest Neighbours (kNN). The proposed approaches are evaluated in terms of speed, computational complexity, memory requirement and accuracy on three datasets: NIST-2003, the VoxForge 2014 speech corpus and the VCTK speech corpus. They are compared with baseline features such as MFCC, ∆MFCC, ∆∆MFCC and GFCC under white Gaussian noise at different signal-to-noise ratios. The proposed features have a reduced feature size, computation time and complexity, and their performance does not degrade in noisy conditions.
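
The abstract mentions testing under white Gaussian noise at different signal-to-noise ratios but does not say how the noisy signals were produced. As an illustration only, the sketch below shows one common way to corrupt a clean signal at a target SNR, using the definition SNR_dB = 10 log10(P_signal / P_noise). The function names, the synthetic 440 Hz tone, and the 16 kHz sampling rate are assumptions made for this example and are not taken from the paper.

```python
import math
import random


def add_white_gaussian_noise(signal, snr_db, rng=None):
    """Return a noisy copy of `signal` with approximately `snr_db` dB SNR.

    Hypothetical helper for illustration; not the authors' procedure.
    """
    rng = rng or random.Random()
    if not signal:
        return []
    # Average power of the clean signal.
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR_dB = 10*log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR_dB/10)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    # Zero-mean Gaussian samples drawn independently per sample are white.
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Empirical SNR in dB between a clean signal and its noisy version."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10.0 * math.log10(signal_power / noise_power)


# Assumed test input: one second of a 440 Hz tone sampled at 16 kHz.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noisy = add_white_gaussian_noise(clean, snr_db=10.0, rng=rng)
```

Calling `measured_snr_db(clean, noisy)` on the result gives a value close to the requested 10 dB; the small deviation comes from the finite number of noise samples. In a robustness experiment, features would be extracted from `noisy` at several `snr_db` settings and compared against those from `clean`.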

Author information

Authors and Affiliations

Department of Biomedical Engineering, Manipal University, Manipal, Karnataka, 576104, India
Medikonda Jeevan
Department of Electrical Engineering, IIT Delhi, New Delhi, 110016, India
Madasu Hanmandlu

Corresponding author

Correspondence to Medikonda Jeevan.

About this article

Cite this article

Medikonda, J., Madasu, H. Higher order information set based features for text-independent speaker identification. Int J Speech Technol 21, 451–461 (2018). https://doi.org/10.1007/s10772-017-9472-7

Received: 06 July 2017
Accepted: 02 November 2017
Published: 27 November 2017
Issue Date: September 2018
DOI: https://doi.org/10.1007/s10772-017-9472-7
