Robust feature extraction from spectrum estimated using bispectrum for speaker recognition

Ajmera, Pawan K.; Nehe, Navnath S.; Jadhav, Dattatray V.; Holambe, Raghunath S.

doi:10.1007/s10772-012-9153-5

Published: 06 June 2012

Volume 15, pages 433–440, (2012)

International Journal of Speech Technology

Abstract

Extracting robust features from noisy speech signals is one of the challenging problems in speaker recognition. Because the bispectrum, like all higher-order spectra, is identically zero for a Gaussian process, it suppresses additive white Gaussian noise while preserving the magnitude and phase information of the original signal. This property allows the spectrum of the original signal to be recovered from its noisy version. Robust Mel Frequency Cepstral Coefficients (MFCC), denoted Bispectral-MFCC (BMFCC), are extracted from the estimated spectral magnitude. The effectiveness of BMFCC has been tested on the TIMIT and SGGS databases in noisy environments. The proposed BMFCC features yield speaker recognition rates of 95.30 %, 97.26 % and 94.22 % on the TIMIT, SGGS and SGGS2 databases, respectively, at 20 dB SNR; at 0 dB SNR the corresponding rates are 45.84 %, 50.79 % and 44.98 %. The experimental results show that the proposed technique outperforms conventional methods on all databases.
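
The Gaussian-suppression property that the abstract relies on can be illustrated with a small numerical sketch. It is not the authors' spectrum-recovery algorithm or the BMFCC pipeline, neither of which is specified on this page. It only assumes a standard direct (FFT-based) bispectrum estimate, B(f1, f2) = E[X(f1) X(f2) X*(f1 + f2)], averaged over non-overlapping segments. The function name `bispectrum_direct`, the segment length `nfft` and the synthetic test signals are hypothetical choices made for illustration.

```python
import numpy as np

def bispectrum_direct(x, nfft=32):
    """Direct bispectrum estimate, averaged over non-overlapping segments.

    Illustrative helper (hypothetical name), not taken from the paper.
    """
    x = np.asarray(x, dtype=float)
    n_seg = len(x) // nfft
    segs = x[: n_seg * nfft].reshape(n_seg, nfft)
    segs = segs - segs.mean(axis=1, keepdims=True)   # remove the mean of each segment
    k = np.arange(nfft)
    idx_sum = (k[:, None] + k[None, :]) % nfft       # bin index of f1 + f2 (circular)
    B = np.zeros((nfft, nfft), dtype=complex)
    for X in np.fft.fft(segs, axis=1):
        # Triple product X(f1) X(f2) X*(f1 + f2) for this segment
        B += np.outer(X, X) * np.conj(X[idx_sum])
    # Dividing by nfft makes a white process with third-order cumulant c3
    # give values of roughly c3
    return B / (n_seg * nfft)

rng = np.random.default_rng(0)
n = 2 ** 16
gaussian = rng.standard_normal(n)        # symmetric Gaussian noise
skewed = rng.exponential(1.0, n) - 1.0   # zero-mean, unit-variance, but skewed

gauss_level = np.abs(bispectrum_direct(gaussian)).mean()
skew_level = np.abs(bispectrum_direct(skewed)).mean()
print(f"mean |B|, Gaussian noise: {gauss_level:.3f}")
print(f"mean |B|, skewed noise:   {skew_level:.3f}")
```

Both signals have the same variance, so their power spectra are indistinguishable. The averaged bispectrum of the Gaussian sequence shrinks towards zero as more segments are averaged, whereas that of the skewed sequence does not. This is the sense in which a bispectrum-based estimate can discard additive Gaussian noise while retaining information about a non-Gaussian signal.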

Author information

Authors and Affiliations

TSSM’s Bhivarabai Sawant College of Engineering and Research, Pune, MS, India
Pawan K. Ajmera & Dattatray V. Jadhav
J.S.P.M. Narhe Technical Campus, Rajarshi Shau School of Engineering and Research, Pune, MS, India
Navnath S. Nehe
S.G.G.S. Institute Engineering & Technology, Vishnupuri, Nanded, MS, India
Raghunath S. Holambe

Corresponding author

Correspondence to Pawan K. Ajmera.

About this article

Cite this article

Ajmera, P.K., Nehe, N.S., Jadhav, D.V. et al. Robust feature extraction from spectrum estimated using bispectrum for speaker recognition. Int J Speech Technol 15, 433–440 (2012). https://doi.org/10.1007/s10772-012-9153-5

Received: 15 March 2012
Accepted: 26 May 2012
Published: 06 June 2012
Issue Date: September 2012
DOI: https://doi.org/10.1007/s10772-012-9153-5
