Target speech feature extraction using non-parametric correlation coefficient

Oh, Sang Yeob; Chung, Kyung-Yong

doi:10.1007/s10586-013-0284-5

Published: 19 June 2013

Volume 17, pages 893–899, (2014)

Cluster Computing

Sang Yeob Oh¹ &
Kyung-Yong Chung²

Abstract

Speech recognition systems for automobiles have several weaknesses, including recognition failures caused by environmental noise from inside and outside the car and by other voices mixed into the input. This paper therefore presents a technique for extracting only the selected target voice from an input signal that is a mixture of voices and noise. For selective speech extraction, the method composes a correlation map of auditory elements using the similarity between channels and the continuity over time, and extracts speech features using a non-parametric correlation coefficient. The proposed method was validated by showing that its average separation distortion decreased by 0.8630 dB. Selective feature extraction using cross-correlation performed well, but overall, selective feature extraction using the non-parametric correlation performed better.
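The abstract does not say which non-parametric coefficient the authors use or how the correlation map is laid out, so the following is only a minimal sketch under stated assumptions: it takes Spearman's rank correlation as the non-parametric measure and compares it with an ordinary (Pearson) correlation on adjacent channels of a single frame. The function names (`average_ranks`, `pearson`, `spearman`, `adjacent_channel_similarity`) and the toy channel data are hypothetical and are not taken from the paper.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Ordinary linear correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def adjacent_channel_similarity(channels):
    """Assumed map row: rank correlation between each pair of neighbouring channels."""
    return [spearman(channels[c], channels[c + 1]) for c in range(len(channels) - 1)]


# Toy frame with three hypothetical channel envelopes (not data from the paper).
frame = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [1.0, 4.0, 9.0, 16.0, 25.0],   # monotonic but non-linear in channel 0
    [5.0, 4.0, 3.0, 2.0, 1.0],     # reversed ordering
]
similarity_row = adjacent_channel_similarity(frame)
linear_score = pearson(frame[0], frame[1])
```

In this toy frame the rank-based score between the first two channels is at its maximum because their ordering agrees sample by sample, while the linear score falls short of it. That illustrates one plausible reason a rank-based measure might group channels of the same source more reliably than a linear one; whether this matches the authors' reasoning cannot be confirmed from the abstract alone.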

Acknowledgements

This work was supported by the Gachon University research fund of 2013 (GCU-2013-R107).

Author information

Authors and Affiliations

Department of Interactive Media, Gachon University, Bokjeong-dong, Sujeong-gu, Seongnam-si, Gyeonggi-do, 461-701, Korea
Sang Yeob Oh
School of Computer Information Engineering, Sangji University, 63, Usan-dong, Wonju-si, Gangwon-do, 220-702, Korea
Kyung-Yong Chung

Corresponding author

Correspondence to Sang Yeob Oh.

About this article

Cite this article

Oh, S.Y., Chung, KY. Target speech feature extraction using non-parametric correlation coefficient. Cluster Comput 17, 893–899 (2014). https://doi.org/10.1007/s10586-013-0284-5

Received: 16 April 2013
Accepted: 31 May 2013
Published: 19 June 2013
Issue Date: September 2014
DOI: https://doi.org/10.1007/s10586-013-0284-5
