Abstract
In this paper we propose an endpoint detection system based on several features extracted from each speech frame, followed by a robust classifier (i.e., AdaBoost and Bagging of decision trees, and a multilayer perceptron) and a finite state automaton (FSA). The FSA module consisted of a 4-state decision logic that filtered false alarms and false positives. We compare four different classifiers in this task. The proposed method used a look-ahead of 7 frames, the number of frames that maximized the accuracy of the system. The system was tested with real signals recorded inside a car, with signal-to-noise ratios ranging from 6 dB to 30 dB. Finally, we present experimental results demonstrating that the system yields robust endpoint detection.
This work has been partially supported by the Spanish CICyT project ALIADO, the EU integrated project CHIL and the University of Vic under the grant R0912.
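The abstract describes a 4-state decision logic that filters the frame-level classifier output, but it does not give the states or the transition rules. The following is only a minimal sketch of how such a filter might look, assuming a binary speech/non-speech decision per frame. The state names, the function `smooth_decisions`, and the thresholds `min_speech` and `min_silence` are hypothetical; the default of 7 merely echoes the look-ahead figure quoted above and is not taken from the paper's method.

```python
from enum import Enum


class State(Enum):
    SILENCE = 0  # confirmed non-speech
    ONSET = 1    # tentative speech, not yet confirmed
    SPEECH = 2   # confirmed speech
    OFFSET = 3   # tentative end of speech, not yet confirmed


def smooth_decisions(frame_is_speech, min_speech=3, min_silence=7):
    """Turn per-frame binary decisions into (start, end) segments.

    `end` is exclusive. Both thresholds are hypothetical tuning
    parameters and are assumed to be at least 2.
    """
    state = State.SILENCE
    segments = []
    start = 0         # first frame of the current candidate segment
    offset_start = 0  # first frame of the current candidate pause
    run = 0           # length of the current tentative run

    for i, is_speech in enumerate(frame_is_speech):
        if state is State.SILENCE:
            if is_speech:
                state, start, run = State.ONSET, i, 1
        elif state is State.ONSET:
            if is_speech:
                run += 1
                if run >= min_speech:
                    state = State.SPEECH
            else:
                # Too short to be speech: discard the isolated detection.
                state = State.SILENCE
        elif state is State.SPEECH:
            if not is_speech:
                state, offset_start, run = State.OFFSET, i, 1
        else:  # State.OFFSET
            if is_speech:
                # Short pause inside an utterance: stay in the segment.
                state = State.SPEECH
            else:
                run += 1
                if run >= min_silence:
                    segments.append((start, offset_start))
                    state = State.SILENCE

    # Close a segment that is still open when the input ends.
    if state is State.SPEECH:
        segments.append((start, len(frame_is_speech)))
    elif state is State.OFFSET:
        segments.append((start, offset_start))
    return segments


frames = [0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
print(smooth_decisions(frames))
```

In this toy input the isolated detection at frame 1 is dropped and the two-frame pause at frames 8 and 9 is bridged, so a single segment is reported. Confirming an endpoint only after several further frames is one way a look-ahead can trade latency for fewer spurious endpoints.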
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Comas, C., Monte-Moreno, E., Solé-Casals, J. (2005). A Robust Multiple Feature Approach to Endpoint Detection in Car Environment Based on Advanced Classifiers. In: Cabestany, J., Prieto, A., Sandoval, F. (eds) Computational Intelligence and Bioinspired Systems. IWANN 2005. Lecture Notes in Computer Science, vol 3512. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11494669_104
DOI: https://doi.org/10.1007/11494669_104
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26208-4
Online ISBN: 978-3-540-32106-4
eBook Packages: Computer Science, Computer Science (R0)