Abstract
This paper presents a framework for evaluating phonetic feature extraction engines in a phone identification task. The case study involves HMM-based feature extraction engines for the features fricative and vocalic, which are evaluated both at the feature level and on how they perform when coupled with a knowledge-based phone identification model. An exact comparison model is defined, and the performance of the feature extraction engines is measured as the degradation in accuracy when each individual feature, or combination of features, is introduced incrementally into the input data. This type of diagnostic evaluation allows a more detailed investigation of how each feature affects the performance of the system as a whole, and it provides insights for improving feature extraction engines in the context of automatic speech recognition.
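The incremental evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy lookup table standing in for the knowledge-based phone identification model, the four-frame data, and the use of frame-level exact match as the accuracy measure are all assumptions made for illustration. The idea is to start from reference ("gold") feature streams, swap in the engine output for one feature or a combination of features, and record how far phone accuracy drops from the all-gold baseline.

```python
from itertools import combinations


def accuracy(hypothesis, reference):
    """Fraction of frames whose hypothesised phone equals the reference phone."""
    matches = sum(h == r for h, r in zip(hypothesis, reference))
    return matches / len(reference)


def degradation_by_feature(gold, extracted, identify, reference):
    """Measure the accuracy drop when engine output replaces gold features.

    gold, extracted: dicts mapping a feature name to a per-frame value list.
    identify: callable turning a feature dict into a per-frame phone list
              (hypothetical stand-in for a phone identification model).
    Returns the all-gold baseline accuracy and a dict that maps each
    combination of substituted features to its degradation.
    """
    baseline = accuracy(identify(gold), reference)
    results = {}
    names = sorted(extracted)
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            mixed = dict(gold)  # shallow copy; streams are replaced, not mutated
            for name in combo:
                mixed[name] = extracted[name]
            results[combo] = baseline - accuracy(identify(mixed), reference)
    return baseline, results


def toy_identify(features):
    """Hypothetical rule table: (fricative, vocalic) values to a phone label."""
    table = {(1, 0): "s", (0, 1): "a"}
    frames = zip(features["fricative"], features["vocalic"])
    return [table.get(frame, "?") for frame in frames]


# Invented four-frame example; each engine stream contains one frame error.
GOLD = {"fricative": [1, 0, 1, 0], "vocalic": [0, 1, 0, 1]}
EXTRACTED = {"fricative": [1, 0, 0, 0], "vocalic": [0, 1, 0, 0]}
REFERENCE = ["s", "a", "s", "a"]

baseline, results = degradation_by_feature(GOLD, EXTRACTED, toy_identify, REFERENCE)
for combo, drop in results.items():
    print(combo, drop)
```

In this toy setting each feature on its own costs one frame in four, and substituting both costs two frames in four, so the errors of the two engines add up rather than overlap. Reporting results per combination is what makes the evaluation diagnostic: it shows which feature, or which interaction between features, accounts for the loss.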
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Aioanei, D., Carson-Berndsen, J., Kanokphara, S. (2006). Diagnostic Evaluation of Phonetic Feature Extraction Engines: A Case Study with the Time Map Model. In: Ali, M., Dapoigny, R. (eds) Advances in Applied Artificial Intelligence. IEA/AIE 2006. Lecture Notes in Computer Science, vol 4031. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11779568_75
DOI: https://doi.org/10.1007/11779568_75
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-35453-6
Online ISBN: 978-3-540-35454-3
eBook Packages: Computer Science, Computer Science (R0)