Pronunciation Feature Extraction

Hacker, Christian; Cincarek, Tobias; Gruhn, Rainer; Steidl, Stefan; Nöth, Elmar; Niemann, Heinrich

doi:10.1007/11550518_18

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 3663)

Included in the following conference series:

Joint Pattern Recognition Symposium

Abstract

Automatic pronunciation scoring makes novel applications for computer-assisted language learning possible. In this paper we concentrate on feature extraction. A relatively large feature vector with 28 sentence-level and 33 word-level features has been designed. At the word level, words are classified as correctly pronounced or mispronounced; at the sentence level, utterances are rated with 5 discrete marks. The features are evaluated on two databases with non-native adults’ and children’s speech, respectively. A class-wise averaged recognition rate of up to 72 % is achieved for the 2-class problem; the result of the 5-class problem can be interpreted as an 80 % recognition rate.
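
The abstract reports a class-wise averaged recognition rate without defining it. The sketch below assumes the common reading: the recall of each class is computed separately and the per-class values are averaged with equal weight, so a rare class (such as mispronounced words) counts as much as a frequent one. The function names and the toy labels are illustrative only and are not taken from the paper.

```python
from collections import defaultdict


def classwise_averaged_recognition_rate(reference, predicted):
    """Unweighted mean of per-class recall (assumed definition, see lead-in)."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    if not reference:
        raise ValueError("at least one labeled item is required")

    total = defaultdict(int)    # items per reference class
    correct = defaultdict(int)  # correctly recognized items per reference class
    for ref, hyp in zip(reference, predicted):
        total[ref] += 1
        if ref == hyp:
            correct[ref] += 1

    # Average the per-class recalls; every class gets the same weight.
    return sum(correct[c] / total[c] for c in total) / len(total)


def overall_recognition_rate(reference, predicted):
    """Plain accuracy, shown only for comparison."""
    hits = sum(ref == hyp for ref, hyp in zip(reference, predicted))
    return hits / len(reference)


# Hypothetical word-level labels: 8 correctly pronounced words, 2 mispronounced.
reference = ["correct"] * 8 + ["mispronounced"] * 2
predicted = ["correct"] * 8 + ["correct", "mispronounced"]

cl = classwise_averaged_recognition_rate(reference, predicted)
rr = overall_recognition_rate(reference, predicted)
print(f"class-wise averaged: {cl:.2f}, overall: {rr:.2f}")
```

On this toy data the overall rate is 9/10 while the class-wise average is (8/8 + 1/2) / 2, which illustrates why the class-wise figure is the more conservative measure when the classes are unbalanced.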


Author information

Authors and Affiliations

Lehrstuhl für Mustererkennung, Universität Erlangen-Nürnberg, Martensstraße 3, D-91058, Erlangen, Germany
Christian Hacker, Stefan Steidl, Elmar Nöth & Heinrich Niemann
ATR Spoken Language Translation Res. Labs., Kyoto, Japan
Tobias Cincarek & Rainer Gruhn

Editor information

Editors and Affiliations

PRIP, Vienna University of Technology, Austria
Walter G. Kropatsch
Vienna University of Technology, Vienna, Austria
Robert Sablatnig
Pattern Recognition and Image Processing Group, Institute of Computer-Aided Automation, Vienna University of Technology, Favoritenstraße 9/1832, A-1040, Vienna, Austria
Allan Hanbury

About this paper

Cite this paper

Hacker, C., Cincarek, T., Gruhn, R., Steidl, S., Nöth, E., Niemann, H. (2005). Pronunciation Feature Extraction. In: Kropatsch, W.G., Sablatnig, R., Hanbury, A. (eds) Pattern Recognition. DAGM 2005. Lecture Notes in Computer Science, vol 3663. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11550518_18

DOI: https://doi.org/10.1007/11550518_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28703-2
Online ISBN: 978-3-540-31942-9
eBook Packages: Computer Science, Computer Science (R0)
