Annotation Error Detection: Anomaly Detection vs. Classification

Matoušek, Jindřich; Tihelka, Daniel

doi:10.1007/978-3-319-66429-3_13

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10458)

Included in the following conference series:

International Conference on Speech and Computer

Abstract

We compare two approaches to the automatic detection of annotation errors in single-speaker read-speech corpora used for speech synthesis: anomaly-based and classification-based detection. The two approaches differ principally in their training data: the classification-based approach needs both correctly annotated and misannotated words for training, whereas the anomaly-based approach needs only correctly annotated words for training (plus a few misannotated words for validation). We show that both approaches lead to statistically comparable results when all available misannotated words are used during detector/classifier development. However, when fewer misannotated words are used, the anomaly detection framework clearly outperforms the classification-based approach. A final listening test showed that annotation error detection is effective in improving the quality of synthetic speech.
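The difference in training data described in the abstract can be illustrated with a minimal sketch. It does not reproduce the detectors, classifiers, or features used in the paper: the per-feature Gaussian scorer and the nearest-centroid classifier are simple stand-ins chosen for brevity, and the 2-D features, cluster centres, sample sizes, and all function and class names are hypothetical.

```python
import random
import statistics


def make_words(n, center, rng):
    """Draw n hypothetical 2-D word feature vectors scattered around `center`."""
    return [(rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0)) for _ in range(n)]


def column_means(rows):
    """Per-feature mean of a list of feature vectors."""
    return [statistics.mean(col) for col in zip(*rows)]


class GaussianAnomalyDetector:
    """Stand-in anomaly detector: trained on correctly annotated words only."""

    def fit(self, correct):
        self.mu = column_means(correct)
        self.sigma = [statistics.stdev(col) for col in zip(*correct)]
        return self

    def score(self, x):
        # Sum of squared z-scores; a larger value means a more anomalous word.
        return sum(((v - m) / s) ** 2 for v, m, s in zip(x, self.mu, self.sigma))

    def tune(self, val_correct, val_misannotated):
        # The few misannotated words are used only here, to pick the threshold.
        labelled = [(self.score(x), False) for x in val_correct]
        labelled += [(self.score(x), True) for x in val_misannotated]

        def accuracy(t):
            return sum((s > t) == bad for s, bad in labelled) / len(labelled)

        self.threshold = max((s for s, _ in labelled), key=accuracy)
        return self

    def predict(self, x):
        """Return True if the word is flagged as misannotated."""
        return self.score(x) > self.threshold


class NearestCentroidClassifier:
    """Stand-in classifier: needs both classes at training time."""

    def fit(self, correct, misannotated):
        self.c_ok = column_means(correct)
        self.c_bad = column_means(misannotated)
        return self

    def predict(self, x):
        d_ok = sum((v - c) ** 2 for v, c in zip(x, self.c_ok))
        d_bad = sum((v - c) ** 2 for v, c in zip(x, self.c_bad))
        return d_bad < d_ok


rng = random.Random(0)                       # fixed seed for repeatability
train_ok = make_words(200, (0.0, 0.0), rng)  # correctly annotated words
val_ok = make_words(20, (0.0, 0.0), rng)     # held-out correct words
few_bad = make_words(5, (6.0, 6.0), rng)     # only a handful of misannotated words

# Anomaly detection: fit on correct words, validate with the few errors.
detector = GaussianAnomalyDetector().fit(train_ok).tune(val_ok, few_bad)

# Classification: the same few errors must serve as a training class.
classifier = NearestCentroidClassifier().fit(train_ok, few_bad)
```

In this toy setting the two classes are well separated, so both models flag an obvious outlier. The sketch only shows where the misannotated words enter each pipeline; it says nothing about the relative performance reported in the paper.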

Acknowledgments

This research was supported by the Czech Science Foundation (GA CR), project No. GA16-04420S. Access to the MetaCentrum clusters provided under the programme LM2015042 is highly appreciated.

Author information

Authors and Affiliations

Department of Cybernetics, University of West Bohemia, Pilsen, Czech Republic
Jindřich Matoušek
Faculty of Applied Sciences, New Technology for the Information Society (NTIS), University of West Bohemia, Pilsen, Czech Republic
Jindřich Matoušek & Daniel Tihelka

Authors

Jindřich Matoušek
Daniel Tihelka

Corresponding author

Correspondence to Jindřich Matoušek.

Editor information

Editors and Affiliations

SPIIRAS, Saint Petersburg, Russia
Alexey Karpov
Moscow State Linguistic University, Moscow, Russia
Rodmonga Potapova
University of Hertfordshire, Hatfield, United Kingdom
Iosif Mporas

Cite this paper

Matoušek, J., Tihelka, D. (2017). Annotation Error Detection: Anomaly Detection vs. Classification. In: Karpov, A., Potapova, R., Mporas, I. (eds) Speech and Computer. SPECOM 2017. Lecture Notes in Computer Science, vol 10458. Springer, Cham. https://doi.org/10.1007/978-3-319-66429-3_13

DOI: https://doi.org/10.1007/978-3-319-66429-3_13
Published: 13 August 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-66428-6
Online ISBN: 978-3-319-66429-3
eBook Packages: Computer Science, Computer Science (R0)
