Automatic Marking of Allophone Boundaries in Isolated English Spoken Words

Rafałko, Janusz; Czyżewski, Andrzej

doi:10.1007/978-3-030-47679-3_5

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12133)

Included in the following conference series:

International Conference on Computer Information Systems and Industrial Management

Abstract

This work presents a method for marking allophone boundaries in isolated English words. The method is based on the dynamic time warping (DTW) algorithm, which aligns two signals: a reference signal and the analyzed one. Recordings from the MODALITY database, from which the individual words were extracted, served as the reference signals; the same database was used for the tests described here. The test results show that allophone boundaries in English words can be determined automatically with good accuracy. The tests measured the error in marking the boundaries of particular allophones and the cost of matching each allophone to its reference. Based on this cost, a coefficient was introduced that expresses, as a percentage, how similar the automatically marked allophone is to the reference one. This coefficient can be used to assess the correctness of an allophone's pronunciation. Possibilities for further research and development of the method are also discussed.
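
The boundary transfer described in the abstract can be illustrated with a minimal sketch of textbook DTW. It is not the authors' implementation: the abstract does not specify the acoustic features, the step pattern, or the formula of the similarity coefficient. The function names (`dtw_align`, `transfer_boundaries`, `similarity_percent`), the toy one-dimensional "features", and the mapping `100 / (1 + mean_cost)` are all assumptions made only for illustration.

```python
import math

def dtw_align(ref, ana):
    """Align two sequences of feature vectors with classic DTW.

    Returns (total_cost, path, dist): path is a list of
    (ref_frame, ana_frame) index pairs, dist the local distance matrix.
    """
    n, m = len(ref), len(ana)
    dist = [[math.dist(r, a) for a in ana] for r in ref]
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i][j] = dist[i - 1][j - 1] + min(
                acc[i - 1][j - 1], acc[i - 1][j], acc[i][j - 1])
    # Backtrack from the end of both signals to recover the warping path.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(acc[i - 1][j - 1], i - 1, j - 1),
                 (acc[i - 1][j], i - 1, j),
                 (acc[i][j - 1], i, j - 1)]
        _, i, j = min(steps, key=lambda s: s[0])
    path.reverse()
    return acc[n][m], path, dist

def transfer_boundaries(path, ref_boundaries):
    """Map each reference boundary frame to the first analyzed frame aligned with it."""
    first_match = {}
    for i, j in path:
        first_match.setdefault(i, j)
    return [first_match[b] for b in ref_boundaries]

def similarity_percent(path, dist, start, end):
    """Hypothetical coefficient: mean matching cost of the reference frames
    start <= i < end, mapped to a percentage (NOT the paper's formula)."""
    costs = [dist[i][j] for i, j in path if start <= i < end]
    mean_cost = sum(costs) / len(costs)
    return 100.0 / (1.0 + mean_cost)

# Toy data: three "allophones" with constant 1-D features; the analyzed
# word is a time-stretched copy of the reference.
ref = [[0.0]] * 3 + [[5.0]] * 3 + [[1.0]] * 2
ana = [[0.0]] * 5 + [[5.0]] * 4 + [[1.0]] * 3
ref_bounds = [3, 6]  # first frame of the 2nd and 3rd allophone in ref

total, path, dist = dtw_align(ref, ana)
ana_bounds = transfer_boundaries(path, ref_bounds)
print(ana_bounds)                               # [5, 9]
print(similarity_percent(path, dist, 3, 6))     # 100.0
```

In this sketch the labelled boundaries of the reference word are carried over to the analyzed word along the warping path, and the matching cost accumulated inside one allophone is turned into a score. Real use would replace the toy features with frame-level acoustic features of the two recordings.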

Acknowledgments

Research sponsored by the Polish National Science Centre, Dec. No. 2015/17/B/ST6/01874.

Author information

Authors and Affiliations

Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland
Janusz Rafałko
Faculty of Electronics, Telecommunications, and Informatics, Gdańsk University of Technology, Gdańsk, Poland
Andrzej Czyżewski

Corresponding author

Correspondence to Janusz Rafałko.

Editor information

Editors and Affiliations

Bialystok University of Technology, Bialystok, Poland
Khalid Saeed
VSB - Technical University of Ostrava, Ostrava, Czech Republic
Jiří Dvorský

About this paper

Cite this paper

Rafałko, J., Czyżewski, A. (2020). Automatic Marking of Allophone Boundaries in Isolated English Spoken Words. In: Saeed, K., Dvorský, J. (eds) Computer Information Systems and Industrial Management. CISIM 2020. Lecture Notes in Computer Science, vol 12133. Springer, Cham. https://doi.org/10.1007/978-3-030-47679-3_5

DOI: https://doi.org/10.1007/978-3-030-47679-3_5
Published: 22 May 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-47678-6
Online ISBN: 978-3-030-47679-3
eBook Packages: Computer Science, Computer Science (R0)
