Abstract
This work presents a method for delimiting allophone boundaries in isolated English words. The method is based on the dynamic time warping (DTW) algorithm, which aligns two signals: a reference signal and an analyzed one. The reference signals are words extracted from recordings in the MODALITY database, which was also used for the tests described here. The test results show that allophone boundaries in English words can be determined automatically with good accuracy. Tests were carried out to determine the marking error for individual allophone boundaries and the cost of matching a given allophone to its reference. Based on this cost, a coefficient is introduced that expresses, as a percentage, how similar the automatically marked allophone is to the reference one. This coefficient can be used to assess the correctness of an allophone's pronunciation. Possibilities for further research and development of the method are also discussed.
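To make the alignment idea concrete, the following is a minimal sketch of how boundaries marked in a reference signal could be transferred to an analyzed signal through a DTW warping path. It is not the authors' implementation. The function names `dtw_path` and `transfer_boundaries`, the use of one scalar feature per frame, and the absolute-difference local cost are all assumptions made for illustration; a real system would align vectors of acoustic features.

```python
from typing import List, Tuple


def dtw_path(ref: List[float], sig: List[float]) -> Tuple[float, List[Tuple[int, int]]]:
    """Classic DTW; returns the total matching cost and the warping path.

    Assumption: each frame is one scalar and the local cost is |a - b|.
    """
    n, m = len(ref), len(sig)
    inf = float("inf")
    # acc[i][j] = minimal accumulated cost of aligning ref[:i] with sig[:j]
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(ref[i - 1] - sig[j - 1])
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])

    # Backtrack from the last cell to recover the frame-to-frame alignment.
    path: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (acc[i - 1][j - 1], i - 1, j - 1),
            (acc[i - 1][j], i - 1, j),
            (acc[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
    path.reverse()
    return acc[n][m], path


def transfer_boundaries(path: List[Tuple[int, int]], ref_boundaries: List[int]) -> List[int]:
    """Map each reference boundary frame to the first analyzed frame aligned with it."""
    return [next(j for i, j in path if i == b) for b in ref_boundaries]


# Toy example: the analyzed signal is a time-stretched copy of the reference.
reference = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]
analyzed = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0]
total_cost, warping_path = dtw_path(reference, analyzed)
boundaries = transfer_boundaries(warping_path, [2, 4])
print(total_cost, boundaries)  # 0.0 [3, 6]
```

In this toy case the reference boundaries at frames 2 and 4 land on frames 3 and 6 of the analyzed signal, and the total cost is zero because the two signals differ only in timing. The abstract states that a similarity coefficient is derived from the matching cost, but it does not give the formula, so none is sketched here.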
Acknowledgments
Research sponsored by the Polish National Science Centre, Dec. No. 2015/17/B/ST6/01874.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Rafałko, J., Czyżewski, A. (2020). Automatic Marking of Allophone Boundaries in Isolated English Spoken Words. In: Saeed, K., Dvorský, J. (eds) Computer Information Systems and Industrial Management. CISIM 2020. Lecture Notes in Computer Science, vol 12133. Springer, Cham. https://doi.org/10.1007/978-3-030-47679-3_5
DOI: https://doi.org/10.1007/978-3-030-47679-3_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-47678-6
Online ISBN: 978-3-030-47679-3
eBook Packages: Computer Science, Computer Science (R0)