Abstract
Mobile phone applications based on speaker-independent Automatic Speech Recognition (ASR) systems are gaining popularity due to technological advancements and accessibility. Speech-based applications can make mobile phones more accessible and comfortable for people performing activities where hands-free access is desirable, e.g. drivers, athletes and machine operators. Similarly, users with disabilities such as low vision, blindness or physical impairments may use them as an assistive technology. Developing an ASR system for a specific language requires an accurate, reliable and efficient acoustic model together with a language-specific pronunciation dictionary. Punjabi is one of the most widely spoken languages in the world, with more than 150 million speakers. Three acoustic models (continuous, semi-continuous and phonetically-tied) are developed, based on three pronunciation dictionaries (word, sub-word and character based). Analysis of the performance results validates the Punjabi language principle “One word one sound”: the character-based pronunciation dictionary gives better accuracy and reliability than the others. Further, the phonetically-tied model outperforms the others in terms of accuracy, word error rate and size, owing to its reasonable number of Gaussians.
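The abstract compares models by word error rate without defining it. As a minimal sketch, assuming the common definition (substitutions + deletions + insertions, divided by the number of reference words), the metric can be computed with a word-level edit distance. The function name `word_error_rate` and the example command strings are hypothetical illustrations and are not taken from the paper, whose exact scoring procedure is not stated here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words.

    Assumes whitespace-separated words and equal cost (1) for each
    substitution, deletion and insertion.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                       # deletion of a reference word
                curr[j - 1] + 1,                   # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,   # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical command strings, used only to exercise the function.
    print(word_error_rate("call home now", "call phone now"))
```

Under this definition a lower value is better, and the value can exceed 1.0 when the hypothesis contains many inserted words.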
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Mittal, P., Singh, N. (2018). Speaker-Independent Automatic Speech Recognition System for Mobile Phone Applications in Punjabi. In: Thampi, S., Krishnan, S., Corchado Rodriguez, J., Das, S., Wozniak, M., Al-Jumeily, D. (eds) Advances in Signal Processing and Intelligent Recognition Systems. SIRS 2017. Advances in Intelligent Systems and Computing, vol 678. Springer, Cham. https://doi.org/10.1007/978-3-319-67934-1_33
DOI: https://doi.org/10.1007/978-3-319-67934-1_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67933-4
Online ISBN: 978-3-319-67934-1
eBook Packages: Engineering, Engineering (R0)