Speaker-Independent Automatic Speech Recognition System for Mobile Phone Applications in Punjabi

Mittal, Puneet; Singh, Navdeep

doi:10.1007/978-3-319-67934-1_33

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 678)

Included in the following conference series:

International Symposium on Signal Processing and Intelligent Recognition Systems

Abstract

Mobile phone applications built on speaker-independent Automatic Speech Recognition (ASR) systems are gaining popularity due to technological advancements and improved accessibility. Speech-based applications can make mobile phones accessible and comfortable for people engaged in activities where hands-free access is desirable, e.g. drivers, athletes and machine operators. Similarly, users with disabilities such as low vision, blindness or physical impairments may use them as an assistive technology. Developing an ASR system for a specific language requires an accurate, reliable and efficient acoustic model with a language-specific pronunciation dictionary. Punjabi is one of the most widely spoken languages in the world, with more than 150 million speakers. Three acoustic models (continuous, semi-continuous and phonetically-tied) are developed based on three pronunciation dictionaries (word, sub-word and character based). Analysis of the performance results validates the Punjabi language principle “One word one sound”: the character-based pronunciation dictionary yields better accuracy and reliability than the others. Further, the phonetically-tied model outperforms the others in terms of accuracy, word error rate and size, owing to its reasonable number of Gaussians.
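
The abstract compares models by word error rate without defining it. As an illustration only, the sketch below uses the conventional definition: the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for this example; the chapter's own scoring procedure is not described in the abstract.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Hypothetical helper for illustration; tokenizes on whitespace.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref_words)


if __name__ == "__main__":
    # One substituted word and one deleted word against four reference words.
    print(word_error_rate("a b c d", "a x c"))
```

Because the comparison is made on whitespace-separated tokens, the same function can be applied to Punjabi transcripts in Gurmukhi script, provided the reference and the hypothesis are normalized in the same way beforehand.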

Author information

Authors and Affiliations

BBSB Engineering College, Fatehgarh Sahib, India
Puneet Mittal
Mata Gujri College, Fatehgarh Sahib, India
Navdeep Singh

Corresponding author

Correspondence to Puneet Mittal.

Editor information

Editors and Affiliations

School of CS/IT, Indian Institute of Information Technology and Management, Trivandrum, Kerala, India
Sabu M. Thampi
Department of Electrical and Computer Engineering, Ryerson University, Toronto, Ontario, Canada
Sri Krishnan
Department of Computer Science, University of Salamanca, Salamanca, Salamanca, Spain
Juan Manuel Corchado Rodriguez
Electronics and Communication Sciences Unit, Indian Statistical Institute, Kolkata, West Bengal, India
Swagatam Das
Department of Systems and Computer Networks, Wroclaw University of Science and Technology, Wroclaw, Poland
Michal Wozniak
Faculty of Engineering and Technology, Liverpool John Moores University, Liverpool, United Kingdom
Dhiya Al-Jumeily

About this paper

Cite this paper

Mittal, P., Singh, N. (2018). Speaker-Independent Automatic Speech Recognition System for Mobile Phone Applications in Punjabi. In: Thampi, S., Krishnan, S., Corchado Rodriguez, J., Das, S., Wozniak, M., Al-Jumeily, D. (eds) Advances in Signal Processing and Intelligent Recognition Systems. SIRS 2017. Advances in Intelligent Systems and Computing, vol 678. Springer, Cham. https://doi.org/10.1007/978-3-319-67934-1_33

DOI: https://doi.org/10.1007/978-3-319-67934-1_33
Published: 27 September 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67933-4
Online ISBN: 978-3-319-67934-1
eBook Packages: Engineering, Engineering (R0)
