Design and Development of Marathi Speech Interface System

  • Chapter
Advanced Computing and Systems for Security

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 396)

Abstract

Speech is the most prominent and natural form of communication between humans, and it has the potential to become an important mode of interaction with computers. The man–machine interface has long been a challenging area in natural language processing and speech recognition research, and there is growing interest in developing machines that accept speech as input. Most people communicate with a computer through a mouse or keyboard, which requires training, effort, and some knowledge of computers; this is a limitation for many users. Marathi is the official language of the Government of Maharashtra, and there is a need for systems that enable human–machine interaction in Indian regional languages. The objective of this research is to design and develop the Marathi Speech Activated Talking Calculator (MSAC) as an interface system. The MSAC is a speaker-dependent speech recognition system that performs basic mathematical operations. It recognizes isolated spoken numbers from 0 to 50 and basic operations and commands such as addition, subtraction, multiplication, start, stop, equal, and exit. A database is an essential requirement for designing a speech recognition system; to meet the objectives, a database comprising 22,320 vocabulary entries was developed. The MSAC system was trained and tested with each of the following feature extraction techniques individually: Mel Frequency Cepstral Coefficients (MFCC), Linear Discriminant Analysis (LDA), Principal Component Analysis (PCA), Linear Predictive Coding (LPC), and RASTA-PLP. It was also trained and tested individually with the fusion feature extraction techniques Mel Frequency Linear Discriminant Analysis (MFLDA), Mel Frequency Principal Component Analysis (MFPCA), Mel Frequency Discrete Wavelet Transformation (MFDWT), and Mel Frequency Linear Discrete Wavelet Transformation (MFLDWT).
This work also proposes and tests the Wavelet Decomposed Cepstral Coefficient (WDCC) approach with 18, 36, and 54 coefficients. The performance of the MSAC system is evaluated in terms of accuracy and real-time factor (RTF). The experimental results show that MFCC with 39 coefficients achieved higher accuracy than the 13- and 26-coefficient variants. MFLDWT achieved higher accuracy than MFLDA, MFPCA, MFDWT, and Mel Frequency Principal Discrete Wavelet Transformation (MFPDWT). Based on this research, we recommend WDCC as a more robust and dynamic technique than MFCC, LDA, PCA, and LPC. The MSAC interface application can directly benefit people in their day-to-day activities.
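The abstract names accuracy and real-time factor (RTF) as the evaluation measures but does not give their formulas. The sketch below uses the commonly cited definitions, which may differ from the authors' exact procedure: accuracy as the fraction of correctly recognized utterances, and RTF as processing time divided by the duration of the input audio. The function names and the sample labels are hypothetical and serve only as illustration.

```python
def accuracy(predicted, reference):
    """Fraction of utterances whose recognized label equals the reference label."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("at least one utterance is required")
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)


def real_time_factor(processing_seconds, audio_seconds):
    """RTF = time spent recognizing / duration of the audio being recognized."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds


if __name__ == "__main__":
    # Hypothetical labels standing in for recognized calculator words.
    recognized = ["one", "two", "plus", "stop"]
    expected = ["one", "two", "minus", "stop"]
    print("accuracy:", accuracy(recognized, expected))
    # Hypothetical timing: 0.5 s of processing for a 2.0 s utterance.
    print("RTF:", real_time_factor(0.5, 2.0))
```

Under this definition, an RTF below 1 means the recognizer runs faster than the audio plays, which is the usual requirement for an interactive interface.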




Corresponding author

Correspondence to Santosh Gaikwad.


Copyright information

© 2016 Springer India

About this chapter

Cite this chapter

Gaikwad, S., Gawali, B., Mehrotra, S. (2016). Design and Development of Marathi Speech Interface System. In: Chaki, R., Cortesi, A., Saeed, K., Chaki, N. (eds) Advanced Computing and Systems for Security. Advances in Intelligent Systems and Computing, vol 396. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2653-6_1

  • DOI: https://doi.org/10.1007/978-81-322-2653-6_1

  • Publisher Name: Springer, New Delhi

  • Print ISBN: 978-81-322-2651-2

  • Online ISBN: 978-81-322-2653-6

  • eBook Packages: Engineering, Engineering (R0)
