Design and Development of Marathi Speech Interface System

Gaikwad, Santosh; Gawali, Bharti; Mehrotra, Suresh

doi:10.1007/978-81-322-2653-6_1

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 396)

Abstract

Speech is the most prominent and natural form of communication between humans, and it has the potential to become an important mode of interaction with computers. The man–machine interface has always been a challenging area in natural language processing and speech recognition research, and there is growing interest in developing machines that accept speech as input. People generally communicate with a computer through a mouse or keyboard, which requires training, effort, and knowledge of computers; this is a limitation for some users. Marathi is the official language of the Government of Maharashtra, and there is a need for systems that enable human–machine interaction in Indian regional languages. The objective of this research is to design and develop the Marathi Speech Activated Talking Calculator (MSAC) as an interface system. The MSAC is a speaker-dependent speech recognition system that performs basic mathematical operations. It recognizes isolated spoken numbers from 0 to 50 and basic commands such as addition, subtraction, multiplication, start, stop, equal, and exit. A database is an essential requirement for designing a speech recognition system; to meet the objectives, a database with a vocabulary size of 22,320 was developed. The MSAC system was trained and tested with each of the following feature extraction techniques individually: Mel Frequency Cepstral Coefficients (MFCC), Linear Discriminant Analysis (LDA), Principal Component Analysis (PCA), Linear Predictive Coding (LPC), and RASTA-PLP. It was also trained and tested individually with the fusion feature extraction techniques Mel Frequency Linear Discriminant Analysis (MFLDA), Mel Frequency Principal Component Analysis (MFPCA), Mel Frequency Discrete Wavelet Transformation (MFDWT), and Mel Frequency Linear Discrete Wavelet Transformation (MFLDWT). This work also proposes and tests the Wavelet Decomposed Cepstral Coefficient (WDCC) approach with 18, 36, and 54 coefficients. The performance of the MSAC system is measured in terms of accuracy and real-time factor (RTF). The experimental results show that MFCC with 39 coefficients achieved higher accuracy than the 13- and 26-coefficient variants. MFLDWT achieved higher accuracy than MFLDA, MFPCA, MFDWT, and Mel Frequency Principal Discrete Wavelet Transformation (MFPDWT). Based on this research, we recommend WDCC as a more robust and dynamic technique than MFCC, LDA, PCA, and LPC. The MSAC interface application can directly benefit people in their day-to-day activities.
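
The two evaluation measures named in the abstract, accuracy and real-time factor, can be illustrated with a minimal Python sketch. It assumes the usual definitions: accuracy as the percentage of correctly recognized utterances, and RTF as processing time divided by audio duration. The abstract does not state the exact formulas the chapter uses. The `Trial` record, its field names, the English placeholder words, and all numbers are hypothetical and are not data from the chapter.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """One recognition attempt (all field names are illustrative assumptions)."""
    expected: str          # word the speaker actually said
    recognized: str        # word the recognizer returned
    audio_seconds: float   # duration of the input utterance
    decode_seconds: float  # wall-clock time spent recognizing it


def accuracy(trials):
    """Percentage of trials whose recognized word matches the expected word."""
    if not trials:
        raise ValueError("no trials to score")
    correct = sum(1 for t in trials if t.recognized == t.expected)
    return 100.0 * correct / len(trials)


def real_time_factor(trials):
    """Total processing time divided by total audio duration.

    An RTF below 1 means the recognizer runs faster than real time.
    """
    total_audio = sum(t.audio_seconds for t in trials)
    if total_audio <= 0:
        raise ValueError("total audio duration must be positive")
    return sum(t.decode_seconds for t in trials) / total_audio


# Made-up trials for illustration only; these are not results from the chapter.
trials = [
    Trial("one", "one", 1.0, 0.25),
    Trial("two", "two", 1.0, 0.5),
    Trial("plus", "exit", 2.0, 0.25),
    Trial("equal", "equal", 1.0, 0.25),
]

print(f"accuracy = {accuracy(trials):.1f}%")
print(f"RTF = {real_time_factor(trials):.2f}")
```

Pooling time over all trials, rather than averaging a per-utterance RTF, keeps very short utterances from dominating the figure. Either convention is plausible here.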

Author information

Authors and Affiliations

System Communication Machine Learning Research Laboratory (SCM-RL), Department of Computer Science and Information Technology, Dr. Babasaheb Ambedkar Marathwada University, Aurangabad, Maharashtra, India
Santosh Gaikwad, Bharti Gawali & Suresh Mehrotra

Corresponding author

Correspondence to Santosh Gaikwad.

Editor information

Editors and Affiliations

A.K. Choudhury School of Information Technology, University of Calcutta, Kolkata, West Bengal, India
Rituparna Chaki
Computer Science, DAIS—Università Ca’ Foscari, Venice, Italy
Agostino Cortesi
Faculty of Computer Science, Bialystok University of Technology, Białystok, Poland
Khalid Saeed
Computer Science & Engineering, University of Calcutta, Kolkata, West Bengal, India
Nabendu Chaki

About this chapter

Cite this chapter

Gaikwad, S., Gawali, B., Mehrotra, S. (2016). Design and Development of Marathi Speech Interface System. In: Chaki, R., Cortesi, A., Saeed, K., Chaki, N. (eds) Advanced Computing and Systems for Security. Advances in Intelligent Systems and Computing, vol 396. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2653-6_1

DOI: https://doi.org/10.1007/978-81-322-2653-6_1
Published: 19 November 2015
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-2651-2
Online ISBN: 978-81-322-2653-6
eBook Packages: Engineering, Engineering (R0)
