Abstract
Speech-recognition technology was applied to the collection of agricultural-price information. In this paper, we propose a robust continuous Mandarin speech-recognition method suited to the environments in which agricultural product prices are collected. To mitigate the drop in recognition rate caused by the mismatch between training and real test conditions, we built acoustic models based on the Hidden Markov Model (HMM) and trained them on data collected in different environments. The results showed that triphone models outperformed monophone models, and that separate male and female HMMs both performed better than mixed male-and-female acoustic models. Although decision-tree clustering did not significantly improve the recognition rate, it clearly reduced the number of triphone models. Adding Gaussian mixture components improved the recognition rate but also increased the computational load. Cepstral mean normalization and cepstral variance normalization significantly improved system performance. In tests across different locations and different speakers, these methods improved recognition performance to varying degrees. The final recognition rates were 95.04% for male speakers and 97.62% for female speakers. The experimental results show that models trained with these methods achieve good recognition performance and that speech recognition can feasibly be applied to the collection of agricultural-price information. The approach also lays the foundation for the future development of an application system.
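The cepstral mean and variance normalization mentioned in the abstract can be illustrated with a minimal sketch. It assumes per-utterance statistics and plain lists of cepstral coefficients (for example MFCCs); the function name `cmvn`, the `eps` guard, and the toy data are hypothetical and are not taken from the paper, which may compute its statistics differently.

```python
from statistics import fmean, pstdev


def cmvn(frames, eps=1e-8):
    """Per-utterance cepstral mean and variance normalization (sketch).

    frames: list of feature vectors, one list of floats per frame.
    Returns a new list in which every coefficient has (approximately)
    zero mean and unit variance over the utterance.
    """
    if not frames:
        return []
    dims = list(zip(*frames))            # one tuple per cepstral coefficient
    means = [fmean(d) for d in dims]     # mean of each coefficient
    stds = [pstdev(d) for d in dims]     # population std of each coefficient
    return [
        # eps (an assumed guard) avoids division by zero for constant coefficients
        [(x - m) / (s + eps) for x, m, s in zip(frame, means, stds)]
        for frame in frames
    ]


# Toy "utterance": 4 frames, 2 coefficients (made-up values).
utterance = [[1.0, 10.0], [2.0, 12.0], [3.0, 14.0], [4.0, 16.0]]
normalized = cmvn(utterance)

# The same utterance with a constant offset per coefficient, mimicking a
# stationary channel or microphone difference between recording sites.
shifted = [[a + 5.0, b - 3.0] for a, b in utterance]
normalized_shifted = cmvn(shifted)
```

Because the mean is subtracted per coefficient, a constant offset such as the one in `shifted` is removed, which is one reason this kind of normalization tends to help when training and test recordings come from different environments.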
Acknowledgements
The research was funded by the National Natural Science Foundation of China (61271364), the High-level Talents Fund of Qingdao Agricultural University (663/1116022), and a Project of Shandong Province Higher Educational Science and Technology Program (J17KA154). The authors thank the speakers and volunteers who recorded sound files during the data-acquisition process.
Cite this article
Xu, J., Zhu, Y., Xu, P. et al. Agricultural price information acquisition using noise-robust Mandarin auto speech recognition. Int J Speech Technol 21, 681–688 (2018). https://doi.org/10.1007/s10772-018-9532-7
DOI: https://doi.org/10.1007/s10772-018-9532-7