Abstract
This paper presents a lightweight machine learning model and a fast conjunction matching method for identifying the user intents behind spoken text commands. The model and the method were integrated into a mobile virtual assistant for Vietnamese (VAV) to understand what mobile users want to carry out on their smartphones through their commands. A user intent, in the scope of our work, is an action associated with a particular mobile application. Given an input spoken command, its application is identified by an accurate classifier, while the action is determined by a flexible conjunction matching algorithm. Both the classifier and the conjunction matcher are compact enough to be stored and executed directly on mobile devices. To evaluate them, we annotated a medium-sized data set, conducted experiments under different settings, and achieved high accuracy for both application and action identification.
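The two-stage design described above (first identify the application, then determine the action by conjunction matching) can be illustrated with a minimal sketch. It is not the authors' implementation: the word-count scorer is a simple stand-in for the paper's classifier, and the training commands, the rule table, and all function names are hypothetical, with English commands used for readability.

```python
from collections import Counter

# Hypothetical training commands labelled with their target application.
TRAIN = [
    ("call mom now", "phone"),
    ("dial the office number", "phone"),
    ("send a message to an", "sms"),
    ("text my brother", "sms"),
    ("set an alarm for six", "clock"),
    ("wake me up at seven", "clock"),
]

# Hypothetical rule table: each action is a conjunction of term groups.
# An action matches when every group shares at least one term with the command.
RULES = {
    "phone": {"make_call": [{"call", "dial"}]},
    "sms": {"send_message": [{"send", "text"}]},
    "clock": {
        "create_alarm": [{"set", "wake"}, {"alarm", "up"}],
        "delete_alarm": [{"delete", "cancel"}, {"alarm"}],
    },
}


def train(examples):
    """Count how often each word occurs in the commands of each application."""
    model = {}
    for command, app in examples:
        model.setdefault(app, Counter()).update(command.lower().split())
    return model


def classify(model, command):
    """Stand-in classifier: pick the application with the largest word overlap."""
    tokens = command.lower().split()
    return max(model, key=lambda app: sum(model[app][t] for t in tokens))


def match_action(rules, app, command):
    """Return the first action whose term groups are all satisfied, else None."""
    tokens = set(command.lower().split())
    for action, groups in rules.get(app, {}).items():
        if all(tokens & group for group in groups):
            return action
    return None


def identify_intent(model, rules, command):
    """An intent is an (application, action) pair, as in the abstract."""
    app = classify(model, command)
    return app, match_action(rules, app, command)


if __name__ == "__main__":
    model = train(TRAIN)
    print(identify_intent(model, RULES, "please set an alarm for nine"))
```

Because both stages in this sketch reduce to dictionary lookups and set intersections, the stored model and rule table stay small, which is in the spirit of the on-device constraint the abstract mentions.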
Notes
- 1. Microsoft Skype Translator and AT&T Speech-to-Speech Translation.
- 2. Wit.ai: https://wit.ai.
- 3.
Acknowledgment
This work was supported by the project QG.15.29 from Vietnam National University, Hanoi (VNU).
Copyright information
© 2016 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ngo, TL. et al. (2016). Identifying User Intents in Vietnamese Spoken Language Commands and Its Application in Smart Mobile Voice Interaction. In: Nguyen, N.T., Trawiński, B., Fujita, H., Hong, TP. (eds) Intelligent Information and Database Systems. ACIIDS 2016. Lecture Notes in Computer Science, vol 9621. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-49381-6_19
DOI: https://doi.org/10.1007/978-3-662-49381-6_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-49380-9
Online ISBN: 978-3-662-49381-6
eBook Packages: Computer Science (R0)