Joint Intent Detection Model for Task-oriented Human-Computer Dialogue System using Asynchronous Training

Published: 09 May 2023

Abstract

Accurately understanding low-resource languages is at the core of task-oriented human-computer dialogue systems. Language understanding consists of two sub-tasks, i.e., intent detection and slot filling. Intent detection still faces challenges due to semantic ambiguity and implicit intentions in users’ input. Moreover, modeling intent detection and slot filling separately significantly decreases the correctness and relevance between questions and answers. To address these issues, we propose a joint intent detection method using an asynchronous training strategy. The proposed method first encodes local text information extracted by a CNN together with the relationships among words emphasized by an attention structure. A joint intent detection model with an asynchronous training strategy is then proposed, which either fuses the hidden states of the intent detection and slot filling layers or adopts the key information to fine-tune the whole network, greatly increasing the relevance of the intent detection and slot filling subtasks. The accuracies achieved by the proposed method on an open-source airline travel dataset and a self-collected electricity service dataset, i.e., ATIS and ECSF, are 97.49% and 89.68%, respectively, which demonstrates the effectiveness of joint learning and asynchronous training.
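
To make the described architecture concrete, the sketch below (in PyTorch) shows one way a CNN branch for local text features and a self-attention branch for word-relationship features can feed shared hidden states into both an intent head and a slot-filling head, with the two losses updated on alternating steps as one possible reading of the asynchronous training strategy. All layer sizes, label counts, and the alternation scheme are illustrative assumptions, not the paper's exact configuration.

```python
# Illustrative sketch only: layer sizes, label counts, and the
# alternating update schedule are assumptions, not the paper's
# exact architecture or training procedure.
import torch
import torch.nn as nn

class JointNLU(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, num_filters=128,
                 num_intents=22, num_slots=120):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # CNN branch: local n-gram features of the utterance.
        self.conv = nn.Conv1d(embed_dim, num_filters, kernel_size=3, padding=1)
        # Self-attention branch: relationships among words.
        self.attn = nn.MultiheadAttention(embed_dim, num_heads=4, batch_first=True)
        hidden = num_filters + embed_dim
        # Both heads read the same fused hidden states, so the
        # intent and slot subtasks share information.
        self.slot_head = nn.Linear(hidden, num_slots)
        self.intent_head = nn.Linear(hidden, num_intents)

    def forward(self, tokens):
        x = self.embed(tokens)                                  # (B, T, E)
        local = torch.relu(self.conv(x.transpose(1, 2))).transpose(1, 2)
        ctx, _ = self.attn(x, x, x)                             # (B, T, E)
        fused = torch.cat([local, ctx], dim=-1)                 # shared states
        slot_logits = self.slot_head(fused)                     # per token
        intent_logits = self.intent_head(fused.mean(dim=1))     # per utterance
        return intent_logits, slot_logits

# One possible reading of "asynchronous training": update on the two
# losses in alternating steps instead of summing them each step.
model = JointNLU(vocab_size=10_000)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
ce = nn.CrossEntropyLoss()

tokens = torch.randint(0, 10_000, (4, 16))   # toy batch of 4 utterances
intent_gold = torch.randint(0, 22, (4,))
slot_gold = torch.randint(0, 120, (4, 16))

for step in range(4):
    intent_logits, slot_logits = model(tokens)
    if step % 2 == 0:   # even steps: intent loss
        loss = ce(intent_logits, intent_gold)
    else:               # odd steps: slot loss
        loss = ce(slot_logits.reshape(-1, 120), slot_gold.reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Because both heads read the same fused hidden states, gradients from either subtask reshape the representation used by the other, which is the mechanism the abstract credits for increasing the relevance between intent detection and slot filling.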



Published in

ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 22, Issue 5
May 2023, 653 pages
ISSN: 2375-4699
EISSN: 2375-4702
DOI: 10.1145/3596451

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 9 May 2023
      • Online AM: 22 August 2022
      • Accepted: 9 August 2022
      • Revised: 1 August 2022
      • Received: 31 December 2021
Published in TALLIP Volume 22, Issue 5
