Abstract:
Multi-intent spoken language understanding (SLU), which handles utterances containing multiple intents, is more practical and has attracted increasing attention. However, existing state-of-the-art models perform intent detection at either too coarse a granularity (utterance level) or too fine a granularity (token level), and may therefore fail to recognize the intent transition point and the correct intents in an utterance. In this paper, we propose a Chunk-Level Intent Detection (CLID) framework, in which we introduce a sliding window-based self-attention (SWSA) scheme for regional chunk intent detection. Based on SWSA, an auxiliary task is introduced to identify the intent transition point in an utterance and thereby obtain sub-utterances that each carry a single intent. The intent of each sub-utterance is then predicted by assembling the intent predictions of the chunks (obtained in a sliding-window manner) within it. We conduct experiments on two public datasets, MixATIS and MixSNIPS, and the results show that our model achieves state-of-the-art performance.
Published in: IEEE Signal Processing Letters (Volume: 29)
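To make the core SWSA idea concrete, below is a minimal sketch of sliding window-based self-attention, in which each token attends only to a local chunk of its neighbors. This is an illustration under assumed details: the window size, tensor shapes, and the function name `swsa` are hypothetical and not the paper's actual architecture or hyperparameters.

```python
# A minimal sketch of sliding window-based self-attention (SWSA) masking,
# assuming single-head attention in PyTorch. Window size, dimensions, and
# names are illustrative, not taken from the paper.
import torch
import torch.nn.functional as F

def swsa(hidden: torch.Tensor, window: int = 2) -> torch.Tensor:
    """Self-attention in which each token attends only to tokens within
    `window` positions on either side, i.e., its local chunk."""
    seq_len, dim = hidden.shape
    # Band mask: position i may attend to position j only if |i - j| <= window.
    idx = torch.arange(seq_len)
    band = (idx[None, :] - idx[:, None]).abs() <= window
    scores = hidden @ hidden.T / dim ** 0.5             # raw attention scores
    scores = scores.masked_fill(~band, float("-inf"))   # restrict to the chunk
    return F.softmax(scores, dim=-1) @ hidden           # chunk-level context

# Example: 10 token encodings of dimension 16.
tokens = torch.randn(10, 16)
chunk_repr = swsa(tokens, window=2)
print(chunk_repr.shape)  # torch.Size([10, 16])
```

Because each row of the attention matrix is confined to a local band, every output vector summarizes only one regional chunk; per-chunk intent predictions over such representations could then be assembled into sub-utterance intents, in the spirit of the framework the abstract describes.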