Multimodal Dialog System: Relational Graph-based Context-aware Question Understanding

Authors:

Zan Gao,

Liqiang Nie

MM '21: Proceedings of the 29th ACM International Conference on Multimedia

Pages 695 - 703

https://doi.org/10.1145/3474085.3475234

Published: 17 October 2021

Abstract

Multimodal dialog systems have attracted increasing attention from both academia and industry in recent years. Although existing methods have made some progress, they still struggle with question understanding (i.e., comprehending user intentions). In this paper, we present a relational graph-based context-aware question understanding scheme, which enhances user intention comprehension from the local level to the global level. Specifically, we first use multiple attribute matrices as guidance information to extract the product-related keywords from each textual sentence, strengthening the local representation of user intentions. We then design a sparse graph attention network that adaptively aggregates the useful context information for each utterance, capturing user intentions from a global perspective. Extensive experiments on a benchmark dataset show that our model outperforms several state-of-the-art baselines.
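To make the idea of sparse attention over dialog context more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each utterance is already encoded as a fixed-length feature vector, that edges are scored with a scaled dot product, and that sparsity is obtained by keeping only the `top_k` highest-scoring neighbours of each utterance. The function name `sparse_graph_attention`, the `top_k` parameter, and the scoring rule are all illustrative assumptions; the abstract does not specify how the paper scores or prunes edges.

```python
import math


def sparse_graph_attention(utterances, top_k=2):
    """Aggregate context for each utterance from its top_k neighbours.

    Hypothetical sketch: the scoring function and the top-k pruning rule
    are assumptions, not details taken from the paper.

    utterances: list of equal-length feature vectors, one per dialog turn.
    Returns (contexts, weights); weights[i][j] is the attention i pays to j.
    """
    n = len(utterances)
    dim = len(utterances[0])
    contexts, weights = [], []
    for i in range(n):
        # Scaled dot-product score between utterance i and every other one.
        scores = {
            j: sum(a * b for a, b in zip(utterances[i], utterances[j]))
            / math.sqrt(dim)
            for j in range(n)
            if j != i
        }
        # Sparsify: keep only the top_k highest-scoring neighbours.
        kept = sorted(scores, key=scores.get, reverse=True)[:top_k]
        if not kept:
            # A single-utterance dialog has no context to aggregate.
            weights.append([0.0] * n)
            contexts.append([0.0] * dim)
            continue
        # Softmax over the kept neighbours only; pruned edges get weight 0.
        peak = max(scores[j] for j in kept)
        exp = {j: math.exp(scores[j] - peak) for j in kept}
        total = sum(exp.values())
        row = [exp[j] / total if j in exp else 0.0 for j in range(n)]
        # Context vector: attention-weighted sum of neighbour features.
        ctx = [sum(row[j] * utterances[j][d] for j in range(n)) for d in range(dim)]
        weights.append(row)
        contexts.append(ctx)
    return contexts, weights
```

Calling `sparse_graph_attention(dialog, top_k=1)` on a four-turn dialog makes each utterance attend to its single most similar turn. In a trained model the scores would come from learned projections rather than raw dot products, so this sketch only illustrates the pruning-then-normalizing pattern.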

Supplementary Material

MP4 File (MM21-fp0411.mp4)

Multimodal dialog systems have attracted increasing research interest due to their significance in retail, travel, and other domains. Although existing methods have made some progress, they still struggle with user intention comprehension. To this end, we present a relational graph-based context-aware question understanding scheme, which enhances user intention comprehension from the local level to the global level. Concretely, we use multiple attribute matrices as guidance information to extract the product-related keywords from each textual sentence, strengthening the local representation of user intentions. In addition, we design a sparse graph attention network that adaptively aggregates the useful context information for each utterance, capturing user intentions from a global perspective. Extensive experiments on a benchmark dataset show that our model outperforms several state-of-the-art baselines.
