ABSTRACT
The core problem for a human-machine dialogue system in a localized command-and-control setting is to understand the information in the dialogue and then produce a reasonable response that satisfies the commander, or that assists the commander in completing a task. Faced with large volumes of interactive dialogue and other short-text data, the system must interpret them efficiently and accurately. Everyday interactive dialogues follow certain themes, so mining the latent topic information is an effective way to capture the intrinsic characteristics of a dialogue. However, the turns in a commander's daily dialogue are short, highly random, and strongly colloquial; their topics are intertwined and their organization is more chaotic than that of news and similar texts. As a result, traditional topic models struggle to capture the word-document co-occurrence patterns hidden in such texts and cannot be applied to them directly. This article therefore constructs "pseudo-long documents" to address the short-text problem: because the topic information within a group of dialogue turns is relatively similar, word co-occurrence within a dialogue is more representative, and using pseudo-long documents to build the LDA training corpus improves, to a certain extent, the topic model's ability to capture document-word co-occurrence patterns. In addition, this paper proposes an index that combines perplexity with topic similarity to determine the optimal number of topics. Based on these two contributions, an interactive dialogue topic model is constructed.
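The two ideas in the abstract can be sketched in a few lines of Python. The sketch below is illustrative only: the function names are invented, and the way perplexity and topic similarity are combined into one index is an assumption, since the abstract does not give the paper's exact formula. It shows (a) merging the turns of one dialogue into a single "pseudo-long document" so LDA sees richer word co-occurrence, and (b) one plausible selection criterion in which both perplexity and mean inter-topic similarity should be small.

```python
import math

def build_pseudo_long_documents(dialogues):
    """Concatenate the turns of each dialogue into one document.

    `dialogues` is a list of dialogues; each dialogue is a list of
    tokenized short utterances. Because the turns of one dialogue share
    a topic, merging them yields longer documents with more reliable
    word co-occurrence statistics for LDA training.
    """
    return [[tok for turn in dialogue for tok in turn]
            for dialogue in dialogues]

def mean_topic_similarity(topic_word_dists):
    """Average pairwise cosine similarity between topic-word distributions.

    Lower similarity means the learned topics are more distinct.
    """
    def cosine(p, q):
        dot = sum(a * b for a, b in zip(p, q))
        norm_p = math.sqrt(sum(a * a for a in p))
        norm_q = math.sqrt(sum(b * b for b in q))
        return dot / (norm_p * norm_q)

    k = len(topic_word_dists)
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    return sum(cosine(topic_word_dists[i], topic_word_dists[j])
               for i, j in pairs) / len(pairs)

def selection_index(perplexity, similarity):
    """Hypothetical combined criterion: smaller is better, since a good
    topic count should yield both low perplexity and low inter-topic
    similarity. The product is one simple combination, not necessarily
    the paper's formula."""
    return perplexity * similarity
```

For example, two three-turn dialogues become two pseudo-long documents, and the number of topics would be chosen by training LDA for each candidate count, computing the index, and keeping the count with the smallest value.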