Abstract:
With the rapid development of the Internet and mobile devices, users produce a vast number of short texts, which pose great challenges to topic modeling because of their severe context sparsity. Traditional topic models perform poorly on short texts because such texts lack word co-occurrence patterns. The bi-term topic model (BTM) is an effective approach that models word co-occurrence directly over the whole corpus and performs better than conventional topic models. However, BTM considers only the frequency of bi-terms and ignores the latent semantic information between them, so semantically similar words run a high risk of being grouped under different topics. In this paper, we propose a latent semantic augmented bi-term topic model (LS-BTM), which incorporates semantic information as prior knowledge to infer topics more reasonably. Experimental results show that our model outperforms other short text topic models on real-world data.
Published in: 2018 5th IEEE International Conference on Cloud Computing and Intelligence Systems (CCIS)
Date of Conference: 23-25 November 2018
Date Added to IEEE Xplore: 14 April 2019