Evolution analysis of online topics based on ‘word-topic’ coupling network

Scientometrics

Abstract

Analyzing topic evolution is an effective way to monitor how topics spread. Existing methods focus either on how topic intensity evolves along a timeline or on the topic evolution paths of technical literature. In this paper, we study topic evolution from a micro perspective, which not only captures the topic timeline but also reveals topic status and the directed evolutionary paths among topics. First, we construct a word network from the co-occurrence relationships between feature words. Second, we use the Latent Dirichlet Allocation (LDA) model to automatically extract topics and capture the mapping between words and topics, and then build a ‘word-topic’ coupling network. Third, based on the ‘word-topic’ coupling network, we describe how topic intensity evolves over time and measure topic status by considering the contribution of feature words to each topic. We also propose the concept of topic drifting probability to identify evolutionary paths. Experiments on two real-world “COVID-19” data sets demonstrate the effectiveness of the proposed method.
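
The first two steps named in the abstract (a word co-occurrence network, then a ‘word-topic’ coupling network) can be illustrated with a minimal sketch. It assumes documents are already tokenized into feature words, uses document-level co-occurrence counts as edge weights, and takes a hand-written topic-to-word probability table in place of a fitted LDA model. The function names, the `min_prob` cutoff, and the toy data are illustrative assumptions and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def build_word_network(docs):
    """Undirected co-occurrence network.

    Edge weight = number of documents in which both words appear
    (an assumed weighting, not necessarily the paper's).
    """
    edges = Counter()
    for tokens in docs:
        # Sort the unique words so each pair has one canonical orientation.
        for u, v in combinations(sorted(set(tokens)), 2):
            edges[(u, v)] += 1
    return edges


def build_coupling_network(word_edges, topic_word_probs, min_prob=0.1):
    """Attach topic nodes to the word network.

    topic_word_probs maps topic -> {word: probability}; in practice this
    would come from a fitted LDA model. min_prob is a hypothetical cutoff.
    """
    words = {w for pair in word_edges for w in pair}
    coupling = {}
    for topic, probs in topic_word_probs.items():
        for word, p in probs.items():
            # Keep only words present in the network with enough weight.
            if word in words and p >= min_prob:
                coupling[(word, topic)] = p
    return coupling


# Toy data (hypothetical, for illustration only).
docs = [
    ["virus", "vaccine", "trial"],
    ["virus", "lockdown"],
    ["vaccine", "trial", "virus"],
]
topic_word_probs = {
    "T1": {"virus": 0.5, "vaccine": 0.3, "trial": 0.2},
    "T2": {"virus": 0.4, "lockdown": 0.55, "mask": 0.05},
}

word_edges = build_word_network(docs)
coupling = build_coupling_network(word_edges, topic_word_probs)
```

Here `word_edges` holds the word-word layer and `coupling` holds the word-topic layer; together they form a two-layer structure of the kind the abstract describes. How the paper weights edges, measures topic status, or defines topic drifting probability is not stated in this preview, so none of that is modeled here.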

Acknowledgements

This work is supported by the National Natural Science Foundation of China [grant numbers 71874088, 71704085]; the Cultivation Base of Excellent Innovation Team in Philosophy & Social Sciences in Jiangsu Universities [grant number 2017ZSTD022]; and the Postgraduate Research & Practice Innovation Program of Jiangsu Province [grant number KYCX20_0840].

Author information

Corresponding author

Correspondence to Li Qian.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

About this article

Cite this article

Zhu, H., Qian, L., Qin, W. et al. Evolution analysis of online topics based on ‘word-topic’ coupling network. Scientometrics 127, 3767–3792 (2022). https://doi.org/10.1007/s11192-022-04439-x

  • DOI: https://doi.org/10.1007/s11192-022-04439-x
