
Topic Evolution Analysis Based on Optimized Combined Topic Model: Illustrated as CRISPR Technology

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13972))

Abstract

Identifying how topics related to advanced technologies evolve has become a strategic issue that affects industrial development worldwide. In this paper, drawing on multiple data sources, we propose a research framework that integrates topic modeling with a social network perspective to analyze topic evolution in a specific technology field. First, we improved the Combined Topic Model by introducing the BERT pre-trained model that performs best in the given field together with the Bayesian Optimization method, which achieved the best topic coherence obtained so far. We then used the resulting Optimized Combined Topic Model (OCTM) to identify topics. Second, we constructed a co-occurrence network for each time window, with topics as nodes, and calculated the co-occurrence coefficient of every topic pair. We then combined the co-occurrence coefficient between topics in the same time window with the similarity between topics in adjacent time windows to determine the type of topic evolution and identify the evolution paths. Third, we took node characteristics in the network, such as harmonic closeness centrality and weighted degree, weighted them with the Criteria Importance Through Intercriteria Correlation (CRITIC) method, and defined an importance index for each node in the undirected weighted network. Finally, we selected the critical topic evolution paths for detailed analysis according to node importance. We chose CRISPR technology as the empirical research field to preliminarily verify the operability and rationality of the method.
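
The CRITIC weighting step can be illustrated with a minimal sketch. It follows the standard CRITIC formulation (min-max normalization, then weights proportional to each criterion's standard deviation times its summed conflict, 1 minus the Pearson correlation, with the other criteria). The abstract does not give the exact formulas used in the paper, so the function names, the two criteria, and the sample values below are hypothetical and serve only as an illustration.

```python
from statistics import mean, pstdev


def min_max(column):
    """Rescale one criterion to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either series has no variance."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def critic_weights(columns):
    """Standard CRITIC: weight_j is proportional to sigma_j * sum_k (1 - r_jk)."""
    norm = [min_max(c) for c in columns]
    info = []
    for j, cj in enumerate(norm):
        conflict = sum(
            1.0 - pearson(cj, ck) for k, ck in enumerate(norm) if k != j
        )
        info.append(pstdev(cj) * conflict)
    total = sum(info)
    if total == 0:
        # Degenerate case (no contrast or no conflict): fall back to equal weights.
        return [1.0 / len(columns)] * len(columns)
    return [c / total for c in info]


def importance(columns, weights):
    """Weighted sum of the normalized criteria for every node."""
    norm = [min_max(c) for c in columns]
    n_nodes = len(columns[0])
    return [
        sum(w * col[i] for w, col in zip(weights, norm)) for i in range(n_nodes)
    ]


# Hypothetical values for four topic nodes (not taken from the paper).
harmonic_closeness = [0.9, 0.5, 0.7, 0.2]
weighted_degree = [12.0, 3.0, 9.0, 4.0]

criteria = [harmonic_closeness, weighted_degree]
weights = critic_weights(criteria)
scores = importance(criteria, weights)
```

Here `scores` holds one importance value per topic node; in a pipeline like the one described, such values would be used to rank nodes before selecting evolution paths for closer analysis.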





Corresponding author

Correspondence to Ying Huang.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Zhang, Y., Xu, S., Yang, Y., Huang, Y. (2023). Topic Evolution Analysis Based on Optimized Combined Topic Model: Illustrated as CRISPR Technology. In: Sserwanga, I., et al. Information for a Better World: Normality, Virtuality, Physicality, Inclusivity. iConference 2023. Lecture Notes in Computer Science, vol 13972. Springer, Cham. https://doi.org/10.1007/978-3-031-28032-0_4


  • DOI: https://doi.org/10.1007/978-3-031-28032-0_4


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-28031-3

  • Online ISBN: 978-3-031-28032-0

  • eBook Packages: Computer Science, Computer Science (R0)
