Abstract
Topic extraction for books is important for building intelligent reading systems, question answering systems, and similar applications. Compared with topics in microblogs or in scientific and technical literature, the topics of a book are multi-themed, hierarchical, networked, and share information with one another, so extracting them is more complicated and difficult. This paper addresses problems that arise in intelligent reading systems, such as quickly locating the content relevant to an answer and retrieving across topics. The method starts from topic trees extracted from the chapters of a novel with the TF-IDF algorithm and uses the FP-Growth algorithm to mine association relationships among the topic words. These associations are then used to analyze the hidden relationships between topics, and an implicit hierarchical topic network (IHTN) of the novel text is proposed and constructed. The experimental results show that the method extracts the complete topic network of a novel text, effectively extracts the relationships between chapters, significantly reduces answer retrieval time in the question answering system, and improves the accuracy of the answers.
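The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the tokenized-chapter input, and the thresholds `k`, `min_support`, and `max_size` are all assumptions made for illustration. In particular, the frequent itemsets are found here by brute-force enumeration, which is a stand-in for FP-Growth and is only practical for small inputs.

```python
import math
from collections import Counter
from itertools import combinations


def top_tfidf_words(chapters, k=3):
    """Return up to k words with the highest positive TF-IDF score per chapter.

    `chapters` is a list of token lists (a hypothetical input format).
    """
    n = len(chapters)
    # Document frequency: number of chapters that contain each word.
    df = Counter(word for chapter in chapters for word in set(chapter))
    topics = []
    for chapter in chapters:
        tf = Counter(chapter)
        scores = {w: (c / len(chapter)) * math.log(n / df[w]) for w, c in tf.items()}
        # Drop words that occur in every chapter (score 0), then sort by
        # score descending, breaking ties alphabetically for a stable order.
        ranked = sorted((w for w in scores if scores[w] > 0),
                        key=lambda w: (-scores[w], w))
        topics.append(set(ranked[:k]))
    return topics


def frequent_itemsets(transactions, min_support=2, max_size=2):
    """Brute-force stand-in for FP-Growth: count every itemset of up to
    max_size items and keep those found in at least min_support transactions."""
    counts = Counter()
    for items in transactions:
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(items), size):
                counts[combo] += 1
    return {itemset: c for itemset, c in counts.items() if c >= min_support}


def chapter_links(topics, itemsets):
    """Link two chapters when both contain a frequent itemset of two or more words."""
    links = set()
    for itemset in itemsets:
        if len(itemset) < 2:
            continue
        holders = [i for i, words in enumerate(topics) if set(itemset) <= words]
        links.update(combinations(holders, 2))
    return links


if __name__ == "__main__":
    # Toy "novel" with three tokenized chapters (made-up data).
    chapters = [
        ["ship", "sea", "sea", "the"],
        ["ship", "sea", "storm", "the"],
        ["garden", "rose", "the", "rose"],
    ]
    topics = top_tfidf_words(chapters)
    itemsets = frequent_itemsets(topics)
    print(chapter_links(topics, itemsets))
```

Each chapter's topic-word set is treated as one transaction, so a word pair that recurs across chapters becomes an implicit link between them. A real FP-Growth implementation would return the same frequent itemsets without enumerating every combination.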
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Yu, W., Yi, M., Li, Z. (2019). Research on Constructing Technology of Implicit Hierarchical Topic Network Based on FP-Growth. In: Sun, X., Pan, Z., Bertino, E. (eds) Artificial Intelligence and Security. ICAIS 2019. Lecture Notes in Computer Science, vol 11632. Springer, Cham. https://doi.org/10.1007/978-3-030-24274-9_23
DOI: https://doi.org/10.1007/978-3-030-24274-9_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-24273-2
Online ISBN: 978-3-030-24274-9
eBook Packages: Computer Science, Computer Science (R0)