Citation context-based topic models: discovering cited and citing topics from full text

Lixue Zou (National Science Library, Chinese Academy of Sciences, Beijing, China) (University of Chinese Academy of Sciences, Beijing, China)
Xiwen Liu (National Science Library, Chinese Academy of Sciences, Beijing, China) (University of Chinese Academy of Sciences, Beijing, China)
Wray Buntine (Faculty of Information Technology, Monash University, Melbourne, Australia)
Yanli Liu (National Science Library, Chinese Academy of Sciences, Beijing, China) (University of Chinese Academy of Sciences, Beijing, China)

Library Hi Tech

ISSN: 0737-8831

Article publication date: 4 June 2021

Issue publication date: 30 November 2021

Abstract

Purpose

The full text of a document is a rich source of information from which meaningful topics can be derived. The purpose of this paper is to demonstrate how citation context (CC) in the full text can be used, together with automatic text analysis algorithms, to identify cited topics and citing topics efficiently and effectively.

Design/methodology/approach

The authors present two novel topic models, Citation-Context-LDA (CC-LDA) and Citation-Context-Reference-LDA (CCRef-LDA). CC is leveraged to extract the citing text from the full text, which makes it possible to discover topics accurately. CC-LDA incorporates CC, citing text and their latent relationship, while CCRef-LDA incorporates CC, citing text, their latent relationship and the reference information in CC. Collapsed Gibbs sampling is used for approximate inference. The capacity of CC-LDA to learn cited topics and citing topics simultaneously, together with the links between them, is investigated. Moreover, a method for measuring topic influence based on CC-LDA is proposed and applied to create links between the two levels of topics. In addition, the capacity of CCRef-LDA to discover topic-influential references is investigated.
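To illustrate the inference technique named above, here is a minimal sketch of collapsed Gibbs sampling for plain LDA. It is not the authors' CC-LDA or CCRef-LDA sampler, whose update equations are not given in this abstract; the function name `gibbs_lda`, the hyperparameter defaults and the integer word-id input format are all assumptions made for illustration.

```python
import random


def gibbs_lda(docs, n_topics, vocab_size, alpha=0.1, beta=0.01,
              n_iters=200, seed=0):
    """Collapsed Gibbs sampling for standard LDA (illustrative only).

    docs: list of documents, each a list of integer word ids
          in the range [0, vocab_size).
    Returns (n_dk, n_kw): document-topic and topic-word count tables.
    """
    rng = random.Random(seed)
    n_dk = [[0] * n_topics for _ in docs]                # doc d, topic k
    n_kw = [[0] * vocab_size for _ in range(n_topics)]   # topic k, word w
    n_k = [0] * n_topics                                 # tokens per topic
    z = []                                               # topic of each token

    # Assign every token a random topic and fill the count tables.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][w] += 1
            n_k[k] += 1
        z.append(z_d)

    topics = range(n_topics)
    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = z[d][i]
                n_dk[d][k] -= 1
                n_kw[k][w] -= 1
                n_k[k] -= 1

                # Full conditional, up to a constant:
                # (n_dk + alpha) * (n_kw + beta) / (n_k + V * beta)
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][w] + beta)
                    / (n_k[t] + vocab_size * beta)
                    for t in topics
                ]

                # Resample the topic and restore the counts.
                k = rng.choices(topics, weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][w] += 1
                n_k[k] += 1
    return n_dk, n_kw
```

The sampler never represents the topic and document distributions explicitly; they are integrated out, and point estimates can be recovered afterwards by normalising the smoothed counts, for example `(n_kw[k][w] + beta) / (n_k + vocab_size * beta)` for the topic-word distribution.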

Findings

The results indicate that CC-LDA and CCRef-LDA achieve improved or comparable performance in terms of both perplexity and symmetric Kullback–Leibler (sKL) divergence. Moreover, CC-LDA is effective in discovering the cited topics and citing topics together with their topic influence, and CCRef-LDA is able to find the references that are influential for each cited topic.
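The two evaluation measures named above can be sketched as follows, under stated assumptions. Here sKL is taken as the sum KL(p||q) + KL(q||p) (some authors use half of this sum), and perplexity is computed from point estimates of the document-topic matrix `theta` and the topic-word matrix `phi`. The function names and the exact evaluation protocol are hypothetical; the abstract does not specify how the authors computed either measure.

```python
import math


def symmetric_kl(p, q, eps=1e-12):
    """Symmetric KL divergence between two discrete distributions.

    Uses the identity KL(p||q) + KL(q||p) = sum_i (p_i - q_i) * log(p_i / q_i).
    Probabilities are clipped at eps to avoid log(0).
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for p_i, q_i in zip(p, q):
        p_i = max(p_i, eps)
        q_i = max(q_i, eps)
        total += (p_i - q_i) * math.log(p_i / q_i)
    return total


def perplexity(docs, theta, phi):
    """Perplexity of a corpus under a fitted topic model.

    docs:  list of documents, each a list of integer word ids.
    theta: theta[d][k] = p(topic k | document d).
    phi:   phi[k][w]   = p(word w | topic k).
    Returns exp(-(1/N) * sum of log p(w | d)) over all N tokens.
    """
    log_lik = 0.0
    n_tokens = 0
    for d, doc in enumerate(docs):
        for w in doc:
            p_w = sum(theta[d][k] * phi[k][w] for k in range(len(phi)))
            log_lik += math.log(p_w)
            n_tokens += 1
    return math.exp(-log_lik / n_tokens)
```

Lower perplexity indicates that the model assigns higher probability to the evaluated text. How sKL should be read depends on which pair of distributions is compared, which this abstract does not state.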

Originality/value

The automatic method provides novel knowledge for the discovery of cited topics and citing topics. The topic influence learnt by the model can link the two levels of topics and create a semantic topic network. The method can also use topic specificity as a feature to rank references.

Citation

Zou, L., Liu, X., Buntine, W. and Liu, Y. (2021), "Citation context-based topic models: discovering cited and citing topics from full text", Library Hi Tech, Vol. 39 No. 4, pp. 1063-1083. https://doi.org/10.1108/LHT-01-2021-0041

Publisher: Emerald Publishing Limited

Copyright © 2021, Emerald Publishing Limited
