Abstract
This paper proposes a novel entitymetrics approach that focuses exclusively on citation sentences. Because citation sentences reflect authors’ research interests, the knowledge entities that appear in them can be considered key entities. To characterize these key entities, we analyze citation sentences extracted from full-text research articles collected from PubMed Central. We used “opioid” as the search query because it is an actively studied domain, which means that ample knowledge entities and entity pairs are available for examination. We then construct two novel citation sentence-based networks: the Direct Citation Sentence (DCS) network and the Indirect Citation Sentence (ICS) network. The DCS network is built on direct entity pairs captured within citation sentences. The ICS network, by contrast, uses indirect entity cooccurrences based on cited author information and section information. To build it, we propose a multi-anchor bipartite network, the [author/section]-entity bipartite network, in which cited author information and section headings serve as multiple anchors linked to bio-entity nodes. To demonstrate the usefulness of the DCS and ICS networks, we construct a conventional full-text network for comparison. In addition, the MeSH tree structure is used to examine bio-entity-level characteristics. The results show that the DCS and ICS networks exhibit distinct network characteristics and surface top-ranked bio-entity pairs that the traditional method does not reveal. This indicates that our method can broaden the base of entitymetrics and provide new insights for entity-level bibliometric analysis.
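
The two network constructions summarized in the abstract can be illustrated with a small sketch. It is not the authors' pipeline: the record layout, the function names `dcs_edges` and `ics_edges`, and the toy data are hypothetical, and treating each (cited author, section heading) pair as a single anchor is an assumption about how the multi-anchor is formed.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical toy records, one per citation sentence.
sentences = [
    {"cited_author": "Smith", "section": "Introduction",
     "entities": {"morphine", "fentanyl"}},
    {"cited_author": "Smith", "section": "Introduction",
     "entities": {"naloxone"}},
    {"cited_author": "Lee", "section": "Discussion",
     "entities": {"morphine", "naloxone"}},
]


def dcs_edges(records):
    """Direct pairs: entities that co-occur inside one citation sentence."""
    edges = Counter()
    for rec in records:
        for a, b in combinations(sorted(rec["entities"]), 2):
            edges[(a, b)] += 1
    return edges


def ics_edges(records):
    """Indirect pairs: entities that share an (author, section) anchor.

    Builds the anchor-entity bipartite mapping first, then projects it
    onto the entity side (assumed anchor definition, see lead-in).
    """
    anchor_to_entities = defaultdict(set)
    for rec in records:
        anchor = (rec["cited_author"], rec["section"])
        anchor_to_entities[anchor] |= rec["entities"]

    edges = Counter()
    for entities in anchor_to_entities.values():
        for a, b in combinations(sorted(entities), 2):
            edges[(a, b)] += 1
    return edges


dcs = dcs_edges(sentences)
ics = ics_edges(sentences)
```

In this toy data, fentanyl and naloxone never share a sentence, so the pair is absent from the direct network, but both are linked to the same anchor and therefore appear as an indirect pair. Whether pairs that already co-occur directly are also counted in the ICS network is not stated in the abstract; the sketch keeps them.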



Notes
https://www.springer.com/journal/11192/submission-guidelines#Instructions%20for%20Authors_Title%20Page, last accessed 15 January 2023.
https://www.nlm.nih.gov/mesh/intro_trees.html, last accessed 15 January 2023.
The MeSH tree structure is a forest (composed of 16 single-category trees) rather than a single tree. To calculate the distance (branch count) between MeSH descriptors belonging to different categories, we assumed that the 16 root tree nodes were all connected to each other. We acknowledge that this approach has limits, since some root tree nodes are not closely related to one another (e.g., “Information Science” and “Diseases”). However, all the MeSH terms that matched our top-20 bio-entity pair list fell within eight root tree nodes (i.e., “Anatomy”, “Organisms”, “Diseases”, “Chemicals and Drugs”, “Analytical, Diagnostic and Therapeutic Techniques, and Equipment”, “Psychiatry and Psychology”, “Phenomena and Processes”, “Health Care”) that are closely related to each other.
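
The branch-count distance described in this note can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the parsing of dotted tree numbers, and the example tree numbers are assumptions, and the cross-category rule (climb to each category root, then cross one root-to-root edge) is one reading of the statement that the 16 root tree nodes are all connected.

```python
def path_to_root(tree_number):
    """List the nodes from the category root down to `tree_number`.

    Example: 'D03.132' -> ['D', 'D03', 'D03.132'], where the leading
    letter is taken to be the category root.
    """
    parts = tree_number.split(".")
    nodes = [tree_number[0]]  # category root, e.g. 'D'
    for i in range(1, len(parts) + 1):
        nodes.append(".".join(parts[:i]))
    return nodes


def branch_distance(a, b):
    """Number of branches on the path between two tree numbers."""
    pa, pb = path_to_root(a), path_to_root(b)

    if pa[0] != pb[0]:
        # Different categories: climb to each root, then cross the
        # assumed edge that connects the two category roots.
        return (len(pa) - 1) + (len(pb) - 1) + 1

    # Same category: count the steps up to and down from the deepest
    # common ancestor.
    common = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        common += 1
    return (len(pa) - common) + (len(pb) - common)


# Illustrative tree numbers only.
same_category = branch_distance("D03.132", "D03.633")
cross_category = branch_distance("D03.132", "C10.228")
```

How distances are aggregated when a descriptor has more than one tree number is not specified in this note, so the sketch handles a single pair of tree numbers.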
References
Abdelrahman, A. M. B., & Kayed, A. (2015). A survey on semantic similarity measures between concepts in health domain. American Journal of Computational Mathematics, 5(2), 204–214. https://doi.org/10.4236/ajcm.2015.52017
Amplayo, R. K., & Song, M. (2016). Building content-driven entity networks for scarce scientific literature using content information. In Proceedings of the Fifth Workshop on Building and Evaluating Resources for Biomedical Text Mining (BioTxtM2016) (pp. 20–29). https://aclanthology.org/W16-5103
An, J., Kim, N., Kan, M., Chandrasekaran, M. K., & Song, M. (2017). Exploring characteristics of highly cited authors according to citation location and content. Journal of the Association for Information Science and Technology, 68(8), 1975–1988. https://doi.org/10.1002/asi.23834
Blondel, V. D., Guillaume, J., Lambiotte, R., & Lefebvre, E. (2008). Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10), P10008. https://doi.org/10.1088/1742-5468/2008/10/P10008
Cheng, Q. K., Wang, J. M., Lu, W., Huang, Y., & Bu, Y. (2020). Keyword-citation-keyword network: A new perspective of discipline knowledge structure analysis. Scientometrics, 124(3), 1923–1943. https://doi.org/10.1007/s11192-020-03576-5
Compton, W. M., Jones, C. M., & Baldwin, G. T. (2016). Relationship between nonmedical prescription-opioid use and heroin use. New England Journal of Medicine, 374(2), 154–163. https://doi.org/10.1056/NEJMra1508490
Corral, Á., Boleda, G., & Ferrer-i-Cancho, R. (2015). Zipf’s law for word frequencies: word forms versus lemmas in long texts. PLoS ONE, 10(7), e0129031. https://doi.org/10.1371/journal.pone.0129031
Davis, A. P., Wiegers, T. C., Roberts, P. M., King, B. L., Lay, J. M., Lennon-Hopkins, K., Sciaky, D., Johnson, R., Keating, H., Greene, N., Hernandez, R., McConnell, K. J., Enayetallah, A. E., & Mattingly, C. J. (2013). A CTD-Pfizer collaboration: Manual curation of 88,000 scientific articles text mined for drug-disease and drug-phenotype interactions. Database: the Journal of Biological Databases and Curation. https://doi.org/10.1093/database/bat080
Ding, Y., Song, M., Han, J., Yu, Q., Yan, E., Lin, L., & Chambers, T. (2013). Entitymetrics: Measuring the impact of entities. PLoS ONE, 8(8), e71416. https://doi.org/10.1371/journal.pone.0071416
Duck, G., Nenadic, G., Filannino, M., Brass, A., Robertson, D. L., & Stevens, R. (2016). A Survey of bioinformatics database and software usage through mining the literature. PLoS ONE, 11(6), e0157989. https://doi.org/10.1371/journal.pone.0157989
Enten, G., Shenouda, M. A., Samuels, D., Fowler, N., Balouch, M., & Camporesi, E. (2019). A retrospective analysis of the safety and efficacy of opioid-free anesthesia versus opioid anesthesia for general cesarean section. Cureus, 11(9), e5725. https://doi.org/10.7759/cureus.5725
Fields, H. L. (2011). The doctor’s dilemma: Opiate analgesics and chronic pain. Neuron, 69(4), 591–594. https://doi.org/10.1016/j.neuron.2011.02.001
Flemming, K. (2010). The use of morphine to treat cancer-related pain: A synthesis of quantitative and qualitative research. Journal of Pain and Symptom Management, 39(1), 139–154. https://doi.org/10.1016/j.jpainsymman.2009.05.014
Gomes, T., Juurlink, D. N., Antoniou, T., Mamdani, M. M., Paterson, J. M., & van den Brink, W. (2017). Gabapentin, opioids, and the risk of opioid-related death: A population-based nested case–control study. PLoS Medicine. https://doi.org/10.1371/journal.pmed.1002396
Hu, Z., Chen, C., & Liu, Z. (2013). Where are citations located in the body of scientific articles? A study of distributions of citation locations. Journal of Informetrics, 7(4), 887–896. https://doi.org/10.1016/j.joi.2013.08.005
Ibrahim, B. (2021). Statistical methods used in Arabic journals of library and information science. Scientometrics, 126(5), 4383–4416. https://doi.org/10.1007/s11192-021-03913-2
Jensen, T. S., & Finnerup, N. B. (2014). Allodynia and hyperalgesia in neuropathic pain: Clinical manifestations and mechanisms. The Lancet Neurology, 13(9), 924–935. https://doi.org/10.1016/S1474-4422(14)70102-4
Jeong, Y. K., Song, M., & Ding, Y. (2014). Content-based author co-citation analysis. Journal of Informetrics, 8(1), 197–211. https://doi.org/10.1016/j.joi.2013.12.001
Jeong, Y. K., Xie, Q., Yan, E., & Song, M. (2020). Examining drug and side effect relation using author–entity pair bipartite networks. Journal of Informetrics, 14(1), 100999. https://doi.org/10.1016/j.joi.2019.100999
Kim, H. J., An, J., Jeong, Y. K., & Song, M. (2016a). Exploring the leading authors and journals in major topics by citation sentences and topic modeling. Proceedings of the Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL), (pp. 42–50). https://aclanthology.org/W16-1506
Kim, H. J., Jeong, Y. K., & Song, M. (2016b). Content- and proximity-based author co-citation analysis using citation sentences. Journal of Informetrics, 10(4), 954–966. https://doi.org/10.1016/j.joi.2016.07.007
Kolodny, A., Courtwright, D. T., Hwang, C. S., Kreiner, P., Eadie, J. L., Clark, T. W., & Alexander, G. C. (2015). The prescription opioid and heroin crisis: A public health approach to an epidemic of addiction. Annual Review of Public Health, 36, 559–574. https://doi.org/10.1146/annurev-publhealth-031914-122957
Lee, M., Silverman, S., Hansen, H., Patel, V., & Manchikanti, L. (2011). A comprehensive review of opioid-induced hyperalgesia. Pain Physician, 14(2), 145–161.
Li, X., Ding, Y., & Lu, W. (2020a). Using entity metrics to understand drug repurposing. AMIA Joint Summits on Translational Science Proceedings, 2020, 377–382.
Li, X., Rousseau, J. F., Ding, Y., Song, M., & Lu, W. (2020b). Understanding drug repurposing from the perspective of biomedical entities and their evolution: Bibliographic research using aspirin. JMIR Medical Informatics, 8(6), e16739. https://doi.org/10.2196/16739
Lu, W., Huang, Y., Bu, Y., & Cheng, Q. (2018). Functional structure identification of scientific documents in computer science. Scientometrics, 115(1), 463–486. https://doi.org/10.1007/s11192-018-2640-y
Lv, Y., Ding, Y., Song, M., & Duan, Z. (2018). Topology-driven trend analysis for drug discovery. Journal of Informetrics, 12(3), 893–905. https://doi.org/10.1016/j.joi.2018.07.007
Manandhar, P., Murnion, B. P., Grimsey, N. L., Connor, M., & Santiago, M. (2021). Do gabapentin or pregabalin directly modulate the µ receptor? PeerJ, 9, e11175. https://doi.org/10.7717/peerj.11175
Mao, G., & Zhang, N. (2013). Analysis of average shortest-path length of scale-free network. Journal of Applied Mathematics. https://doi.org/10.1155/2013/865643
McNamara, S., Stokes, S., Kilduff, R., & Shine, A. (2015). Pregabalin abuse amongst opioid substitution treatment patients. Irish Medical Journal, 108(10), 309–310.
Le Merrer, J., Becker, J. A. J., Befort, K., & Kieffer, B. L. (2009). Reward processing by the opioid system in the brain. Physiological Reviews, 89(4), 1379–1412. https://doi.org/10.1152/physrev.00005.2009
Milojević, S. (2010). Power law distributions in information science: Making the case for logarithmic binning. Journal of the American Society for Information Science and Technology, 61(12), 2417–2425. https://doi.org/10.1002/asi.21426
Morrison, E., Sandilands, E. A., & Webb, D. J. (2017). Gabapentin and pregabalin: Do the benefits outweigh the harms? The Journal of the Royal College of Physicians of Edinburgh, 47(4), 310–313. https://doi.org/10.4997/JRCPE.2017.402
Nam, D., Kim, J., Yoon, J., Song, C., Kim, S., & Song, M. (2022). Characterizing knowledge entity extracted from citation sentences. Proceedings of the 3rd Workshop on Extraction and Evaluation of Knowledge Entities from Scientific Documents 2022 (EEKE 2022), Germany and Online, 23–24 June, 2022. https://ceur-ws.org/Vol-3210/paper10.pdf
Newman, M. E. J. (2004a). Analysis of weighted networks. Physical Review E, 70(5), 056131. https://doi.org/10.1103/PhysRevE.70.056131
Newman, M. E. J. (2004b). Fast algorithm for detecting community structure in networks. Physical Review E, 69(6), 066133. https://doi.org/10.1103/PhysRevE.69.066133
Nummenmaa, L., Saanijoki, T., Tuominen, L., Hirvonen, J., Tuulari, J. J., Nuutila, P., & Kalliokoski, K. (2018). μ-opioid receptor system mediates reward processing in humans. Nature Communications, 9(1), 1–7. https://doi.org/10.1038/s41467-018-03848-y
Pan, X., Yan, E., Cui, M., & Hua, W. (2018). Examining the usage, citation, and diffusion patterns of bibliometric mapping software: A comparative study of three tools. Journal of Informetrics, 12(2), 481–493. https://doi.org/10.1016/j.joi.2018.03.005
Park, N., Ryu, H., Ding, Y., Yu, Q., Bu, Y., Wang, Q., Yang, J. J., & Song, M. (2021). Are we there yet? Analyzing scientific research related to COVID-19 drug repurposing. 18th International Conference on Scientometrics & Informetrics (ISSI2021) (pp. 883–894).
Piantadosi, S. T. (2014). Zipf’s word frequency law in natural language: A critical review and future directions. Psychonomic Bulletin & Review, 21(5), 1112–1130. https://doi.org/10.3758/s13423-014-0585-6
Song, M., Baek, S. H., Heo, G. E., & Lee, J. H. (2019). Inferring drug-protein-side effect relationships from biomedical Text. Genes, 10(2), 159. https://doi.org/10.3390/genes10020159
Song, M., Han, N., Kim, Y., Ding, Y., & Chambers, T. (2013). Discovering implicit entity relation with the gene-citation-gene network. PLoS ONE, 8(12), e84639. https://doi.org/10.1371/journal.pone.0084639
Song, M., Kang, K., & An, J. Y. (2018). Investigating drug–disease interactions in drug–symptom–disease triples via citation relations. Journal of the Association for Information Science and Technology, 69(11), 1355–1368. https://doi.org/10.1002/asi.24060
Song, M., & Kim, S. Y. (2013). Detecting the knowledge structure of bioinformatics by mining full-text collections. Scientometrics, 96(1), 183–201. https://doi.org/10.1007/s11192-012-0900-9
Song, M., Kim, W. C., Lee, D., Heo, G. E., & Kang, K. Y. (2015). PKDE4J: Entity and relation extraction for public knowledge discovery. Journal of Biomedical Informatics, 57, 320–332. https://doi.org/10.1016/j.jbi.2015.08.008
Sweileh, W. M., Shraim, N. Y., Zyoud, S. H., & Al-Jabi, S. B. (2016). Worldwide research productivity on tramadol: A bibliometric analysis. Springerplus. https://doi.org/10.1186/s40064-016-2801-5
Tan, F., Zhang, T., Yang, S., Wu, X., & Xu, J. (2021). Discovering booming bio-entities and their relationship with funds. Data and Information Management, 5(3), 312–328. https://doi.org/10.2478/dim-2021-0007
Vicente, J., Thanki, D., Škařupová, K., & European Monitoring Centre for Drugs and Drug Addiction. (2014). The levels of use of opioids, amphetamines and cocaine and associated levels of harm. In J. Vicente & D. Thanki (Eds.), Summary of scientific evidence. Publications Office.
Volkow, N. D., Jones, E. B., Einstein, E. B., & Wargo, E. M. (2019). Prevention and treatment of opioid misuse and addiction: A review. JAMA Psychiatry, 76(2), 208–216. https://doi.org/10.1001/jamapsychiatry.2018.3126
Wang, S., Mao, J., Cao, Y., & Li, G. (2022a). Integrated knowledge content in an interdisciplinary field: Identification, classification, and application. Scientometrics, 127(11), 6581–6614. https://doi.org/10.1007/s11192-022-04282-0
Wang, Y., & Zhang, C. (2018). Using full-text of research articles to analyze academic impact of algorithms. In G. Chowdhury, J. McLeod, V. Gillet, & P. Willett (Eds.), Transforming digital worlds (pp. 395–401). Springer. https://doi.org/10.1007/978-3-319-78105-1_43
Wang, Y., & Zhang, C. (2020). Using the full-text content of academic articles to identify and evaluate algorithm entities in the domain of natural language processing. Journal of Informetrics, 14(4), 101091. https://doi.org/10.1016/j.joi.2020.101091
Wang, Y., Zhang, C., & Li, K. (2022b). A review on method entities in the academic literature: Extraction, evaluation, and application. Scientometrics, 127(5), 1–42. https://doi.org/10.1007/s11192-022-04332-7
Wasserman, S., & Faust, K. (1994). Social network analysis: Methods and applications (1st ed.). Cambridge University Press.
Yan, X., Li, X., & Song, D. (2006). Document generality: Its computation for ranking. Australian Computer Science Communications, 28(2), 109–118.
Yu, Q., Wang, Q., Zhang, Y. F., Chen, C. Y., Ryu, H., Park, N., & Bu, Y. (2021). Analyzing knowledge entities about COVID-19 using entitymetrics. Scientometrics, 126(5), 4491–4509. https://doi.org/10.1007/s11192-021-03933-y
Zhang, C., Mayr, P., Lu, W., & Zhang, Y. (2023). Guest editorial: Extraction and evaluation of knowledge entities in the age of artificial intelligence. Aslib Journal of Information Management, 75(3), 433–437. https://doi.org/10.1108/AJIM-05-2023-507
Zhao, M., Yan, E., & Li, K. (2017). Data set mentions and citations: A content analysis of full-text publications. Journal of the Association for Information Science and Technology, 69(1), 32–46. https://doi.org/10.1002/asi.23919
Zhu, Y., Song, M., & Yan, E. (2016). Identifying liver cancer and its relations with diseases, drugs, and genes: A literature-based approach. PLoS ONE, 11(5), e0156091. https://doi.org/10.1371/journal.pone.0156091
Zissen, M. H., Zhang, G., McKelvy, A., Propst, J. T., Kendig, J. J., & Sweitzer, S. M. (2007). Tolerance, opioid-induced allodynia and withdrawal associated allodynia in infant and young rats. Neuroscience, 144(1), 247–262. https://doi.org/10.1016/j.neuroscience.2006.08.078
Acknowledgements
This paper was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (No. 2022R1A2B5B02002359). An earlier version of this paper (Nam et al., 2022) was accepted at the 3rd Workshop on Extraction and Evaluation of Knowledge Entities from Scientific Documents (EEKE2022) at the ACM/IEEE Joint Conference on Digital Libraries 2022 (JCDL2022), Cologne, Germany and Online.
Ethics declarations
Conflict of interest
The corresponding author (Min Song) is a member of the Distinguished Reviewers Board of Scientometrics.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix 1: Top-50 section headings by frequency
Section heading | Freq. | % | Section heading | Freq. | % |
---|---|---|---|---|---|
Discussion | 550,001 | 22.36 | Introduction and background | 1114 | 0.05 |
Introduction | 420,851 | 17.15 | Limitations | 1013 | 0.04 |
Results | 141,030 | 5.75 | Diagnosis | 963 | 0.04 |
Methods | 113,141 | 4.43 | Concluding remarks | 855 | 0.03 |
Materials and methods | 101,053 | 3.75 | Main body | 850 | 0.03 |
Background | 79,623 | 3.24 | Mechanism of action | 834 | 0.03 |
Results and discussion | 31,899 | 1.30 | Pharmacological activities | 823 | 0.03 |
Conclusions | 12,597 | 0.52 | Mapping and ablation | 821 | 0.03 |
Review | 11,080 | 0.45 | Recommendations | 753 | 0.03 |
Methods/design | 4814 | 0.20 | Subjects and methods | 698 | 0.03 |
Discussion and conclusions | 3613 | 0.14 | Pathogenesis | 689 | 0.03 |
Main text | 3137 | 0.13 | Rationale | 675 | 0.03 |
Patients and methods | 3091 | 0.13 | Clinical studies | 674 | 0.03 |
Treatment | 3021 | 0.12 | General discussion | 665 | 0.03 |
Methods and analysis | 2643 | 0.11 | STAR Methods | 658 | 0.03 |
Methodology | 2323 | 0.09 | Pharmacokinetics | 638 | 0.03 |
Management | 1785 | 0.07 | Pharmacology | 636 | 0.03 |
Experimental section | 1780 | 0.07 | Case presentation | 631 | 0.03 |
Pathophysiology | 1558 | 0.06 | Online methods | 599 | 0.02 |
Findings | 1514 | 0.06 | Serotonin | 556 | 0.02 |
Epidemiology | 1420 | 0.06 | Pharmacological activity | 529 | 0.02 |
Literature review | 1338 | 0.05 | Opioids | 528 | 0.02 |
Future directions | 1282 | 0.05 | Overview | 522 | 0.02 |
Experimental procedures | 1210 | 0.05 | Conclusions and future directions | 517 | 0.02 |
Experimental | 1198 | 0.05 | Context | 511 | 0.02 |
Appendix 2: Excluded entity list
Entity | Freq. | Entity | Freq. |
---|---|---|---|
Treatment | 526,686 | Lead | 57,961 |
Pain | 456,305 | Oral | 57,447 |
Drug | 211,165 | Procedure | 57,276 |
FIG | 206,186 | Procedures | 56,578 |
Care | 201,449 | Affect | 56,073 |
Response | 174,147 | Neuronal | 54,537 |
Drugs | 168,170 | Severity | 54,409 |
Surgery | 161,686 | Protocol | 54,118 |
Brain | 157,926 | Condition | 54,087 |
Dose | 137,096 | Like | 53,988 |
Reduced | 127,899 | Key | 53,656 |
Function | 127,052 | Medications | 52,639 |
Therapy | 117,437 | End | 51,958 |
Human | 117,366 | Sensitivity | 51,654 |
Blood | 106,818 | Interest | 50,794 |
Disease | 106,162 | Secondary | 49,189 |
Opioids | 101,146 | Rat | 48,710 |
Symptoms | 96,081 | Distribution | 48,064 |
Support | 89,745 | Strategies | 46,310 |
Chronic | 82,604 | Adult | 44,728 |
Stimulation | 82,076 | Disorders | 43,925 |
Severe | 80,434 | Delivery | 43,609 |
Exposure | 80,109 | Line | 42,817 |
Impact | 77,239 | Side effects | 42,242 |
General | 76,767 | Right | 41,062 |
Expressed | 72,793 | Injury | 41,590 |
Acute | 71,777 | Understanding | 40,342 |
Normal | 71,392 | Moderate | 39,323 |
Management | 68,786 | Focus | 37,051 |
Medication | 67,976 | Diseases | 36,224 |
Measures | 67,668 | Light | 35,968 |
Association | 67,400 | Onset | 35,871 |
Concentrations | 64,023 | Finding | 35,413 |
Central | 63,070 | Strategy | 35,201 |
Food | 62,410 | Nervous | 34,967 |
Block | 62,157 | Activated | 34,227 |
Set | 61,710 | Content | 33,724 |
Intensity | 59,397 | | |
Appendix 3: Visualization map of conventional full-text cooccurrence network

Appendix 4: Visualization map of DCS cooccurrence network

Appendix 5: Visualization map of ICS cooccurrence network

Appendix 6: Conventional full-text network cooccurrence information
Entity 1 | Entity 2 | Freq. | Distance |
---|---|---|---|
Anesthesia | Propofol | 5902 | 10 |
Methadone | Buprenorphine | 5708 | 6.75 |
Morphine | Fentanyl | 5326 | 7.25 |
Mental health | Substance use disorder | 5240 | 5 |
Substance use disorder | Addiction | 4292 | 4 |
Anterior | Posterior | 3697 | – |
Morphine | Oxycodone | 3463 | 8.375 |
Naloxone | Overdose | 3447 | 10.75 |
Heroin | Cocaine | 3410 | 8 |
Kit | RNA | 3329 | – |
Glucose | Insulin | 3225 | 11 |
Hip | Fracture | 3199 | – |
Propofol | Remifentanil | 3182 | 9.5 |
Anesthesia | Isoflurane | 3152 | 7 |
Postoperative pain | Pain management | 3116 | 8.5 |
Anesthesia | Sevoflurane | 3093 | 7.5 |
Propofol | Sedation | 3081 | 10 |
Hypotension | Bradycardia | 3057 | 5 |
CBD | THC | 3040 | 1 |
Withdrawal | Morphine | 3035 | 9.75 |
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Nam, D., Kim, J., Yoon, J. et al. Examining knowledge entities and its relationships based on citation sentences using a multi-anchor bipartite network. Scientometrics 129, 7197–7228 (2024). https://doi.org/10.1007/s11192-023-04824-0
DOI: https://doi.org/10.1007/s11192-023-04824-0