A new bibliographic coupling measure with descriptive capability

Scientometrics

Abstract

Bibliographic coupling (BC) is an effective measure for estimating the similarity between two scholarly articles (i.e., inter-article similarity). It works on the out-link references of articles (i.e., the references cited by the articles), and is essential for relatedness analysis and topic clustering of scholarly articles. In this paper, we present a new BC measure, DescriptiveBC, which employs the titles of the out-link references to improve BC in two ways: given a target article a, DescriptiveBC provides more accurate information about how (based on numerical inter-article similarity) and why (based on textual descriptive terms) a scholarly article is related to a. Visualization of this information can support the identification, clustering, mapping, and navigation of related evidence in scientific literature. Empirical evaluation justifies the contributions of DescriptiveBC. Releasing the reference titles of each article thus helps disseminate research findings in scientific literature, and DescriptiveBC can be incorporated into search engines for scholarly articles to help prospective researchers navigate the space of related articles online.
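As a rough illustration of the idea described in the abstract, the sketch below computes classic bibliographic coupling as the number of out-link references two articles share, and then counts the terms in the titles of those shared references as a simplified stand-in for the descriptive part. This is not the DescriptiveBC formula, which the abstract does not specify. The function names, the stopword list, and the toy reference IDs and titles are all hypothetical.

```python
from collections import Counter

# Hypothetical stopword list, for illustration only.
STOPWORDS = frozenset({"the", "of", "and", "in", "a", "is", "with", "for"})


def coupling_strength(refs_a, refs_b):
    """Classic bibliographic coupling: count of references cited by both articles."""
    return len(set(refs_a) & set(refs_b))


def shared_title_terms(refs_a, refs_b, titles):
    """Count non-stopword terms in the titles of the shared references.

    A simplified assumption about how reference titles could explain
    *why* two articles are related; not the measure from the paper.
    """
    shared = sorted(set(refs_a) & set(refs_b))
    counts = Counter()
    for ref in shared:
        for term in titles[ref].lower().split():
            if term not in STOPWORDS:
                counts[term] += 1
    return counts


# Toy data: made-up reference IDs and titles.
titles = {
    "r1": "SPINK1 mutations in chronic pancreatitis",
    "r2": "PRSS1 mutations in hereditary pancreatitis",
    "r3": "Erythropoietin receptor signaling",
}
article_a = ["r1", "r2", "r3"]
article_b = ["r1", "r2"]

if __name__ == "__main__":
    print(coupling_strength(article_a, article_b))
    print(shared_title_terms(article_a, article_b, titles).most_common(3))
```

In this toy setting, the coupling count gives the numerical "how", while the most frequent title terms of the shared references hint at the "why".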

Notes

  1. A basic description of the “KeyWords Plus” service can be found at http://interest.science.thomsonreuters.com/content/WOKUserTips-201010-IN.

  2. DisGeNET is available at http://www.disgenet.org/web/DisGeNET/menu/home.

  3. The database update procedures of Genetics Home Reference and Online Mendelian Inheritance in Man can be found at http://ghr.nlm.nih.gov/ExpertReviewers and http://www.omim.org/about, respectively.

  4. GAD is available at http://geneticassociationdb.nih.gov.

  5. CTD is available at http://ctdbase.org.

  6. PubMed Central is available at http://www.ncbi.nlm.nih.gov/pmc.

  7. The title of PMC1774044 is “Absence of PRSS1 mutations and association of SPINK1 trypsin inhibitor mutations in hereditary and non-hereditary chronic pancreatitis”.

  8. The title of PMC1773194 is “The N34S mutation of SPINK1 (PSTI) is associated with a familial pattern of idiopathic chronic pancreatitis but does not cause the disease”.

  9. The title of PMC1773221 is “Mutations in serine protease inhibitor Kazal type 1 are strongly associated with chronic pancreatitis”.

  10. The title of PMC2928535 is “Inhibition of acinar apoptosis occurs during acute pancreatitis in the human homologue ∆F508 cystic fibrosis mouse”.

  11. A basic description of cationic trypsinogen and serine peptidase can be found at Genetics Home Reference: https://ghr.nlm.nih.gov/gene/PRSS1.

  12. A basic description of the CFTR gene and cystic fibrosis can be found at Genetics Home Reference: https://ghr.nlm.nih.gov/condition/cystic-fibrosis#genes.

  13. A basic description of the erythropoietin (EPO) gene can be found at Genetics Home Reference: https://ghr.nlm.nih.gov/gene/EPO#.

  14. The title of PMC3441831 is “Erythropoietin Receptor Contributes to Melanoma Cell Survival in vivo”.

  15. The title of PMC1386105 is “Signals for stress erythropoiesis are integrated via an erythropoietin receptor–phosphotyrosine-343–Stat5 axis”.

  16. The title of PMC2754516 is “Use of agents stimulating erythropoiesis in digestive diseases”.

  17. The title of PMC1890992 is “Erythropoietin/erythropoietin receptor system is involved in angiogenesis in human neuroblastoma”.

  18. Epoetin alfa is human erythropoietin produced in cell culture.

Acknowledgements

This research was supported by the Ministry of Science and Technology, Taiwan (Grant ID: MOST 104-2221-E-320-005).

Author information

Corresponding author

Correspondence to Rey-Long Liu.

About this article

Cite this article

Liu, RL. A new bibliographic coupling measure with descriptive capability. Scientometrics 110, 915–935 (2017). https://doi.org/10.1007/s11192-016-2196-7

  • DOI: https://doi.org/10.1007/s11192-016-2196-7
