The use of citation context to detect the evolution of research topics: a large-scale analysis

Scientometrics

Abstract

With the exponential increase in the number of published papers, understanding how research topics evolve is increasingly important for everyone involved in research, including researchers, institutes, research funding bodies, and decision-makers. This study presents a large-scale analysis of the evolution of research topics in the biomedical and life sciences based on the citation contexts of the collected papers, or more precisely their citing sentences. Using 64,350 papers published in PubMed Central between 2008 and 2018, we determined the research trends for ten research topics. We also studied how these topics evolve across countries and across the most common journals in the biomedical and life sciences.
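To make the idea of a citing sentence concrete, the following is a minimal sketch and not the authors' pipeline. It assumes citations appear as numeric markers such as `[3]` or author-year markers such as `(Lee et al., 2015)`, uses a naive regex sentence splitter, and substitutes simple keyword matching for a real topic model. The names `CITATION`, `citing_sentences`, and `topic_trend` are hypothetical.

```python
import re
from collections import Counter

# Assumed citation markers: numeric "[1]" / "[2, 3]" or author-year "(Lee et al., 2015)".
CITATION = re.compile(r"\[\d+(?:\s*[,-]\s*\d+)*\]|\([A-Z][^()]*?\d{4}[a-z]?\)")


def citing_sentences(text):
    """Return the sentences of `text` that contain a citation marker."""
    # Naive splitter: break after '.', '!' or '?' when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if CITATION.search(s)]


def topic_trend(papers, topic_terms):
    """Share of citing sentences per year that mention any topic term.

    `papers` is an iterable of (year, full_text) pairs; `topic_terms` is a
    list of lowercase keywords standing in for a topic (an assumption here).
    """
    hits, totals = Counter(), Counter()
    for year, text in papers:
        for sentence in citing_sentences(text):
            totals[year] += 1
            lowered = sentence.lower()
            if any(term in lowered for term in topic_terms):
                hits[year] += 1
    return {year: hits[year] / totals[year] for year in sorted(totals)}


# Toy input, invented purely for illustration.
papers = [
    (2008, "Gene therapy remains experimental [1]. We collected samples. "
           "Stem cells were studied earlier [2, 3]."),
    (2018, "Stem cell transplants are now routine (Lee et al., 2015). "
           "Results follow."),
]
trend = topic_trend(papers, ["stem cell"])
```

On this toy input, one of the two citing sentences from 2008 and the single citing sentence from 2018 mention the keyword, so the share rises from 0.5 to 1.0. A real analysis would need a proper sentence tokenizer and a topic model rather than keyword matching.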


Notes

  1. https://www.ncbi.nlm.nih.gov/pmc/

  2. https://www.nltk.org/

  3. https://pypi.org/project/geopy/

  4. https://pypi.org/project/geotext/

  5. https://radimrehurek.com/gensim/models/dtmmodel.html

  6. https://pypi.org/project/geotext/


Acknowledgements

This work has been supported by the Spanish Ministry of Science and Innovation under Grants PID2019-105381GA-I00 (iScience) and PID2019-103880RB-I00.

Author information

Corresponding author

Correspondence to Chaker Jebari.

Cite this article

Jebari, C., Herrera-Viedma, E. & Cobo, M.J. The use of citation context to detect the evolution of research topics: a large-scale analysis. Scientometrics 126, 2971–2989 (2021). https://doi.org/10.1007/s11192-020-03858-y

  • DOI: https://doi.org/10.1007/s11192-020-03858-y
