Characterizing human summarization strategies for text reuse and transformation in literature review writing

Scientometrics

Abstract

Citations are useful signals of information salience, but little research has identified the patterns of information selection, transformation, and organization that they reflect. This paper investigated the summarization strategies followed in writing the literature review sections of information science research papers. We found that these strategies differ between the two major styles of literature review writing: descriptive and integrative. Descriptive literature reviews, which focus on individual descriptions of research papers, are more likely to reference the Method and Result sections of the cited paper and to copy-paste the referenced text. In contrast, integrative literature reviews, which synthesize the main ideas from many papers, contain more critiques and focus mainly on the Conclusion sections. These findings, based on a hand-annotated dataset, have the potential to scale up into a transformation-invariant neural architecture for scientific summarization that can generate different summaries of the input text with integrative or descriptive characteristics.

Author information

Corresponding author

Correspondence to Kokil Jaidka.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 172 kb)

About this article

Cite this article

Jaidka, K., Khoo, C.S.G. & Na, JC. Characterizing human summarization strategies for text reuse and transformation in literature review writing. Scientometrics 121, 1563–1582 (2019). https://doi.org/10.1007/s11192-019-03250-5

  • DOI: https://doi.org/10.1007/s11192-019-03250-5
