Enhanced automatic abstractive document summarization using transformers and sentence grouping

Published in The Journal of Supercomputing

Abstract

Automatic document summarization is a widely studied field that aims to generate brief, informative summaries of long documents. In this paper, we propose a hybrid approach to automatic document summarization that combines a Transformer with sentence grouping. The Transformer model was trained on the BBC News dataset, which we first preprocessed by correcting logical and spelling errors in the original full-text and summary document pairs. The training hyper-parameters were determined through experimentation. In the testing stage, each document was decomposed into sentences, and the similarity of every sentence to every other sentence was computed with the Simhash text-similarity algorithm. The most similar sentences were grouped together, with the number of groups set to 25% of the total number of sentences in the document. Each group of sentences was then fed into the Transformer model, which generated new abstractive sentences for that group. The order of the groups in the summary was determined by the average position of their sentences in the original document, and the generated abstractive sentences were combined into the final summary. Experimental results showed that the proposed approach achieved an average Simhash similarity of 93.2% to the original full-text documents and, on average, 5% higher similarity to the original summary documents. These results demonstrate the effectiveness of the proposed approach for automatic document summarization.
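The grouping and ordering steps described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Simhash feature set (lowercased words hashed with MD5), the 64-bit fingerprint width, and the greedy pairwise merging strategy are all assumptions, since the abstract specifies only the similarity measure, the 25% group count, and the average-position ordering. The function names `simhash`, `similarity`, and `group_sentences` are hypothetical.

```python
import hashlib
import math
from itertools import combinations

def simhash(text, bits=64):
    """Compute a Simhash fingerprint from word-level features.
    (Word features and 64-bit width are assumptions; the paper's
    abstract does not specify them.)"""
    v = [0] * bits
    for word in text.lower().split():
        h = int(hashlib.md5(word.encode()).hexdigest(), 16)
        for i in range(bits):
            v[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i in range(bits) if v[i] > 0)

def similarity(h1, h2, bits=64):
    """Simhash similarity: fraction of matching fingerprint bits."""
    return 1.0 - bin(h1 ^ h2).count("1") / bits

def group_sentences(sentences, ratio=0.25):
    """Greedily merge the most similar groups until the group count
    is ~25% of the sentence count, then order groups by the average
    original position of their sentences (as in the abstract)."""
    hashes = [simhash(s) for s in sentences]
    groups = [[i] for i in range(len(sentences))]
    target = max(1, math.ceil(ratio * len(sentences)))
    while len(groups) > target:
        # Pick the pair of groups with the highest cross-group similarity.
        a, b = max(
            combinations(range(len(groups)), 2),
            key=lambda p: max(similarity(hashes[i], hashes[j])
                              for i in groups[p[0]] for j in groups[p[1]]),
        )
        groups[a].extend(groups[b])
        del groups[b]
    # Order groups by the mean position of their sentences in the document.
    groups.sort(key=lambda g: sum(g) / len(g))
    return [[sentences[i] for i in g] for g in groups]
```

In the paper's pipeline, each returned group would then be passed to the trained Transformer, which generates one or more abstractive sentences per group before the summary is assembled in group order.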


(Fig. 1, Fig. 2, and Algorithm 1 appear in the full article.)


Data availability

Not applicable.

Code availability

The code will be available through a GitHub repository.


Acknowledgements

Not applicable.

Funding

Not applicable.

Author information

Authors and Affiliations

Authors

Contributions

Ahmet Toprak and Metin Turan both contributed to conceptualization, data collection, formal analysis, software development, validation, visualization, writing the original draft, and review and editing.

Corresponding author

Correspondence to Ahmet Toprak.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Ethical approval and consent to participate

Not applicable.

Human and animal rights

Not applicable.

Consent for publication

The authors consent to the publication of this paper in The Journal of Supercomputing.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article


Cite this article

Toprak, A., Turan, M. Enhanced automatic abstractive document summarization using transformers and sentence grouping. J Supercomput 81, 557 (2025). https://doi.org/10.1007/s11227-025-07048-6
