
Towards an Arabic Text Summaries Evaluation Based on AraBERT Model

  • Conference paper

Part of the book series: Lecture Notes in Business Information Processing ((LNBIP,volume 446))

Abstract

The evaluation of text summaries remains a challenging task, despite more than two decades of research in this field. This paper describes an automatic method for assessing Arabic text summaries. The proposed method predicts the “Overall Responsiveness” manual score, which combines the content and the linguistic quality of a summary. To predict this manual score, we aggregate three types of features with a regression function: lexical similarity features, semantic similarity features, and linguistic features. The semantic features include multiple semantic similarity scores based on the BERT model, while the linguistic features are based on entropy scores. To compute the similarity between a candidate summary and a reference summary, we first perform an exact match between n-grams. The unmatched n-grams are then represented as BERT vectors, and the similarity between these vectors is computed. The proposed method yielded competitive results compared with metrics based on lexical similarity, such as ROUGE.
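The pipeline sketched in the abstract can be illustrated in a few lines of Python. This is a minimal sketch, not the authors' implementation: `embed` is a hashed placeholder standing in for AraBERT n-gram embeddings, and the three resulting features (lexical overlap, embedding-space similarity of unmatched n-grams, and a unigram-entropy linguistic score) would then be fed to a regression model, e.g. from scikit-learn, to predict the Overall Responsiveness score.

```python
# Illustrative sketch of the three feature families described in the abstract.
# `embed` is a deterministic placeholder; a real system would use AraBERT
# embeddings (e.g. via the `transformers` library) instead.
import hashlib
import math
from collections import Counter

def ngrams(tokens, n=2):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def embed(ngram, dim=16):
    # Placeholder vector derived from a hash; stands in for AraBERT output.
    digest = hashlib.md5(" ".join(ngram).encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:dim]]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def lexical_and_semantic_scores(cand, ref, n=2):
    cg, rg = ngrams(cand, n), ngrams(ref, n)
    matched = [g for g in cg if g in rg]            # exact n-gram matches
    lexical = len(matched) / len(cg) if cg else 0.0
    unmatched = [g for g in cg if g not in rg]
    # For each unmatched candidate n-gram, take its best similarity to any
    # reference n-gram in embedding space.
    sims = [max(cosine(embed(g), embed(r)) for r in rg)
            for g in unmatched] if rg else []
    semantic = sum(sims) / len(sims) if sims else 1.0
    return lexical, semantic

def word_entropy(tokens):
    # Shannon entropy of the unigram distribution (a linguistic feature).
    counts = Counter(tokens)
    total = len(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

Under this sketch, a summary identical to its reference gets a lexical score of 1.0 and, having no unmatched n-grams, a semantic score of 1.0 as well; the entropy feature is independent of the reference and captures lexical diversity of the candidate alone.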


References

  • Lin, C.Y.: ROUGE: a package for automatic evaluation of summaries. In: Proceedings of the Workshop on Text Summarization Branches Out, Post-Conference Workshop of ACL, Barcelona, Spain, pp. 74–81 (2004)

  • Giannakopoulos, G., Karkaletsis, V.: AutoSummENG and MeMoG in evaluating guided summaries. In: Proceedings of the Text Analysis Conference (TAC) (2011)

  • Cabrera-Diego, L.A., Torres-Moreno, J.: Summtriver: a new trivergent model to evaluate summaries automatically without human references. Data Knowl. Eng. 113, 184–197 (2018)

  • Pitler, E., Nenkova, A.: Revisiting readability: a unified framework for predicting text quality. In: Proceedings of the Empirical Methods in Natural Language Processing (EMNLP), pp. 186–195 (2008)

  • Pitler, E., Louis, A., Nenkova, A.: Automatic evaluation of linguistic quality in multi-document summarization. In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pp. 544–554 (2010)

  • de S. Dias, M., Feltrim, V.D., Pardo, T.A.S.: Using rhetorical structure theory and entity grids to automatically evaluate local coherence in texts. In: Baptista, J., Mamede, N., Candeias, S., Paraboni, I., Pardo, T.A.S., Volpe Nunes, M.D.G. (eds.) PROPOR 2014. LNCS (LNAI), vol. 8775, pp. 232–243. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-09761-9_26

  • Ellouze, S., Jaoua, M., Belguith, L.H.: Automatic evaluation of a summary’s linguistic quality. In: Métais, E., Meziane, F., Saraee, M., Sugumaran, V., Vadera, S. (eds.) NLDB 2016. LNCS, vol. 9612, pp. 392–400. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-41754-7_39

  • Xenouleas, S., Malakasiotis, P., Apidianaki M., Androutsopoulos I.: Sum-QE: a BERT-based summary quality estimation model. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 6004–6010 (2019)

  • Lin, Z., Liu, C., Ng, H.T., Kan, M.Y.: Combining coherence models and machine translation evaluation metrics for summarization evaluation. In: Proceedings of the Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 1006–1014 (2012)

  • Ellouze, S., Jaoua, M., Hadrich Belguith, L.: Mix multiple features to evaluate the content and the linguistic quality of text summaries. J. Comput. Inf. Technol. 25(2), 149–166 (2017)

  • Wang, X., Liu, B., Shen, L., Li, Y., Gu, R., Qu, G.: A summary evaluation method combining linguistic quality and semantic similarity. In: Proceedings of 2020 International Conference on Computational Science and Computational Intelligence (CSCI), pp. 637–642 (2020). https://doi.org/10.1109/CSCI51800.2020.00113

  • Elghannam, F., El-Shishtawy, T.: Keyphrase based evaluation of automatic text summarization. Int. J. Comput. Appl. 117(7), 5–8 (2015)

  • Ellouze, S., Jaoua, M., Hadrich Belguith, L.: Arabic text summary evaluation method. In: Proceedings of the International Business Information Management Association Conference-Education Excellence and Innovation Management through Vision2020: From Regional Development Sustainability to Global Economic Growth, pp. 3532–3541 (2017)

  • Attia, M.: Handling Arabic morphological and syntactic ambiguities within the LFG framework with a view to machine translation. Ph.D. dissertation, University of Manchester (2008)

  • Farghaly, A., Shaalan, K.: Arabic natural language processing: challenges and solutions. ACM Trans. Asian Lang. Inf. Process. 8(4) (2009). https://doi.org/10.1145/1644879.1644881. Article 14, 22 pages

  • Farghaly, A.: Subject pronoun deletion rule. In: Proceedings of the English Language Symposium on Discourse Analysis (LSDA 1982), pp. 110–117 (1982)

  • Hovy, E., Lin, C., Zhou, L., Fukumoto, J.: Automated summarization evaluation with basic elements. In: Proceedings of the Conference on Language Resources and Evaluation, pp. 899–902 (2006)

  • Tratz, S., Hovy, E.: BEwTE: basic elements with transformations for evaluation. In: Proceedings of Text Analysis Conference (TAC) Workshop (2008)

  • Zhang, T., Kishore, V., Wu, F., Weinberger, K.Q., Artzi, Y.: BERTScore: evaluating text generation with BERT. In: Proceedings of the International Conference on Learning Representations (ICLR) (2020)

  • Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: The Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 4171–4186 (2019)

  • Yang, Z., Dai, Z., Yang, Y., Carbonell, J., Salakhutdinov, R., Le, Q.V.: XLNet: generalized autoregressive pretraining for language understanding. In: Proceedings of the International Conference on Neural Information Processing Systems, pp. 5753–5763 (2019)

  • Giannakopoulos, G., Karkaletsis V.: Summary evaluation: together we stand NPowER-ed. In: Proceedings of International Conference on Computational Linguistics and Intelligent Text Processing, vol. 2, pp. 436–450 (2013)

  • Bentz, C., Alikaniotis, D., Cysouw, M., Ferrer-i-Cancho, R.: The entropy of words—learnability and expressivity across more than 1000 languages. Entropy 19(6), 275 (2017)

  • Pedregosa, F., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)

  • Giannakopoulos, G., El-Haj, M., Favre, B., Litvak, M., Steinberger, J., Varma, V.: TAC 2011 multiling pilot overview. In: Proceedings of the Fourth Text Analysis Conference (2011)

  • Giannakopoulos, G.: Multi-document multilingual summarization and evaluation tracks in ACL 2013 MultiLing workshop. In: Proceedings of the MultiLing 2013 Workshop on Multilingual Multi-document Summarization, pp. 20–28 (2013)

  • Louis, A., Nenkova, A.: Automatically assessing machine summary content without a gold standard. Comput. Linguist. 39(2), 267–300 (2013). https://doi.org/10.1162/COLI_a_00123

  • Antoun, W., Baly, F., Hajj, H.: AraBERT: Transformer-based model for Arabic language understanding. In: Proceedings of the 4th Workshop on Open-Source Arabic Corpora and Processing Tools, with a Shared Task on Offensive Language Detection, pp. 9–15 (2020)

  • Guyon, I., Weston, J., Barnhill, S., Vapnik, V.: Gene selection for cancer classification using support vector machines. Mach. Learn. 46, 389–422 (2002). https://doi.org/10.1023/A:1012487302797

Author information

Correspondence to Samira Ellouze.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Ellouze, S., Jaoua, M. (2022). Towards an Arabic Text Summaries Evaluation Based on AraBERT Model. In: Guizzardi, R., Ralyté, J., Franch, X. (eds) Research Challenges in Information Science. RCIS 2022. Lecture Notes in Business Information Processing, vol 446. Springer, Cham. https://doi.org/10.1007/978-3-031-05760-1_4

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-05760-1_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-05759-5

  • Online ISBN: 978-3-031-05760-1

  • eBook Packages: Computer Science (R0)
