DOI: 10.1145/3632754.3632758
research-article

Keyword Driven Language-Independent Low-Resource Graph-Based Automatic Text Summarization of Konkani Texts

Published: 12 February 2024

ABSTRACT

Automatic text summarization has been widely researched for the most widely spoken languages, and there is a need to extend this research to less popular, low-resource languages. We explore automatic text summarization for the Konkani language, which is spoken by a relatively small population in the state of Goa, India. A low-resource language has limited language processing tools, which adds to the challenges of automatic text summarization. This study extends the popular language-independent graph-based approach with language-independent keywords extracted using YAKE. These keywords allow us to weight the edges of a fully connected undirected graph by considering not only the influence of two vertices (sentences) but also the relation of each sentence to the overall document, as represented by the keywords. We vary the threshold of relevant keywords together with a bias parameter and examine the impact of both on the system-generated summary.
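The abstract does not give the edge-weight formula, so the following is only a minimal sketch of the general idea under stated assumptions. It builds a fully connected undirected sentence graph, blends pairwise sentence similarity with keyword coverage through a bias parameter, and ranks sentences with a weighted PageRank. The blend `(1 - bias) * pair + bias * doc`, the word-overlap similarity, and every function name are hypothetical choices for illustration. A simple frequency count stands in for YAKE so that the example runs without external packages.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Unicode-aware word split; no language-specific resources are used.
    return re.findall(r"\w+", text.lower())


def top_keywords(sentences, k):
    # Stand-in for YAKE (assumption): the k most frequent tokens longer than 3 characters.
    counts = Counter(t for s in sentences for t in tokenize(s) if len(t) > 3)
    return {word for word, _ in counts.most_common(k)}


def overlap(a, b):
    # Shared-word count normalised by log sentence sizes, in the style of TextRank.
    ta, tb = set(tokenize(a)), set(tokenize(b))
    if len(ta) < 2 or len(tb) < 2:
        return 0.0
    return len(ta & tb) / (math.log(len(ta)) + math.log(len(tb)))


def keyword_score(sentence, keywords):
    # Fraction of the document keywords that occur in the sentence.
    if not keywords:
        return 0.0
    return len(set(tokenize(sentence)) & keywords) / len(keywords)


def build_weights(sentences, keywords, bias):
    # Assumed edge weight: blend pairwise similarity with keyword coverage.
    n = len(sentences)
    kw = [keyword_score(s, keywords) for s in sentences]
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            pair = overlap(sentences[i], sentences[j])
            doc = (kw[i] + kw[j]) / 2
            w[i][j] = w[j][i] = (1 - bias) * pair + bias * doc
    return w


def rank(weights, damping=0.85, iterations=50):
    # Weighted PageRank by power iteration on the undirected graph.
    n = len(weights)
    if n == 0:
        return []
    out = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(
                weights[j][i] / out[j] * scores[j] for j in range(n) if out[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(sentences, n_keywords=5, bias=0.5, length=2):
    # Pick the top-ranked sentences and return them in document order.
    keywords = top_keywords(sentences, n_keywords)
    scores = rank(build_weights(sentences, keywords, bias))
    best = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:length]
    return [sentences[i] for i in sorted(best)]
```

In this sketch, `n_keywords` plays the role of the keyword threshold and `bias` controls how strongly document-level keywords influence the edge weights. Setting `bias=0` reduces the graph to plain pairwise similarity.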


Published in

FIRE '23: Proceedings of the 15th Annual Meeting of the Forum for Information Retrieval Evaluation
December 2023
170 pages

Copyright © 2023 ACM

Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery
New York, NY, United States


Qualifiers

• research-article
• Research
• Refereed limited

Acceptance Rates

Overall Acceptance Rate: 19 of 64 submissions, 30%