ABSTRACT
Automatic text summarization has been widely researched for the most widely spoken languages, and there is a need to extend this research to less popular, low-resource languages. We explore automatic text summarization for Konkani, a low-resource language spoken by a relatively small population in the state of Goa, India. A low-resource language has limited language processing tools, which adds to the challenges of automatic text summarization. This study extends the popular language-independent graph-based approach with language-independent keywords extracted using YAKE. These keywords let us weight the edges of a fully connected undirected graph by considering not only the mutual influence of two vertices (sentences) but also the relationship of those sentences to the overall document, as represented by the keywords. We vary the threshold of relevant keywords together with a bias parameter and examine the impact of both on the system-generated summary.
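The abstract does not state the exact edge-weight formula, so the following is only a minimal sketch of the general idea under stated assumptions: sentence-to-sentence similarity is taken to be Jaccard word overlap, the keyword term is the fraction of document keywords a sentence contains, and the two are blended linearly by a hypothetical `bias` parameter before a PageRank-style ranking. In the study the keywords would come from YAKE; here they are passed in as a plain list so the example stays self-contained.

```python
from itertools import combinations


def tokenize(sentence):
    """Whitespace tokens with surrounding punctuation stripped (no language-specific tools)."""
    return {w.strip(".,;:!?\"'()") for w in sentence.lower().split()} - {""}


def similarity(a, b):
    """Jaccard overlap between two token sets (an assumed similarity measure)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def keyword_score(tokens, keywords):
    """Fraction of the document keywords that occur in the sentence (assumed)."""
    if not keywords:
        return 0.0
    return len(tokens & keywords) / len(keywords)


def rank_sentences(sentences, keywords, bias=0.5, damping=0.85, iterations=50):
    """Score sentences on a fully connected, undirected, weighted graph."""
    n = len(sentences)
    if n == 0:
        return []
    tokens = [tokenize(s) for s in sentences]
    kw = {k.lower() for k in keywords}

    # Edge weight: assumed linear blend of pairwise similarity and keyword relevance.
    weights = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        pair = similarity(tokens[i], tokens[j])
        doc = (keyword_score(tokens[i], kw) + keyword_score(tokens[j], kw)) / 2
        weights[i][j] = weights[j][i] = (1 - bias) * pair + bias * doc

    # Weighted PageRank by power iteration.
    strength = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            incoming = sum(
                weights[j][i] / strength[j] * scores[j]
                for j in range(n)
                if j != i and strength[j] > 0
            )
            updated.append((1 - damping) / n + damping * incoming)
        scores = updated
    return scores
```

With `bias=0` the ranking depends only on sentence-to-sentence similarity, and with `bias=1` only on keyword coverage; intermediate values trade the two off. Restricting `keywords` to the top-k entries is one way to model the keyword threshold mentioned in the abstract.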
Index Terms
- Keyword Driven Language-Independent Low-Resource Graph-Based Automatic Text Summarization of Konkani Texts