research-article

Keyword Driven Language-Independent Low-Resource Graph-Based Automatic Text Summarization of Konkani Texts

Authors:
Chaitali More
Fr. Agnel College of Arts & Commerce, India
0000-0002-9685-1904

Jovi D'Silva
Independent Researcher, India
0000-0002-7222-7111

FIRE '23: Proceedings of the 15th Annual Meeting of the Forum for Information Retrieval Evaluation
December 2023
Pages 51–57
https://doi.org/10.1145/3632754.3632758

Published: 12 February 2024

ABSTRACT

Automatic text summarization has been widely researched for the most widely spoken languages, and there is a need to extend this research to less widely spoken, low-resource languages. We explore automatic text summarization for the Konkani language, which is spoken by a relatively small population in the state of Goa, India. A low-resource language has limited language processing tools, which adds to the challenges of automatic text summarization. This study extends the popular language-independent graph-based approach with language-independent keywords extracted using YAKE. These keywords allow us to weight the edges of a fully connected undirected graph by considering not only the influence of two vertices (sentences) on each other but also the relation of those sentences to the overall document, as represented by the keywords. We vary the threshold of relevant keywords together with a bias parameter and examine the impact of both on the system-generated summary.
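
The abstract does not give the weighting formula, so the following is only a minimal sketch of the general idea, under stated assumptions. Each edge weight blends a sentence-to-sentence similarity with a sentence-to-document term. Jaccard overlap stands in for the similarity measure, keyword coverage stands in for the document term, and the linear blend controlled by `bias` is an illustrative guess rather than the authors' formula. The keywords are passed in as a plain list of single words, standing in for the top-scoring YAKE keywords up to a chosen threshold. All function names are hypothetical.

```python
def tokenize(sentence):
    """Whitespace tokenizer with punctuation stripping (no language-specific tools)."""
    words = (w.strip(".,;:!?\"'()") for w in sentence.lower().split())
    return [w for w in words if w]


def jaccard(a, b):
    """Assumed stand-in for the sentence-to-sentence similarity measure."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def keyword_coverage(sentence_tokens, keywords):
    """Fraction of the document keywords that occur in the sentence."""
    return len(sentence_tokens & keywords) / len(keywords) if keywords else 0.0


def edge_weights(sentences, keywords, bias):
    """Fully connected undirected graph; the blend below is an assumption."""
    toks = [set(tokenize(s)) for s in sentences]
    kw = {k.lower() for k in keywords}  # single-word keywords only in this sketch
    n = len(sentences)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            pair = jaccard(toks[i], toks[j])
            doc = (keyword_coverage(toks[i], kw) + keyword_coverage(toks[j], kw)) / 2
            w[i][j] = w[j][i] = (1 - bias) * pair + bias * doc
    return w


def rank(weights, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration."""
    n = len(weights)
    if n == 0:
        return []
    out = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(
                weights[j][i] / out[j] * scores[j] for j in range(n) if out[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(sentences, keywords, bias=0.5, size=2):
    """Return the top-ranked sentences in their original document order."""
    scores = rank(edge_weights(sentences, keywords, bias))
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:size]
    return [sentences[i] for i in sorted(top)]


sentences = [
    "Goa is a state in India.",
    "Konkani is spoken in Goa.",
    "The weather was pleasant yesterday.",
]
summary = summarize(sentences, ["konkani", "goa"], bias=0.5, size=2)
```

In this sketch, `bias=0` reduces the weights to plain pairwise similarity, while `bias=1` weights edges purely by how well the two sentences cover the keywords. Changing the length of the keyword list mimics varying the keyword threshold.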

Index Terms

1. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
      1. Summarization

Published in
FIRE '23: Proceedings of the 15th Annual Meeting of the Forum for Information Retrieval Evaluation
December 2023
170 pages
ISBN: 9798400716324
DOI: 10.1145/3632754
Editors: Debasis Ganguly, Srijoni Majumdar, Bhaskar Mitra, Parth Gupta, Surupendu Gangopadhyay, Prasenjit Majumder
Copyright © 2023 ACM
Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Konkani
YAKE
automatic text summarization
graph-based
language-independent
low-resource
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 19 of 64 submissions, 30%