DOI: 10.1145/2682571.2797099
short-paper

Automatic Text Document Summarization Based on Machine Learning

Published: 08 September 2015

Abstract

The need for automatic summary generation has grown with the unprecedented volume of information available on the Internet. Automatic systems based on extractive summarization techniques select the most significant sentences of one or more texts to generate a summary. This article uses Machine Learning techniques to assess the quality of the twenty most referenced strategies used in extractive summarization, integrating them in a tool. Quantitative and qualitative aspects were considered in this assessment, demonstrating the validity of the proposed scheme. The experiments were performed on the CNN-corpus, possibly the largest and most suitable test corpus today for benchmarking extractive summarization strategies.
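
The idea of extractive summarization described in the abstract (score each sentence, keep the most significant ones) can be illustrated with a minimal sketch. This is not the authors' tool or any of the twenty strategies as they implemented them. It assumes a single, simple word-frequency score, a naive sentence splitter, and a tiny stop-word list, and the names `split_sentences`, `tokenize`, `summarize`, and `STOP_WORDS` are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real systems use larger lists).
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "it", "on", "for", "was"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase word tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOP_WORDS]


def summarize(text, k=2):
    """Return the k highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    # Document-level word frequencies over all sentences.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average document frequency of the sentence's words; 0 if no words remain.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    # Rank sentence indices by score (ties keep document order), then restore order.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]
```

In a Machine Learning setting such as the one the abstract describes, scores like this one would typically serve as features for a classifier that decides whether a sentence belongs in the summary, rather than being used directly for ranking.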

    Published In

    DocEng '15: Proceedings of the 2015 ACM Symposium on Document Engineering
    September 2015
    248 pages
    ISBN:9781450333078
    DOI:10.1145/2682571
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 08 September 2015

    Author Tags

    1. extractive features
    2. sentence scoring methods
    3. text summarization

    Qualifiers

    • Short-paper

    Conference

    DocEng '15
    DocEng '15: ACM Symposium on Document Engineering 2015
    September 8 - 11, 2015
    Lausanne, Switzerland

    Acceptance Rates

    DocEng '15 Paper Acceptance Rate 11 of 31 submissions, 35%;
    Overall Acceptance Rate 194 of 564 submissions, 34%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 17
    • Downloads (Last 6 weeks): 0
    Reflects downloads up to 19 Feb 2025

    Citations

    Cited By

    • (2024) CERVICAL PROPRIOCEPTION AND VESTIBULAR FUNCTIONS IN PATIENTS WITH NECK PAIN AND CERVICOGENIC HEADACHE: A COMPARATIVE STUDY. Journal of Turkish Spinal Surgery (113-118). DOI: 10.4274/jtss.galenos.2024.75047. Online publication date: 8-Aug-2024
    • (2024) Effective Tool Augmented Multi-Agent Framework for Data Analysis. Data Intelligence. DOI: 10.3724/2096-7004.di.2024.0013. Online publication date: 17-Oct-2024
    • (2024) Empowering legal justice with AI: A reinforcement learning SAC-VAE framework for advanced legal text summarization. PLOS ONE 19:10 (e0312623). DOI: 10.1371/journal.pone.0312623. Online publication date: 25-Oct-2024
    • (2024) Features in extractive supervised single-document summarization: case of Persian news. Language Resources and Evaluation 58:4 (1073-1091). DOI: 10.1007/s10579-024-09739-7. Online publication date: 8-May-2024
    • (2023) Resume Screening using Machine Learning. 2023 International Conference on Data Science, Agents & Artificial Intelligence (ICDSAAI) (1-5). DOI: 10.1109/ICDSAAI59313.2023.10452483. Online publication date: 21-Dec-2023
    • (2022) Semantic relation evaluation of data science articles using network of mention. 2022 IEEE Nigeria 4th International Conference on Disruptive Technologies for Sustainable Development (NIGERCON) (1-9). DOI: 10.1109/NIGERCON54645.2022.9803141. Online publication date: 17-Apr-2022
    • (2021) Summarization of legal documents. Computer Science Review 40:C. DOI: 10.1016/j.cosrev.2021.100388. Online publication date: 1-May-2021
    • (2020) An Automatic Text Summarization Method with the Concern of Covering Complete Formation. Recent Advances in Computer Science and Communications 13:5 (977-986). DOI: 10.2174/2213275912666190716105347. Online publication date: 5-Nov-2020
    • (2020) NLP based Machine Learning Approaches for Text Summarization. 2020 Fourth International Conference on Computing Methodologies and Communication (ICCMC) (535-538). DOI: 10.1109/ICCMC48092.2020.ICCMC-00099. Online publication date: Mar-2020
    • (2019) Query-oriented text summarization based on hypergraph transversals. Information Processing and Management: an International Journal 56:4 (1317-1338). DOI: 10.1016/j.ipm.2019.03.003. Online publication date: 1-Jul-2019
